Checklist for Building a Data / Feature / Event Catalog

Latest published version available at: http://sdac.virtualsolar.org/catalogs/catalog_checklist


Glossary (as used in this document):

  • Catalog : a list of metadata for items of interest
  • Record  : metadata about a single item of interest
  • Field  : an attribute of the items being cataloged
  • Value  : a specific string / number / etc. of an attribute in a given record
  • File  : the transfer format used for the catalog [may be generated dynamically on demand]

Have I ...

... described the catalog?

___ chosen a name that clearly and unambiguously describes my catalog?
___ described how my catalog is different from similar catalogs, and given references to those other catalogs?
___ declared when the catalog was last updated?
[or checked to determine no update was necessary]
___ provided an authoritative URL where people can check for updates to this catalog?
___ mentioned the frequency of possible updates?
[or that there are no planned updates]
___ provided contact information in case someone has questions about the catalog?
[or wants to ask you to co-author their paper, etc.]
___ explained any changes in methodology / maintainers over time?


... described the data used as the basis for the catalog?

___ described specifically which detector(s) in which operating mode(s) were used?
[eg, if using LASCO/C2 H-alpha images, don't just say 'LASCO']
___ explained the processing that was applied to the data for the analysis?
[did you calibrate each image, or use difference images of the low level products?; did you use 5min averages or the full resolution time series?; 2x2 binned, or the full res. images?]
___ mentioned where I obtained the data from?
[in case of processing differences between archives]
___ mentioned any gaps in that data that might affect my ability to detect an event / feature?
___ specified the temporal extent of the data used?
[note -- especially important for sparse events that may only occur a few times per year; eg, even though the first event is on June 12, I analyzed data back to January 1]
___ mentioned if the data was subset before analysis?
[eg, only looked at the first image each hour]
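
One way to back up the statements about gaps and temporal extent is to compute them from the observation times themselves. The following is only a minimal sketch: the function name `find_gaps`, the 1.5x tolerance, and the sample timestamps are all assumptions made for illustration, not part of any particular archive's tooling.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, expected_cadence, tolerance=1.5):
    """Return (last_before_gap, first_after_gap) pairs where consecutive
    observations are further apart than tolerance * expected_cadence."""
    ordered = sorted(timestamps)
    limit = expected_cadence * tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]

# Invented observation times at a nominal one-hour cadence.
observations = [
    datetime(2008, 1, 1, 0),
    datetime(2008, 1, 1, 1),
    datetime(2008, 1, 1, 2),
    datetime(2008, 1, 1, 9),   # seven hours after the previous image
    datetime(2008, 1, 1, 10),
]

gaps = find_gaps(observations, timedelta(hours=1))
# Temporal extent of the data actually analyzed (not just of the events found).
extent = (min(observations), max(observations))
```

The gaps and the extent can then be listed in the catalog documentation, so a reader can tell "no event recorded" apart from "no data available".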


... described the records in my catalog?Edit

___ described what the records represent or describe?
___ described what my qualifications were for including a record (event/feature) in my catalog?
___ specified a primary key that uniquely describes each record?
[often, it's the start time, unless there is a risk of two events starting at the same time ... but time _is_not_ a good value, as further analysis may change the start time and with it the record's identifier, creating confusion over whether it is still the same event]
[primary keys may be composite (multiple fields), eg, (day & active region #) in NOAA SRS section I]
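
Whether a chosen key really is unique can be checked mechanically before publishing. This is a rough sketch, assuming the records are already loaded as dictionaries; the field names `date` and `region` and all of the values are invented for illustration.

```python
from collections import Counter

def duplicate_keys(records, key_fields):
    """Return the key tuples that occur in more than one record."""
    counts = Counter(tuple(rec[f] for f in key_fields) for rec in records)
    return sorted(key for key, n in counts.items() if n > 1)

# Invented records; "region" stands in for an active region number.
records = [
    {"date": "2008-01-01", "region": 101, "area": 50},
    {"date": "2008-01-01", "region": 102, "area": 20},
    {"date": "2008-01-02", "region": 101, "area": 40},
]

dupes_single = duplicate_keys(records, ["date"])               # date alone
dupes_composite = duplicate_keys(records, ["date", "region"])  # composite key
```

Here the date alone repeats, while the composite (date, region) key identifies each record uniquely.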


... described the fields within the records?

___ described each of the individual fields in the record:
[may be done on a per-column basis depending on catalog format]
___ labeled the field?
___ explained the field in language that would unambiguously explain how it was measured / calculated to someone from my specific (sub)discipline?
___ explained what the general concept of the field is in language that would be understood by the greater physics community?
___ provided a machine-readable description of the field?
[note : requires us having a controlled vocabulary first ... may not be possible right now; see UCD+, SPASE, PACS, SWEET, SESDI]
___ labeled the units for the field, or defined which field contains the units for this field?
[or unitless, if a ratio]
___ explained the precision of the values in the field?
___ explained any markings / other fields describing abnormal precision for values in this field?
___ explained any information conveyed with formatting?
[eg, special colors used in MS Excel or HTML tables, italicized fonts]
[note: only using color may violate Section 508]
___ explained the possible extents of the field if applicable?
[ie, min/max for numeric, max length for strings, possible enumerations]
___ explained the reference scale or coordinate system?
[eg, if time: UTC? spacecraft time? adjusted to earth/sun time?]
[if pressure: absolute pressure, or gauge pressure?]
[if file paths, given the URL prefix]
___ explained the significance of the field being empty?
[or a value used to signify the field being unknown]
___ explained the data type used to store values in the field:
Special cases:
  • Boolean : How true, false and null values are recorded:
    [eg, 'T' is true, all others are false]
    [1 is true, -1 is false, 0 is unknown]
  • Enumerations and Flags:
    What the possible values are and what they signify.
    If there is a natural sorting order to the values.
  • Foreign Keys:
    What table / catalog is this a foreign key to?
  • Dates:
    How is the date formatted?
  • URLs (or URL parts):
    Explained what the URL links to.
___ ensured that values within a given field are consistent?
[if free text, consistent wording for notes; if numeric, measured / derived the same way?]
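
Several of the items above (units, extents, the marker for unknown values, boolean encoding) can be written down once in a small structure and then used to read values consistently. This is a sketch under stated assumptions, not an existing standard: the class `FieldDescription`, the `-9999` marker, the `speed` field, and the choice to treat an empty string as unknown are all hypothetical. The 1 / -1 / 0 encoding follows the example in the checklist.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FieldDescription:
    label: str
    description: str
    units: Optional[str] = None           # None means unitless (eg, a ratio)
    missing_marker: Optional[str] = None  # string that signifies "unknown"
    minimum: Optional[float] = None       # documented extents, if any
    maximum: Optional[float] = None

def parse_number(raw: str, spec: FieldDescription) -> Optional[float]:
    """Return None for an empty or 'unknown' value, else a range-checked float."""
    text = raw.strip()
    if text == "" or text == spec.missing_marker:
        return None
    value = float(text)
    if spec.minimum is not None and value < spec.minimum:
        raise ValueError(f"{spec.label}: {value} is below the documented minimum")
    if spec.maximum is not None and value > spec.maximum:
        raise ValueError(f"{spec.label}: {value} is above the documented maximum")
    return value

# Tri-state boolean: 1 is true, -1 is false, 0 is unknown.
TRI_STATE = {"1": True, "-1": False, "0": None}

def parse_flag(raw: str) -> Optional[bool]:
    """Decode a tri-state flag; raises KeyError for undocumented values."""
    return TRI_STATE[raw.strip()]

# Hypothetical field used only for this example.
speed = FieldDescription(
    label="speed",
    description="plane-of-sky speed of the feature",
    units="km/s",
    missing_marker="-9999",
    minimum=0.0,
)
```

For example, `parse_number("-9999", speed)` returns `None` rather than a large negative speed, which is the kind of misreading that documenting the marker is meant to prevent.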

... planned for use of the catalog?

___ provided text to use for attribution?
[eg, 'this catalog was funded by (x)', etc.]
___ used a well documented, easily accessed format to store the catalog?
___ stored the documentation for the catalog in a well documented, easily accessed and used format?
___ chosen a format that can be freely used?
[ie, doesn't require specific proprietary software (eg, IDL) to be purchased]
[note: you can distribute the catalog in more than one form]
___ chosen a format that is easily used and available?
[eg, FITS and CDF are not used by all science disciplines ... XML (VOTable) or CSV may be better; PDF is difficult to extract back to tables]
___ documented how to extract the individual fields from the file?
[eg, if fixed-width ASCII, given the columns for each field]
___ included header fields, if appropriate?
___ provided linkages to where to find the documentation from within the file?
___ used document conventions that translate easily between formats?
[eg, don't merge cells in MS Excel & HTML tables]
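
For a fixed-width ASCII file, the documented columns for each field translate directly into a reader. The sketch below assumes a made-up layout: the field names and the column positions in `COLUMNS` are invented, and a real catalog would publish its own table of positions.

```python
# Hypothetical layout: field name -> (start, end) as zero-based, end-exclusive columns.
COLUMNS = {
    "start_time": (0, 19),
    "region": (20, 25),
    "speed": (26, 32),
}

def parse_line(line, columns=COLUMNS):
    """Slice one fixed-width line into a dict of stripped string values."""
    return {name: line[start:end].strip() for name, (start, end) in columns.items()}

# Build a sample line that matches the layout above (invented values).
line = f"{'2008-01-01T00:00:00':<19} {101:>5} {450.5:>6.1f}"
record = parse_line(line)
```

If the same table of positions is included in the file header or the linked documentation, readers do not have to guess where one field ends and the next begins.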
