Best Practice Recommendations for Data Management

Best practices for time series of high frequency sensor data and harmonized, static datasets are described below. Additional documentation is in development.

Time series of high frequency streaming sensor data:

  1. Get raw (level 0) data from the platform
  2. Store the level 0 data locally as an ASCII file and back it up in a different location where it will not be changed
  3. Archive level 0 data periodically (annually or monthly), optionally on DataONE or another repository, as a backup and long-term archive mechanism
  4. Share level 0 through CUAHSI (creating standardized data ready to harmonize with other data)
  5. Process level 0 data: run automated QA/QC to remove egregious errors (e.g., using B3 or other software) and produce level 1 data
  6. Share level 1 data on CUAHSI and / or archive in DataONE
  7. Process level 1 data: produce level 2 data through quality control with human review to confirm the data
  8. Share level 2 data on CUAHSI and / or archive in DataONE
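
As an illustration of step 5, the sketch below shows one simple form of automated QA/QC: a range check that blanks out implausible values while leaving the level 0 records untouched. It is only a minimal example under stated assumptions, not the method used by B3 or any other package. The function name auto_qc, the record layout (timestamp, value), the sample readings, and the thresholds are all hypothetical.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical record layout: (ISO 8601 timestamp, sensor value or None)
Record = Tuple[str, Optional[float]]


def auto_qc(records: List[Record], lower: float, upper: float) -> List[Record]:
    """Build level 1 records from level 0 records with a simple range check.

    Values that are missing, NaN, or outside [lower, upper] are replaced
    with None. The input list is not modified, so level 0 stays intact.
    """
    level1: List[Record] = []
    for timestamp, value in records:
        ok = (
            value is not None
            and not math.isnan(value)
            and lower <= value <= upper
        )
        level1.append((timestamp, value if ok else None))
    return level1


# Made-up water temperature readings in degrees C (illustrative only)
level0 = [
    ("2015-06-01T00:00:00", 14.2),
    ("2015-06-01T00:15:00", -9999.0),  # assumed logger error code
    ("2015-06-01T00:30:00", 14.5),
    ("2015-06-01T00:45:00", 85.0),     # physically implausible spike
]

# Assumed plausible range for this hypothetical sensor
level1 = auto_qc(level0, lower=-5.0, upper=40.0)
```

Keeping the level 0 list unchanged and writing the checked values to a new list mirrors the separation between levels described above. A real workflow would likely add further checks (for example rate-of-change or stuck-sensor tests) before human review.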

Harmonized, static datasets published or to be published:

  1. Publish or access published raw data using DataONE
  2. Carry out analysis
  3. Publish final dataset on DataONE with scripts or detailed methods description

Data level definitions:
    level 0 - raw data
    level 1 - automated QA/QC, large obvious errors removed
    level 2 - human intervention QA/QC
    level 3 - modeled or otherwise gap filled (not necessary to publicly archive since everyone models differently)
    level 4 - aggregated, summarized data
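
The level definitions above could be encoded in a script so that each file or record carries its processing level. The sketch below is one possible encoding, together with a small example of producing level 4 (aggregated) data as daily means. The names DataLevel and daily_means, the record layout, and the sample values are hypothetical and are not part of any CUAHSI or DataONE API.

```python
from collections import defaultdict
from enum import IntEnum
from statistics import mean
from typing import Dict, List, Optional, Tuple


class DataLevel(IntEnum):
    """Processing levels as defined in the list above (names are assumed)."""
    RAW = 0         # raw data
    AUTO_QC = 1     # automated QA/QC, large obvious errors removed
    HUMAN_QC = 2    # human intervention QA/QC
    GAP_FILLED = 3  # modeled or otherwise gap filled
    AGGREGATED = 4  # aggregated, summarized data


def daily_means(records: List[Tuple[str, Optional[float]]]) -> Dict[str, float]:
    """Aggregate (ISO 8601 timestamp, value) records into daily means.

    Records whose value is None (removed during QA/QC) are skipped.
    The day is taken from the first 10 characters of the timestamp.
    """
    by_day: Dict[str, List[float]] = defaultdict(list)
    for timestamp, value in records:
        if value is not None:
            by_day[timestamp[:10]].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Made-up level 2 records (illustrative only)
level2 = [
    ("2015-06-01T00:00:00", 14.0),
    ("2015-06-01T12:00:00", 16.0),
    ("2015-06-02T00:00:00", None),  # value removed during review
    ("2015-06-02T12:00:00", 15.0),
]

level4 = daily_means(level2)
```

Using an integer-valued enumeration keeps the levels ordered, so a script can check, for example, that a dataset is at least DataLevel.HUMAN_QC before it is aggregated.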
