Best Practice Recommendations for Data Management

Best practices for time series of high frequency sensor data and harmonized, static datasets

Additional documentation and help are available from the Environmental Data Initiative (EDI).


Time series of high frequency streaming sensor data:

  1. Get raw (level 0) data from the platform
  2. Store the level 0 data locally as an ASCII file and back it up in a different location where it will not get changed
  3. Archive level 0 data with EDI at regular intervals (e.g., annually or monthly)
  4. Share level 0 data through CUAHSI (creating standardized data ready to harmonize with other data)
  5. Process level 0 data: run automated QA/QC to remove egregious errors (e.g., using B3 or other software) and produce level 1 data
  6. Share level 1 data on CUAHSI and/or archive in EDI
  7. Process level 1 data: produce level 2 data through quality control in which a person reviews and confirms the data
  8. Share level 2 data on CUAHSI and/or archive in EDI
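
The automated QA/QC in step 5 can be sketched as follows. This is a minimal illustration only, not the method used by B3 or any other package: the Reading class, the level0_to_level1 function, the flag names, and the thresholds are all hypothetical, and real limits depend on the sensor and site. It applies a range check and a simple spike check, and returns new level 1 records so that the level 0 data stay unchanged.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reading:
    timestamp: str          # ISO 8601 string as recorded by the logger
    value: Optional[float]  # None marks a missing or removed value
    flag: str = ""          # empty string means the value passed all checks


def level0_to_level1(readings: List[Reading], lower: float, upper: float,
                     max_step: float) -> List[Reading]:
    """Return level 1 copies of level 0 readings (hypothetical helper).

    Values outside [lower, upper] are removed and flagged "range".
    Values that differ from the last accepted value by more than
    max_step are removed and flagged "spike".
    """
    level1 = []
    previous_good = None
    for r in readings:
        value, flag = r.value, ""
        if value is None:
            flag = "missing"
        elif not (lower <= value <= upper):
            value, flag = None, "range"
        elif previous_good is not None and abs(value - previous_good) > max_step:
            value, flag = None, "spike"
        else:
            previous_good = value
        # Build a new record; the level 0 reading is never modified.
        level1.append(Reading(r.timestamp, value, flag))
    return level1
```

Because the spike check compares against the last accepted value, a genuine abrupt shift would be flagged repeatedly; that kind of case is what the human review in step 7 is meant to catch.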

Harmonized, static datasets published or to be published:

  1. Publish or access published raw data using EDI
  2. Carry out analysis
  3. Publish the final dataset on EDI with scripts or a detailed methods description

Data level definitions:
    level 0 - raw data
    level 1 - automated QA/QC, large obvious errors removed
    level 2 - human intervention QA/QC
    level 3 - modeled or otherwise gap filled (not necessary to publicly archive since everyone models differently)
    level 4 - aggregated, summarized data
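
As an illustration of the step from quality-controlled data to level 4, the sketch below summarizes readings into daily means. The daily_means function and the (timestamp, value) record layout are assumptions made for this example; the appropriate aggregation interval and statistic depend on the dataset.

```python
from collections import defaultdict
from statistics import mean


def daily_means(records):
    """Aggregate (ISO 8601 timestamp, value) pairs into daily means.

    Hypothetical helper: values of None (removed during QA/QC) are
    skipped, and days with no remaining values are omitted.
    """
    by_day = defaultdict(list)
    for timestamp, value in records:
        if value is not None:
            # The first 10 characters of an ISO 8601 timestamp are YYYY-MM-DD.
            by_day[timestamp[:10]].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}
```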
