USGS - science for a changing world

USGS Coastal and Marine Geology Program

Documentation of the U.S. Geological Survey Oceanographic Time-Series Measurement Database


Network Common Data Format (NetCDF) Storage

The data-storage format employed for this database is Unidata's NetCDF. NetCDF is a general, self-documenting, machine-transportable data format created and supported by the University Corporation for Atmospheric Research (UCAR) (http://www.unidata.ucar.edu/packages/netcdf/). NetCDF was chosen because it is widely used in the climate modeling community, is independent of hardware platform and operating system, and has a variety of helper applications already developed for data access and visualization. NetCDF files are typically made up of variables, which contain measurements or computed values, and attributes, which describe the contents of the file or of individual variables.

We take attribute and variable names from EPIC (Equatorial Pacific Information Collection) (http://www.pmel.noaa.gov/epic/), one of the few oceanographic data specifications available in the 1980s. EPIC was developed by the NOAA Pacific Marine Environmental Laboratory (PMEL) to analyze, manage, and display in situ oceanographic data. Because the files are EPIC-compliant netCDF, researchers from different organizations can use this database without having to translate "foreign" data types into the local vernacular. Using a known vocabulary also makes these data easier for other computers to discover and to incorporate into larger data-aggregation sites. A list of the EPIC keys that may occur in our data is provided in appendix 3, but any single file will contain only a subset of these variables.

One of the advantages of employing netCDF format is that the metadata are stored with the data. A typical netCDF file in this database (with a .cdf or .nc suffix) has global attributes that describe what, where, when, and how the data were collected. Global attributes apply to all the variables in the file, while each variable has attributes that apply to the contents of that specific variable. The mooring number, data start date and time, data end date and time, position, instrument type, and sample rate are all metadata stored in the global attributes. Attributes and their possible values and usage are discussed in the following sections.

Coordinate variables are those used to describe the dimensions of the measurement variables. The files in the USGS database of oceanographic time-series measurements are typically four dimensional, where time, depth, latitude, and longitude are the dimensions (and coordinate variable names). The time dimension corresponds to the number of samples in the file. Sample measurement time (see Data Processing) is computed from coordinate variables named "time" and "time2". Depth may have one or more values (for example, if a vertical current profile was measured at 14 heights above the bed by an ADCP, the depth dimension would have 14 values). Latitude and longitude have single values because our observations are from static platforms, but they were defined as dimensions to preserve the option of employing drifting instruments that have time-varying position. Coordinate variables (time, time2, depth, lat, lon, freq, dir) may never have a FillValue_ attribute, because they cannot have gaps.
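The time calculation mentioned above can be sketched as follows. This is a minimal illustration that assumes the usual EPIC convention, which is not spelled out in this section: "time" holds the whole true Julian day, with day 2440000 beginning at 00:00 on 23 May 1968, and "time2" holds milliseconds since midnight. The function name epic_to_datetime is hypothetical; the Data Processing section remains the authority on how sample times are computed.

```python
from datetime import datetime, timedelta, timezone

# Assumption (EPIC convention, not stated in this section):
# true Julian day 2440000 begins at 00:00 UTC on 23 May 1968.
EPIC_REF_DAY = 2440000
EPIC_REF_DATE = datetime(1968, 5, 23, tzinfo=timezone.utc)


def epic_to_datetime(time, time2):
    """Combine EPIC 'time' (Julian day) and 'time2' (ms since midnight)."""
    return EPIC_REF_DATE + timedelta(days=time - EPIC_REF_DAY,
                                     milliseconds=time2)


# One day after the reference day, one hour past midnight.
sample = epic_to_datetime(2440001, 3_600_000)
print(sample.isoformat())  # 1968-05-24T01:00:00+00:00
```

Keeping the day count and the milliseconds as separate integers avoids the loss of precision that a single floating-point day number would introduce at sub-second resolution.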

The netCDF file also contains data variables (the actual measurements) named using EPIC conventions. For example, a variable that contains seawater temperature measurements is called T_28, and one that contains east current velocity is called u_1205. FillValue_ is used in data variables to indicate where data were unrecoverable or missing; we set it to 1e35. Attributes associated with each variable describe the units (for example, degrees Celsius, centimeters per second), sensor height on the tripod, data maxima and minima, and the sensor model and serial number that go with the data.
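Before computing statistics on a data variable, the fill values need to be excluded. The sketch below is one way to do that on values already read into a Python list; the name mask_fill and the tolerance are illustrative assumptions, not part of the database software. A relative tolerance is used instead of an exact comparison because a fill value stored in single precision reads back as roughly 1.00000004e35.

```python
import math

FILL_VALUE = 1e35  # value of FillValue_ used in this database


def mask_fill(values, fill=FILL_VALUE, rel_tol=1e-6):
    """Return a copy of values with fill entries replaced by NaN."""
    return [math.nan if math.isclose(v, fill, rel_tol=rel_tol) else v
            for v in values]


# Hypothetical temperature series with two missing samples.
temps = [4.65, 1e35, 12.03, 1.00000004091848e35]
cleaned = mask_fill(temps)
valid = [t for t in cleaned if not math.isnan(t)]
print(min(valid), max(valid))  # 4.65 12.03
```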

Global Attributes

This section describes usage of the generic global attribute fields in USGS/CMGP netCDF files. The metadata combine attributes defined in the EPIC conventions with additional descriptors that CMGP investigators find useful. EPIC attributes are CAPITALIZED; those added by CMGP are lowercase or Of_Mixed_Case.

Table 3 shows the possible values of EPIC global attributes named INST_TYPE, DATA_TYPE, and DATA_SUB_TYPE that describe the sensor. These terms may be used by other software to determine how the data are treated, so consistency in terms is needed. Column 1 is the generic instrument name we use; columns 2 to 4 are the terms required by EPIC for the attribute names in the first row of each column. Other options may exist for some attributes; for instance, DATA_TYPE may be PROFILE for a CTD lowered from a ship, but because our CTD measurements are made at a single depth, by EPIC's rules, the DATA_TYPE must be TIME.

Table 3: Equatorial Pacific Information Collection (EPIC) attributes that depend on instrument type.

Generic name  INST_TYPE             DATA_TYPE  DATA_SUB_TYPE
ADCP          RD Inst. ADCP         ADCP       MOORED
ADCP          Nortek Aquadopp       ADCP       MOORED
ADCP          RD Inst. ADCP         ADCP       MOORED
waves         RD Inst. ADCP         WAVESPEC   N/A
CT            SeaBird SeaCAT        TIME*      N/A
CT            SeaBird MicroCAT      TIME*      N/A
CT            BR-6999               TIME*      N/A
ADV           Sontek ADV            TIME*      N/A
PCADP         Sontek PCADP          ADCP       MOORED
ABSS          Aquatec Aquascat ABS  ABS        N/A

* for DATA_TYPE = TIME, no DATA_SUB_TYPE is required
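Because other software may branch on these terms, it can be useful to check a file's global attributes against table 3 before publishing it. The following is a minimal sketch of such a check. The dictionary simply transcribes table 3, with None standing for N/A; the function name check_types and the assumption that attribute strings in the files match the table text exactly are illustrative only.

```python
# (generic name, INST_TYPE) -> (DATA_TYPE, DATA_SUB_TYPE), from table 3.
# None stands for "N/A": no DATA_SUB_TYPE is checked in that case.
EPIC_TYPES = {
    ("ADCP", "RD Inst. ADCP"): ("ADCP", "MOORED"),
    ("ADCP", "Nortek Aquadopp"): ("ADCP", "MOORED"),
    ("waves", "RD Inst. ADCP"): ("WAVESPEC", None),
    ("CT", "SeaBird SeaCAT"): ("TIME", None),
    ("CT", "SeaBird MicroCAT"): ("TIME", None),
    ("CT", "BR-6999"): ("TIME", None),
    ("ADV", "Sontek ADV"): ("TIME", None),
    ("PCADP", "Sontek PCADP"): ("ADCP", "MOORED"),
    ("ABSS", "Aquatec Aquascat ABS"): ("ABS", None),
}


def check_types(generic, attrs):
    """List mismatches between global attributes and table 3."""
    inst_type = attrs.get("INST_TYPE")
    expected = EPIC_TYPES.get((generic, inst_type))
    if expected is None:
        return [f"unknown instrument: {generic} / {inst_type}"]
    data_type, sub_type = expected
    problems = []
    if attrs.get("DATA_TYPE") != data_type:
        problems.append(f"DATA_TYPE should be {data_type}")
    if sub_type is not None and attrs.get("DATA_SUB_TYPE") != sub_type:
        problems.append(f"DATA_SUB_TYPE should be {sub_type}")
    return problems


# A moored CTD must be TIME, not PROFILE.
print(check_types("CT", {"INST_TYPE": "SeaBird SeaCAT",
                         "DATA_TYPE": "PROFILE"}))
# ['DATA_TYPE should be TIME']
```

The key includes the generic name because the same INST_TYPE (RD Inst. ADCP) maps to different DATA_TYPE values for current profiles and for waves.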

The CMGP also includes many instrument-specific identification and configuration details that may help users reconstruct how the data were collected and processed. For instance, ADCP data files typically include the following added attributes (among others):
  • transform : earth
  • orientation : up
  • frequency : 300
  • pings_per_ensemble : 60

EPIC conventions also specify the attributes shown in table 4 that are present in all data files. These specify who did the work, why, how often, and other details of what is expected in the data.

Table 4: Equatorial Pacific Information Collection (EPIC) attribute names found in all data types.

Attribute      Description                             Example value
PROJECT        Long name of Research Proj (funding)    'USGS Coastal Marine Geology Program'
EXPERIMENT     Identifier chosen for experiment        'BOSTON'
DESCRIPTION    specific site identifier                'B BUOY'
MOORING        numeric id of the mooring/instrument    7671 (use 4 digits)
DELTA_T        sample interval                         600 (always seconds)
WATER_DEPTH    best version of water depth at site     60 (always meters)
VAR_FILL       indicator of bad or missing data        1.0e35
VAR_DESC       short list of variables in the file     'u:v:w:Werr:AGC:PGd:Tx:P'
DATA_CMNT      provides additional information         'NO Pressure logged'
COMPOSITE      number of pieces in a composite series  0 if not composite
FILL_FLAG      were fill values inserted?              0 if no, 1 if yes
DRIFTER        is the platform drifting?               0 if no, 1 if yes
POS_CONST      is the position consistent?             0 if it doesn't move, 1 if not consistent
DEPTH_CONST    does the depth change?                  0 if consistent, 1 if not consistent
DATA_ORIGIN    organization collecting the data        'USGS WHSC Sed. Trans. Group'
COORD_SYSTEM   how are coordinates mapped?             'GEOGRAPHICAL'
CREATION_DATE  USGS WHSC usage is that this is the last MODIFIED date, not the initial creation date  '31-Jan-2005 13:24:00'
WATER_MASS     description of water sampled            normally unused

The attributes listed in table 5 are not required by EPIC but have been included in the more recently processed files to document the deployment details and processing steps more accurately. The Conventions attribute tells other programs what vocabulary was used in attribute and variable naming. It is similar to indicating "this page is written in Danish"; it helps software interpret the information correctly.

Table 5: Additional attributes typically employed.

Attribute           Description
Deployment_date     date deployed
Recovery_date       date recovered
latitude            deployment latitude
longitude           deployment longitude
magnetic_variation  from NOAA web site for position and time
start_time          time of first record in file
stop_time           time of last record in file
SciPi               scientist responsible for the data
history*            all processing steps appended; most recent thing done is first in list
Conventions         PMEL/EPIC
serial_number       Instrument or sensor serial number
inst_height         Instrument or sensor HAB (m)
inst_depth          Instrument or sensor depth (m)
inst_height_note    note about accuracy
inst_depth_note     note about accuracy

* The history attribute is the best place to look for experiment- or sensor-specific actions that may have occurred during processing. If data were truncated, it will be indicated here. Other actions, including which programs were run and the processing sequence, are also listed in the history attribute.
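The "most recent first" ordering of the history attribute can be sketched as below. This is an illustration only: the helper name prepend_history, the "; " separator, and the example step strings are assumptions, and the actual processing programs may format their entries differently.

```python
def prepend_history(attrs, step, sep="; "):
    """Return a copy of attrs with step placed first in 'history'."""
    updated = dict(attrs)
    previous = attrs.get("history", "")
    updated["history"] = step + sep + previous if previous else step
    return updated


# Hypothetical processing steps, applied oldest to newest.
attrs = {}
attrs = prepend_history(attrs, "converted to netCDF")
attrs = prepend_history(attrs, "data truncated to in-water period")
print(attrs["history"])
# data truncated to in-water period; converted to netCDF
```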

Variable Attributes

Each variable in the file has its own attributes to describe and quantify the contents. The descriptors in the left column of table 6 are found in most variables; the column on the right contains sample values. If the parameter has more than one dimension, the minimum and maximum may be vector quantities instead of scalars. The minimum and maximum of the data have the same units as the data; in the example in table 6, because the transducer temperature is in 'degrees C', the maximum and minimum are as well. The sensor depth (water depth minus sensor height) is always in meters. FillValue_ is the number that represents erroneous or missing data in a time series. Software may sometimes display FillValue_ as 1.00000004091848e+035; this is the nearest single-precision representation of the intended value, 1.0e+35. The valid_range attribute specifies the potential range of acceptable data.

Table 6: Attributes associated with each variable.

Attribute      Example value
name           'Tx'
long_name      'ADCP Transducer Temp.'
generic_name   'temp'
units          'degrees C'
epic_code      1211
sensor_type    'RD Instruments ADCP'
sensor_depth   30.558629989624
serial_number  138
minimum        4.65000009536743
maximum        12.0299997329712
valid_range    [-5 40]
FillValue_     1.00000004091848e+035
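The long decimal expansions in table 6 are what single-precision (32-bit) values look like when printed at double precision. The short sketch below reproduces the effect with the Python standard library by packing a number into four bytes and reading it back; the helper name as_float32 is illustrative.

```python
import struct


def as_float32(x):
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


stored = as_float32(1.0e35)
print(f"{stored:.14e}")   # 1.00000004091848e+35
print(stored == 1.0e35)   # False

# The same effect explains values such as the minimum in table 6.
print(f"{as_float32(4.65):.15g}")
```

For this reason, a test for fill values is more robust when it compares within a small relative tolerance than when it tests for exact equality with 1.0e35.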

Equatorial Pacific Information Collection (EPIC) Keywords

The tables in Appendix 4 list the EPIC code numbers and associated variable names that are found in this database. If the column for numeric code is blank, the name did not exist in EPIC and was added to describe a type of measurement or computed property. If an * is present in the name and more than one sensor of that type is present, the * is replaced by a number; that is, Sed*_981 becomes Sed1_981 and Sed2_981.
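The numbering rule for names that contain an * can be sketched as follows. The function name expand_names is hypothetical. Because the text above describes only the case of more than one sensor, this sketch rejects smaller counts instead of guessing how a single sensor is named.

```python
def expand_names(template, n_sensors):
    """Replace '*' in an EPIC-style name with 1..n_sensors."""
    if "*" not in template:
        raise ValueError("template has no '*' placeholder")
    if n_sensors < 2:
        # The single-sensor case is not described in this section.
        raise ValueError("numbering is described only for 2+ sensors")
    return [template.replace("*", str(i))
            for i in range(1, n_sensors + 1)]


print(expand_names("Sed*_981", 2))  # ['Sed1_981', 'Sed2_981']
```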


U.S. Department of the Interior | U.S. Geological Survey | Coastal and Marine Geology

URL: cmgds.marine.usgs.gov/publications/of2007-1194v1/html/netcdf.html
Page Contact Information: CMGDS Team
Page Last Modified: Wednesday, 06-Dec-2017 13:17:06 EST