File naming convention
ROS_<PROJECT>_<SHELTER>_<VARIABLE COLLECTION CODE>_<DATA PROCESSING>_<DATE or DATERANGE>_[VERSION].extension
The individual elements that make up the naming convention are described in detail below. First, however, consider the following points:
- All terms in the filename should be upper case, apart from the file extension, which should be lower case.
- An underscore is used to separate the different ‘elements’ of the filename. An underscore should therefore not be used elsewhere within the filename
- Do not use illegal characters within a filename, e.g. / ? < > : * | " or a space.
- The file naming convention balances three aims: including useful detail in the name, ensuring file uniqueness, and keeping the filename as short as possible.
- ROS: Official HIE facility code to represent data sourced from the ROS facility (all data from the Rainout shelter facility should be prefixed by use of the code ‘ROS’).
- PROJECT: The official ROS project this data file is associated with. Use the project code in the filename (a capital P followed by a 4-digit project number, e.g. P0003, P0012, etc). If the data came from an automated sensor, use ‘AUTO’ as the project code.
- SHELTER: The shelter(s) the data was sourced from. This can be a single shelter, all shelters, or a subset of shelters. It can also indicate data collected at a sampling location outside the shelters. Rules on how to label each case can be found in the ‘Shelter’ section below.
- VARIABLE COLLECTION CODE: A collection code that represents a particular grouping of variables contained within a file. A list of variables contained within different variable collection codes can be found in the ‘Variable Collection Codes’ section below.
- DATA PROCESSING: The level of post-processing applied to the data. The definitions of the different levels can be found in the ‘Data Processing Codes’ section below.
- DATE or DATERANGE: The date range that a dataset covers (for automated timeseries data, for example) or the single date on which a sample or measurement was taken. Dates are in YYYYMMDD format, with a hyphen separating the start and end dates of a range.
- VERSION: An optional version number of the form _Vx. Use it when a file that already exists within HIEv has been updated or corrected while staying at the same data processing level.
- extension: The format of the data file, e.g. .csv, .dat (for TOA5 data), etc.
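As a rough illustration of how the elements above combine, the sketch below assembles a filename from its parts. The function names `format_date_element` and `build_ros_filename` are hypothetical helpers for this page, not part of HIEv, and the default extension is an arbitrary choice.

```python
from datetime import date


def format_date_element(start, end=None):
    """Format a single date, or a start-end range, as YYYYMMDD[-YYYYMMDD]."""
    if end is None:
        return f"{start:%Y%m%d}"
    if end < start:
        raise ValueError("end date precedes start date")
    # A hyphen separates the start and end dates of a range.
    return f"{start:%Y%m%d}-{end:%Y%m%d}"


def build_ros_filename(project, shelter, collection, processing,
                       start, end=None, version=None, extension="csv"):
    """Join the filename elements with underscores (hypothetical helper)."""
    parts = ["ROS", project, shelter, collection, processing,
             format_date_element(start, end)]
    if version is not None:
        # Assumption: the 'x' in _Vx is an integer version number.
        parts.append(f"V{version}")
    # Elements are upper case; the extension is lower case.
    stem = "_".join(part.upper() for part in parts)
    return stem + "." + extension.lower().lstrip(".")


print(build_ros_filename("AUTO", "S03", "SOILVARS", "R",
                         date(2015, 6, 30), extension="dat"))
# prints ROS_AUTO_S03_SOILVARS_R_20150630.dat
```

The sketch does not check that the individual codes are valid; it only handles casing, separators, and date formatting.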
Examples
- ROS_AUTO_S03_SOILVARS_R_20150630.dat
- Raw automated data from Shelter 3 of the ROS facility for the ‘chunking’ period ending 30th June 2015, covering the variables included in the ‘SOILVARS’ collection.
- ROS_P0105_SMULTI_AIRVARS_L1_20150527.dat
- ‘AIRVARS’ data for 27th May 2015, created as part of project P0105 within the ROS facility and sourced from multiple shelters. The data has been subjected to level one cleaning and processing.
- ROS_P0105_SMULTI_AIRVARS_L1_20150527_V2.dat
- Corrected version of the above file (due to an error in the cleaning script).
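For checking filenames such as the examples above, a regular expression is one option. The pattern below is a minimal sketch derived from the rules on this page, not an official HIEv validator. It assumes that collection codes consist only of upper-case letters and digits, that the version is an integer after `V`, and that extensions are lower-case alphanumerics. `parse_ros_filename` is a hypothetical name.

```python
import re

# Hypothetical pattern built from the convention described on this page.
ROS_FILENAME = re.compile(
    r"ROS"
    r"_(?P<project>P\d{4}|AUTO)"                # P + 4 digits, or AUTO
    r"_(?P<shelter>S(?:0[0-9]|1[0-2])|SMULTI)"  # S00 to S12, or SMULTI
    r"_(?P<collection>[A-Z0-9]+)"               # assumed character set
    r"_(?P<processing>R|L[123])"                # R, L1, L2 or L3
    r"_(?P<date>\d{8}(?:-\d{8})?)"              # YYYYMMDD or a range
    r"(?:_(?P<version>V\d+))?"                  # optional _Vx
    r"\.(?P<extension>[a-z0-9]+)"               # lower-case extension
)


def parse_ros_filename(name):
    """Return a dict of filename elements, or None if the name does not match."""
    match = ROS_FILENAME.fullmatch(name)
    return match.groupdict() if match else None


parsed = parse_ros_filename("ROS_P0105_SMULTI_AIRVARS_L1_20150527_V2.dat")
print(parsed["shelter"], parsed["processing"], parsed["version"])
# prints SMULTI L1 V2
```

The pattern only checks the shape of the date element (eight digits); it does not confirm that the digits form a real calendar date.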
Shelter
‘SHELTER’ specifies which of the 12 rainout shelters is the source location for the current data file. Data sourced from…
- a single shelter will use Sxx (where xx is a double digit from 01 to 12), e.g. S01 for shelter 1, S11 for shelter 11, etc.
- multiple or all shelters will use SMULTI (check the HIEv metadata for details of exactly which shelters were used).
- a sampling location outside of the 12 shelters will use S00 (S followed by a double zero). A description of the actual sampling location should be added in the metadata in this scenario.
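The three labelling rules above can be sketched as a small function. `shelter_code` is a hypothetical helper, and treating an empty list as an off-shelter sampling location is an assumption of this sketch rather than something the convention specifies.

```python
def shelter_code(shelters):
    """Map a collection of shelter numbers (1 to 12) to a SHELTER label."""
    unique = sorted(set(shelters))
    if any(number < 1 or number > 12 for number in unique):
        raise ValueError("shelter numbers must be between 1 and 12")
    if not unique:
        # Assumption: no shelter numbers means a location outside the shelters.
        return "S00"
    if len(unique) == 1:
        # Single shelter: zero-padded to two digits.
        return f"S{unique[0]:02d}"
    # Any subset of two or more shelters, including all twelve.
    return "SMULTI"


print(shelter_code([1]), shelter_code([11]), shelter_code([2, 5]), shelter_code([]))
# prints S01 S11 SMULTI S00
```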
Data Processing Codes
The meanings of the different data processing codes used in ROS filenames are as follows:
- R (Raw Data): Data that has been directly extracted from an instrument and that has not been subjected to any data cleaning or post-processing.
- L1 (Level One Data): Data that has been cleaned and processed, but in a cursory manner. Some erroneous data may be included.
- L2 (Level Two Data): Data that has been rigorously cleaned, processed, and checked for quality control.
- L3 (Level Three Data): Archive of published datasets.
Variable Collection Codes
Please edit the table as you see fit! This is a quick draft.
Collection code | Description |
TREEMEAS | Periodic tree height and diameter measurements |
SPOTGAS | Leaf gas exchange campaigns, including spot measurements. |
ACIGAS | Leaf A-Ci curves. |
BIOMASS | All biomass or related measurements, including final harvest biomass, SLA measurements, and leaf counts. |
SOILS | Soil chemistry, soil respiration, etc. |
TDL | Any tuneable diode laser related measurements. |
MICROMET | Any micro-meteorological measurements, including PAR and LAI. |