File naming convention
GHS36_<PROJECT>_<CHAMBER>_<VARIABLE COLLECTION CODE>_<DATA PROCESSING>_<DATE or DATERANGE>_[VERSION].extension
The individual elements that make up the naming convention are described in detail below. First, however, consider the following points:
- All terms in the filename should be upper case, apart from the file extension, which should be lower case.
- An underscore separates the different ‘elements’ of the filename. An underscore should therefore not be used anywhere else in the filename.
- Do not use illegal characters in a filename, e.g. / ? < > : * | " or a space.
- The file naming convention has been constructed to balance three needs: useful detail in the name, file uniqueness, and a filename that is as short as possible.
- GHS36: Official HIE facility code to represent data sourced from the S36 Glasshouse facility (all data from the S36 Glasshouse facility should be prefixed by use of the code ‘GHS36’).
- PROJECT: The project this data file is associated with. The project short name code should be used in the filename. If the data came from an automated sensor, ‘AUTO’ is used as the project code.
- CHAMBER: Which chamber(s) data has been sourced from. This can include data sourced from a single chamber, from all chambers, or from a subset of chambers. Rules on how to label each can be found in the ‘Chamber’ section below.
- VARIABLE COLLECTION CODE: A collection code that represents a particular grouping of variables contained within a file. A list of variables contained within different variable collection codes can be found in the ‘Collection Codes’ section below.
- DATA PROCESSING: The level of post-processing applied to the data. The different levels of post-processing are defined in the ‘Data Processing Codes’ section below.
- DATE or DATERANGE: The date range that a dataset covers (e.g. for automated time-series data) or the single date on which, for example, a sample was taken. Dates use the YYYYMMDD format, with a hyphen separating the start and end dates of a date range.
- VERSION: An optional version number of the form _Vx. This can be used when a file that already exists in HIEv has been updated or corrected while keeping the same data processing level.
- extension: The format of the data file, e.g. .csv or .dat (for TOA5 data).
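The rules above can be sketched as a small helper that assembles a filename from its elements. This is an illustrative sketch only: the function name build_filename, its parameters, and the exact set of rejected characters are assumptions made for the example, not part of the convention.

```python
from datetime import date
from typing import Optional

# Characters rejected inside an element: the illegal characters listed above,
# plus the space and the underscore (reserved as the element separator).
FORBIDDEN = set('/?<>:*|" _')


def build_filename(project: str, chamber: str, collection: str, processing: str,
                   start: date, end: Optional[date] = None,
                   version: Optional[int] = None, extension: str = "csv") -> str:
    """Assemble a GHS36 filename from its elements (hypothetical helper)."""
    elements = ["GHS36", project, chamber, collection, processing]
    for element in elements:
        if not element or any(ch in FORBIDDEN for ch in element):
            raise ValueError(f"invalid filename element: {element!r}")

    # A single date is YYYYMMDD; a range is YYYYMMDD-YYYYMMDD.
    stamp = start.strftime("%Y%m%d")
    if end is not None:
        stamp += "-" + end.strftime("%Y%m%d")
    elements.append(stamp)

    # The version element is optional and has the form Vx.
    if version is not None:
        elements.append(f"V{version}")

    # Elements are upper case; the extension is lower case.
    return "_".join(e.upper() for e in elements) + "." + extension.lower().lstrip(".")
```

For instance, build_filename("AUTO", "C06", "ENVVARS", "R", date(2015, 6, 1), date(2015, 7, 30), extension="dat") produces GHS36_AUTO_C06_ENVVARS_R_20150601-20150730.dat.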
Examples
- GHS36_AUTO_C06_ENVVARS_R_20150601-20150730.dat
- Raw automated facility data from chamber 6 within the GHS36 facility for the period 1st June 2015 to 30th July 2015, covering the variables in the ‘ENVVARS’ collection.
- GHS36_AUTO_C346_RAD_L1_20150527.dat
- Automated ‘RAD’ data for the 27th May 2015 within the GHS36 facility, and sourced from chambers 3, 4 and 6. The data has been subjected to level one cleaning and processing.
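Filenames such as the examples above can be split back into their elements with a regular expression. The sketch below is one possible approach, not an official parser: the names FILENAME_RE and parse_filename are hypothetical, and it assumes that project and collection codes contain only upper-case letters and digits.

```python
import re
from datetime import datetime

# One named group per element of the convention; end date and version are optional.
FILENAME_RE = re.compile(
    r"GHS36_(?P<project>[A-Z0-9]+)"
    r"_(?P<chamber>CA|C[0-6]+)"
    r"_(?P<collection>[A-Z0-9]+)"
    r"_(?P<processing>R|L[123])"
    r"_(?P<start>\d{8})(?:-(?P<end>\d{8}))?"
    r"(?:_V(?P<version>\d+))?"
    r"\.(?P<extension>[a-z0-9]+)"
)


def parse_filename(name: str) -> dict:
    """Split a GHS36 filename into its elements (hypothetical helper)."""
    match = FILENAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid GHS36 filename: {name!r}")
    parts = match.groupdict()
    # Convert YYYYMMDD strings to date objects; strptime rejects impossible dates.
    parts["start"] = datetime.strptime(parts["start"], "%Y%m%d").date()
    if parts["end"] is not None:
        parts["end"] = datetime.strptime(parts["end"], "%Y%m%d").date()
    if parts["version"] is not None:
        parts["version"] = int(parts["version"])
    return parts
```

Elements that are absent from a filename (the end date of a single-date file, or the version) come back as None.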
Chamber
‘Chamber’ specifies which of the 6 chambers the data in the current file was sourced from. Data sourced from…
- a single chamber will use Cxx (where xx is a two-digit number from 01 to 06), e.g. C01 for chamber 1, C02 for chamber 2, etc.
- all chambers will use CA.
- a subset of chambers will use C followed by an ordered list of chamber numbers, up to a maximum of 5 chambers (when all 6 chambers are used, ‘CA’ should be used). For example:
- C12 contains data sourced from chambers 1 and 2.
- C346 contains data sourced from chambers 3, 4 and 6.
- C23456 contains data sourced from all chambers apart from chamber 1.
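The chamber labelling rules above can be expressed as a pair of conversion functions. This is a minimal sketch: the names chamber_code and chambers_from_code are hypothetical, and it assumes that the ‘ordered list’ of chambers is in ascending order, as in the examples.

```python
from typing import Iterable, List

ALL_CHAMBERS = [1, 2, 3, 4, 5, 6]


def chamber_code(chambers: Iterable[int]) -> str:
    """Return the chamber element for a set of chamber numbers (hypothetical helper)."""
    selected = sorted(set(chambers))
    if not selected or any(c not in ALL_CHAMBERS for c in selected):
        raise ValueError(f"chambers must be numbers from 1 to 6: {selected}")
    if selected == ALL_CHAMBERS:
        return "CA"                      # all chambers
    if len(selected) == 1:
        return f"C{selected[0]:02d}"     # single chamber, zero-padded
    return "C" + "".join(str(c) for c in selected)  # subset of 2 to 5 chambers


def chambers_from_code(code: str) -> List[int]:
    """Return the chamber numbers that a chamber element refers to."""
    if code == "CA":
        return list(ALL_CHAMBERS)
    digits = code[1:]
    if not code.startswith("C") or not digits.isdigit():
        raise ValueError(f"invalid chamber code: {code!r}")
    if len(digits) == 2 and digits[0] == "0":
        numbers = [int(digits)]          # Cxx form, e.g. C06
    else:
        numbers = [int(d) for d in digits]
    valid = (
        all(n in ALL_CHAMBERS for n in numbers)
        and numbers == sorted(set(numbers))
        and (len(numbers) == 1) == (len(digits) == 2 and digits[0] == "0")
        and len(numbers) <= 5
    )
    if not valid:
        raise ValueError(f"invalid chamber code: {code!r}")
    return numbers
```

The leading zero is what distinguishes a single chamber (C01) from a two-chamber subset (C12), so the decoder checks for it before splitting the digits.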
Data Processing Codes
The meanings of the different data processing codes used in GHS36 filenames are as follows:
- R (Raw Data): Data that has been directly extracted from an instrument and that has not been subjected to any data cleaning or post-processing.
- L1 (Level One Data): Data that has been cleaned and processed, but in a cursory manner. Some erroneous data may be included.
- L2 (Level Two Data): Data that has been rigorously cleaned, processed, and checked for quality control.
- L3 (Level Three Data): Archive of published datasets.
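When checking filenames in a script, the codes above can be held in a simple lookup table. The sketch below is illustrative only; the names PROCESSING_LEVELS and processing_level are hypothetical.

```python
# Data processing codes and their names, as listed above.
PROCESSING_LEVELS = {
    "R": "Raw Data",
    "L1": "Level One Data",
    "L2": "Level Two Data",
    "L3": "Level Three Data",
}


def processing_level(code: str) -> str:
    """Return the name of a data processing code, or raise ValueError if unknown."""
    try:
        return PROCESSING_LEVELS[code.upper()]
    except KeyError:
        raise ValueError(f"unknown data processing code: {code!r}") from None
```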
Collection Codes
Please edit the table as you see fit! This is a quick draft.
Collection code | Description |
ENVVARS | Raw data exported directly from the monitoring program PlantVisor. The environmental variables recorded include Temperature Reading (°C), Humidity Reading (%) and CO2 Reading (ppm). Each variable is recorded at 5-minute intervals. The chamber number (01-06) and date (YYYYMMDD) are included in the filename. To view the file properly, it will need to be converted from text to columns. |