File naming convention
FACE_<PROJECT>_<RING>_<VARIABLE COLLECTION CODE>_<DATA PROCESSING>_<DATE or DATERANGE>_[VERSION].extension
The individual elements that make up the naming convention are described in detail below. First, note the following points:
- All terms in the filename should be upper case, apart from the file extension, which should be lower case.
- An underscore separates the different ‘elements’ of the filename, so an underscore should not be used anywhere else in the filename.
- Do not use illegal characters in a filename, e.g. / ? < > \ : * | " or a space.
- The file naming convention is designed to balance the amount of useful detail in the name against the need to ensure file uniqueness and to keep the filename as short as possible.
The elements of the filename are:
- FACE: Official HIE facility code representing data sourced from the EucFACE facility (all data from the EucFACE facility should be prefixed with the code ‘FACE’).
- PROJECT: The official EucFACE project that this data file is associated with. The project code should be used in the filename (a capital P followed by a 4-digit project number, e.g. P0003, P0012, etc). If the data came from an automated sensor, ‘AUTO’ is used as the project code.
- RING: Which ring(s) data has been sourced from. This can include data sourced from a single EucFACE ring, from all rings, or from a subset of rings. It also allows the file to refer to EucFACE data collected on a non-ring site. Rules on how to label each can be found in the ‘Ring’ section below.
- VARIABLE COLLECTION CODE: A collection code that represents a particular grouping of variables contained within a file. A list of variables contained within different variable collection codes can be found on the associated EucFACE ‘Variable Collection Codes’ page.
- DATA PROCESSING: The level of post-processing applied to the data. The definitions of the different levels of post-processing can be found in the ‘Data Processing Codes’ section below.
- DATE or DATERANGE: Either the date range that a dataset covers (e.g. for automated timeseries data) or the single date on which, for example, a sample was taken. Dates are in YYYYMMDD format, with a hyphen separating the start and end dates of a date range.
- VERSION: An optional version number of the form _Vx. Use this when a file that already exists within HIEv has been updated or corrected without changing its data processing level.
- extension: The format of the data file, e.g. .csv, .dat (for TOA5 data), etc.
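The element rules above can be sketched as a regular expression that splits a filename into its parts. This is a minimal illustration, not an official validator: the names `FILENAME_RE` and `parse_filename` are hypothetical, and the pattern assumes that a variable collection code consists of upper-case letters and digits, that the version number is an integer, and that the extension is lower-case letters and digits.

```python
import re

# Assumptions (not stated in the convention itself): the variable collection
# code is upper-case letters/digits, the version is an integer, and the
# extension is lower-case letters/digits. The ring pattern does not check
# that the ring digits are ordered or unique.
FILENAME_RE = re.compile(
    r"FACE"
    r"_(?P<project>P\d{4}|AUTO)"
    r"_(?P<ring>RA|R0|R[1-6]{1,5})"
    r"_(?P<collection>[A-Z0-9]+)"
    r"_(?P<processing>R|L[1-3])"
    r"_(?P<start>\d{8})(?:-(?P<end>\d{8}))?"
    r"(?:_V(?P<version>\d+))?"
    r"\.(?P<extension>[a-z0-9]+)"
)


def parse_filename(name):
    """Return a dict of filename elements, or None if the name does not match."""
    match = FILENAME_RE.fullmatch(name)
    return match.groupdict() if match else None
```

For a single-date file the `end` entry is None, and for a file without a version suffix the `version` entry is None.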
Examples
- FACE_AUTO_R3_FLUX_R_20130630.dat
  - Automated (raw) facility data from Ring 3 within the EucFACE facility in the ‘chunking’ period up to 30th June 2013 and covering variables included in the ‘FLUX’ category of variables.
- FACE_P0005_R346_RAD_L1_20130527.dat
  - ‘RAD’ data for the 27th May 2013, created as part of project P0005 within the EucFACE facility, and sourced from rings 3, 4 and 6. The data has been subjected to level one cleaning and processing.
- FACE_P0005_R346_RAD_L1_20130527_V2.dat
  - Corrected version of the above file (due to an error in the cleaning script).
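Filenames like these can also be assembled programmatically. The following is a sketch only: `build_filename` and its parameters are hypothetical names, and the function does not check that the individual codes are valid.

```python
from datetime import date


def build_filename(project, ring, collection, processing,
                   start, end=None, version=None, extension="csv"):
    """Join the filename elements with underscores (illustrative helper)."""
    # Dates use YYYYMMDD; a hyphen separates the start and end of a range.
    date_part = start.strftime("%Y%m%d")
    if end is not None:
        date_part += "-" + end.strftime("%Y%m%d")

    parts = ["FACE", project, ring, collection, processing, date_part]
    if version is not None:
        parts.append(f"V{version}")  # optional version element, e.g. V2

    # All terms are upper case; only the file extension is lower case.
    return "_".join(part.upper() for part in parts) + "." + extension.lower()
```

Calling `build_filename("P0005", "R346", "RAD", "L1", date(2013, 5, 27), version=2, extension="dat")` reproduces the third example above. Passing an `end` date as well produces a hyphenated date range.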
Ring
‘Ring’ specifies which of the 6 EucFACE plots delineated by the outer array of vent pipes has been a source location for the current data file. Data sourced from…
- a single ring will use Rx (where x is a single digit from 1 to 6), e.g. R1 for ring 1, R2 for ring 2, etc.
- all rings will use RA.
- a subset of rings will use R followed by an ordered list of rings, up to a maximum of 5 rings (when all 6 rings are used, ‘RA’ should be used). For example:
  - R12 contains data sourced from rings 1 and 2.
  - R346 contains data sourced from rings 3, 4 and 6.
  - R23456 contains data sourced from all rings apart from ring 1.
- a EucFACE sampling location outside of the 6 rings will use R0 (R followed by a zero). A description of the actual sampling location should be added in the metadata in this scenario.
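The ring labelling rules above can be expressed as a small helper. This is an illustrative sketch: the function name `ring_code` is hypothetical, and treating an empty selection as a non-ring site (R0) is an assumption made here for convenience.

```python
def ring_code(rings):
    """Return the ring element for an iterable of ring numbers (1 to 6).

    Assumption: an empty iterable denotes a sampling location outside
    the six rings and maps to R0.
    """
    selected = sorted(set(rings))  # ordered list with duplicates removed
    if not selected:
        return "R0"
    if any(ring not in range(1, 7) for ring in selected):
        raise ValueError("ring numbers must be between 1 and 6")
    if len(selected) == 6:
        return "RA"  # all six rings use RA rather than R123456
    return "R" + "".join(str(ring) for ring in selected)
```

For instance, `ring_code([6, 3, 4])` gives R346, because the rings are sorted before they are joined.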
Data Processing Codes
The meanings of the different data processing codes used in EucFACE filenames are as follows:
- R (Raw Data): Data that has been directly extracted from an instrument and that has not been subjected to any data cleaning or post-processing.
- L1 (Level One Data): Data that has been cleaned and processed, but in a cursory manner. Some erroneous data may be included.
- L2 (Level Two Data): Data that has been rigorously cleaned, processed, and checked for quality control.
- L3 (Level Three Data): Archive of published datasets.
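As a small illustration, the codes above can be held in a lookup table and read from a filename by position, since the data processing code is the fifth underscore-separated element. The names `PROCESSING_LEVELS` and `processing_level` are hypothetical.

```python
# Lookup table of the data processing codes listed above.
PROCESSING_LEVELS = {
    "R": "Raw Data",
    "L1": "Level One Data",
    "L2": "Level Two Data",
    "L3": "Level Three Data",
}


def processing_level(filename):
    """Return the processing level name encoded in a filename."""
    parts = filename.split("_")
    # Elements: FACE, PROJECT, RING, VARIABLE COLLECTION CODE, DATA PROCESSING, ...
    if len(parts) < 6 or parts[4] not in PROCESSING_LEVELS:
        raise ValueError(f"no valid data processing code in {filename!r}")
    return PROCESSING_LEVELS[parts[4]]
```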