The Format Rules
This section sets out the Rules that define the AGS Format. The Rules have been the subject of much discussion and these notes seek to explain the overall framework within which they are formulated.
A fundamental consideration has been that potential users of the Format should be able to use standard software tools to produce the data files. The spreadsheet is the most basic tool for the task, readily allowing data “tables” to be created and ASCII data files to be produced. Likewise, data files produced according to the Rules can be read directly by spreadsheet software. Although the Rules make it possible for users to manipulate AGS data files using spreadsheets alone, it is to be expected that more specific software will become available to automate the reading and writing of the data files. These software systems may range from simple data entry and edit programs through to complete database systems with data translation modules for AGS files.
Another fundamental point to bear in mind when assessing these Rules is that the resulting data file has been designed to be easy for the computer to read. The data files do not replace the printed reports which they accompany. However, the layout does allow data items to be readily identified should the need arise.
The following notes explain some points of detail in the Rules.
ASCII ‘CSV’ Files
The Rules define ASCII data files of a type commonly referred to as CSV (Comma Separated Value). This type of file is readily produced and read by many spreadsheet (and other) systems. The data items are separated by commas and are surrounded by double quotes (").
Numeric and Character Data – Delimiters
The Rules permit any data field to contain text, since this allows characters in numeric fields and caters for those countries which use the comma in place of the decimal point. For these reasons ALL data fields must be surrounded by quotes. When inputting data to a spreadsheet, prefix all numeric entries with a quote. In this way all the data fields will be stored as text and CSV output will produce quotes around all items.
Note that most spreadsheet and database systems provide a VALUE( ) function (or similar) to convert text data to numeric data. This function can be used where calculations need to be carried out on data imported from AGS files.
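As an illustration of the two notes above, the following is a minimal sketch in Python, assuming the standard csv module is used in place of a spreadsheet. The function names write_ags_line and to_number are hypothetical and are not part of the Format. The first writes every field in quotes; the second plays the part of a VALUE( ) function and accepts a comma in place of the decimal point.

```python
import csv
import io


def write_ags_line(values):
    """Return one CSV line with every field enclosed in double quotes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(values)
    return buffer.getvalue().rstrip("\n")


def to_number(text):
    """Convert a text field to a float, accepting a decimal comma.

    Returns None for null fields or fields that hold non-numeric text.
    """
    cleaned = text.strip().replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


line = write_ags_line(["501", "1.2", "2.4"])
print(line)               # "501","1.2","2.4"
print(to_number("1,2"))   # 1.2
print(to_number(""))      # None
```

This sketch treats every comma inside a field as a decimal separator, so a value written with thousands separators would be reported as non-numeric rather than converted.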
Key, Common & Additional Fields
The data fields defined by the Format fall into one of three categories:
KEY fields must be included every time a data group appears in a data file.
COMMON fields are those fields that are expected to be used in most data files.
ADDITIONAL fields are those fields that are expected to be used less frequently.
The Rules define a scheme for producing continuation lines where there are long data fields. Although the scheme may seem complex at first sight, it is the system automatically produced by spreadsheets if the long data items are continued on additional rows IN THE SAME DATA COLUMN. Similarly, these data files will read into spreadsheets and preserve the long data items in their correct column order, for any length of data. It should be noted that spreadsheets impose a finite limit (eg. 240) on the number of characters within a single data field. The special <CONT> symbol must appear in the HOLE_ID field, and thus <CONT> should never be used as a HOLE_ID.
Details of the default units to be used for each of the Data Fields are given in Appendices 2 and 3 of the Electronic Transfer of Geotechnical Data from Ground Investigations. These are the preferred units for each of the data dictionary definitions and should be used wherever possible. They will either be the appropriate SI units or the unit defined by the particular British Standard relating to that specific item of data. It is recognised that situations will occur where neither the SI unit nor the British Standard unit is being used. Provision is made for these non-standard data units to be declared in the data transfer file.
The following rules must be used when creating a data interchange file.
The data file shall be entirely composed of ASCII characters. The extended character set may be used.
Each data file shall contain one or more data GROUPs. Each data GROUP contains related data.
Within each GROUP, data items are contained in data FIELDs. Each data FIELD contains a single data VARIABLE. Each line of the data interchange file can contain several data FIELDs.
The order of data FIELDs on each line within a GROUP is defined at the head of each GROUP by a set of data HEADINGs.
Data HEADINGs and GROUP names must be taken from the approved Data Dictionary.
The data HEADINGs fall into one of 3 categories: KEY / COMMON / ADDITIONAL
KEY fields must appear in each GROUP, but may contain null data (see Rule 15). These are necessary to uniquely define the data. *HOLE_ID should always be the first field except in the "**PROJ" GROUP, where "*PROJ_ID" should be the first field.
All data VARIABLEs can contain any alphanumeric data (ie. both text and numbers). Numerical data should be in numerals. Eg. 10 not TEN. (See also Note 2).
Note that all numerals must be presented as a text field.
Data GROUP names, data field HEADINGs and data VARIABLEs must be enclosed in double quotes ("..."). The double quote character itself (eg. as the symbol for inches or seconds) must not appear as part of the data variable.
The data field HEADINGs and data VARIABLEs on each line of the data file should be separated by a comma (,).
Each GROUP name shall be preceded by 2 asterisks (**). Eg.
HEADINGs shall be preceded by 1 asterisk (*). Eg.
No line of data HEADINGs or data VARIABLEs shall exceed 240 characters. The character count should include delimiting quotes and commas. Eg.
"*HOLE_ID","*HOLE_NATE" = 23 characters
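The character count can be checked mechanically. The following is a minimal sketch in Python; the names MAX_LINE_LENGTH and line_length_ok are hypothetical, and the line ending is assumed not to count towards the limit, which the Rules do not state explicitly.

```python
MAX_LINE_LENGTH = 240  # limit given in the Rules


def line_length_ok(line):
    """Return True if the line, without its line ending, is within the limit."""
    return len(line.rstrip("\r\n")) <= MAX_LINE_LENGTH


# Delimiting quotes and commas are included in the count.
example = '"*HOLE_ID","*HOLE_NATE"'
print(len(example))             # 23
print(line_length_ok(example))  # True
```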
A line of data HEADINGs exceeding 240 characters can be continued on immediately following lines. A data HEADING must not itself be split between lines. A comma must be placed at the end of a HEADINGs line that is to be continued. Eg.
A line of data VARIABLEs exceeding 240 characters must be continued on immediately following lines. Data VARIABLEs can be split between lines. A VARIABLE continuation line shall begin with the special name "<CONT>" in place of the first data VARIABLE (PROJ_ID or HOLE_ID). The continued data is then placed in the correct field order by inserting the appropriate number of Null data VARIABLEs before it. Note that each line of data in a GROUP should contain the same number of VARIABLEs. Eg.
"501","1.2","2.4","Very stiff brown CLAY with",""
"<CONT>","","","extremely closely spaced fissures","CLAY"
(See also Note 4)
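The continuation scheme can be reversed when a file is read. The following is a minimal sketch in Python, assuming the standard csv module. The function name merge_continuations is hypothetical. The Rules as given here do not say whether a space is inserted where a VARIABLE is split, so the single-space separator is an assumption and is exposed as a parameter.

```python
import csv


def merge_continuations(lines, separator=" "):
    """Join "<CONT>" lines onto the preceding data line, field by field.

    The separator placed between the two parts is an assumption of this sketch.
    """
    rows = []
    for fields in csv.reader(lines):
        if fields and fields[0] == "<CONT>" and rows:
            previous = rows[-1]
            for i, value in enumerate(fields[1:], start=1):
                if not value or i >= len(previous):
                    continue  # null placeholder, or no matching field
                if previous[i]:
                    previous[i] = previous[i] + separator + value
                else:
                    previous[i] = value
        else:
            rows.append(list(fields))
    return rows


lines = [
    '"501","1.2","2.4","Very stiff brown CLAY with",""',
    '"<CONT>","","","extremely closely spaced fissures","CLAY"',
]
rows = merge_continuations(lines)
print(rows[0][3])  # Very stiff brown CLAY with extremely closely spaced fissures
print(rows[0][4])  # CLAY
```

Passing separator="" joins the parts without a space, for files where the split falls in the middle of a word.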
Null data VARIABLEs must be included as 2 consecutive double quotes (""). Eg.
Data GROUPs can be repeated within a file with different HEADINGs.
(See also Note 2)
The number of data HEADINGs per GROUP shall not exceed 60.
If non-standard units are to be used for any data VARIABLEs in a GROUP then a UNITS line must be placed immediately after the HEADINGs line. An entry must be made for each data VARIABLE. Null entries ("") must be used for data VARIABLEs that are in standard units. The non-standard units must be enclosed in double quotes. The line must begin with the special name <UNITS> in place of the first data VARIABLE (PROJ_ID or HOLE_ID). Eg.
(See also Note 5)
Each data file shall contain the “**PROJ” GROUP.
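Taken together, the rules above describe a file that can be read with a short routine. The following is a minimal sketch in Python, assuming the standard csv module; it is not a validator and does not check the Data Dictionary. The function name parse_ags, the "**HOLE" group, the "ft" unit and all data values in the sample are hypothetical and are used for illustration only. Groups are kept in a list rather than keyed by name because a GROUP may be repeated within a file.

```python
import csv


def parse_ags(text, separator=" "):
    """Parse AGS-style text into a list of groups, in file order."""
    groups = []
    current = None
    for fields in csv.reader(text.splitlines()):
        if not fields:
            continue  # blank line
        first = fields[0]
        if first.startswith("**"):
            # GROUP name: two leading asterisks
            current = {"name": first[2:], "headings": [], "units": {}, "rows": []}
            groups.append(current)
        elif current is None:
            continue  # lines before any GROUP name are ignored in this sketch
        elif first.startswith("*"):
            # HEADINGs line; a trailing comma gives an empty last field, dropped here
            current["headings"].extend(f[1:] for f in fields if f)
        elif first == "<UNITS>":
            # non-standard units, aligned with the headings after the first
            for heading, unit in zip(current["headings"][1:], fields[1:]):
                if unit:
                    current["units"][heading] = unit
        elif first == "<CONT>" and current["rows"]:
            # continuation of the previous data line (separator is an assumption)
            row = current["rows"][-1]
            for heading, value in zip(current["headings"][1:], fields[1:]):
                if value:
                    existing = row.get(heading, "")
                    row[heading] = existing + separator + value if existing else value
        else:
            current["rows"].append(dict(zip(current["headings"], fields)))
    return groups


# Hypothetical sample file; names and values are illustrative only.
sample = '''"**PROJ"
"*PROJ_ID"
"P1"

"**HOLE"
"*HOLE_ID","*HOLE_NATE"
"<UNITS>","ft"
"501","1234.5"
'''

groups = parse_ags(sample)
has_proj = any(group["name"] == "PROJ" for group in groups)
print([group["name"] for group in groups])  # ['PROJ', 'HOLE']
print(groups[1]["units"])                   # {'HOLE_NATE': 'ft'}
print(has_proj)                             # True
```

All values are left as text, in keeping with the rule that numerals are presented as text fields; conversion to numbers is left to the caller.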