The Format Rules
Introduction
This section sets out the Rules that define the AGS Format. The Rules
have been the subject of much discussion and these notes seek to explain
the overall framework within which they are formulated.
A fundamental consideration has been that potential users of the Format
should be able to use standard software tools to produce the data files.
The spreadsheet is the most basic tool for the task, readily allowing
data "tables" to be created and ASCII data files to be
produced. Likewise, data files produced according to the Rules can be
read directly by spreadsheet software. Although the Rules make it
possible for users to manipulate AGS data files using spreadsheets
alone, it is to be expected that more specific software will become
available to automate the reading and writing of the data files. These software
systems may range from simple data entry and edit programs through
to complete database systems with data translation modules for AGS
files.
Another fundamental point to bear in mind when assessing these Rules is
that the resulting
data file has been designed to be easy for the computer to read. The
data files do not replace the printed reports which they accompany.
However, the layout does allow data items to be readily identified should
the need arise.
Notes
The following notes explain some points of detail in the Rules.
Note 1
ASCII 'CSV' Files
The Rules define ASCII data files of a type commonly referred to as CSV
(Comma Separated Value). This type of file is readily produced and read
by many spreadsheet (and other) systems. The data items are separated by
commas and are surrounded by quotes (").
Note 2
Numeric and Character Data - Delimiters
The Rules permit any data field to contain text, since this allows
characters in numeric fields and caters for those countries which use
the comma in place of the decimal point. For these reasons ALL data
fields must be surrounded by quotes. When inputting data to a
spreadsheet, prefix all numeric entries with a quote. In this way all
the data fields will be stored as text and CSV output will produce
quotes around all items.
Note that most spreadsheet and database systems provide a VALUE( )
function (or similar) to convert text data to numeric data. This
function can be used where calculations need to be carried out on data
imported from AGS files.
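The quoting and conversion described in this Note can be sketched in Python as follows. This is an illustration only, not part of the Rules: the function names write_quoted_row and to_number are hypothetical, and treating a comma as a decimal separator on conversion is an assumption made for the example.

```python
import csv
import io


def write_quoted_row(values):
    """Return one CSV line with every field wrapped in double quotes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    # Convert every item to text first, so numbers are written as quoted strings.
    writer.writerow([str(value) for value in values])
    return buffer.getvalue().rstrip("\n")


def to_number(text):
    """Convert a text field to a float, or return None if it is not numeric.

    Assumption for this sketch: a comma in the field is a decimal comma.
    """
    try:
        return float(text.replace(",", "."))
    except ValueError:
        return None


line = write_quoted_row(["501", 1.2, "2,4"])
print(line)               # "501","1.2","2,4"
print(to_number("2,4"))   # 2.4
print(to_number("N/A"))   # None
```

The to_number helper plays the role of the spreadsheet VALUE( ) function: the file always holds text, and conversion happens only where a calculation needs it.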
Note 3
Key, Common & Additional Fields
The data fields defined by the Format fall into one of three categories:
KEY fields must be included every time a data group appears in a data
file.
COMMON fields are those fields that are expected to be used in most data
files.
ADDITIONAL fields are those fields that are expected to be used less
frequently.
Note 4
Continuation Lines
The Rules define a scheme for producing continuation lines where there
are long data fields. Although the scheme may seem complex at first
sight, it is the system automatically produced by spreadsheets if the
long data items are continued on additional rows IN THE SAME DATA
COLUMN. Similarly, these data files will read into spreadsheets and
preserve the long data items in their correct column order, for any
length of data. It should be noted that spreadsheets impose a finite
limit (eg. 240) on the number of characters within a single data
field. The special <CONT> symbol must appear in the HOLE_ID field,
and thus <CONT> should never be used as a HOLE_ID.
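As an illustration of continuing a long data item in the same data column, the following Python sketch splits one field of a row into fixed-size pieces. The function name continuation_rows and the chunk_size parameter are hypothetical; in practice the piece length would be chosen so that each finished line respects the line-length limit given in the Rules.

```python
def continuation_rows(row, column, chunk_size):
    """Split row[column] into pieces of at most chunk_size characters.

    The first piece stays on the original row. Each further piece goes on a
    new row that has "<CONT>" in the first field, the piece in the same
    column, and empty (null) fields everywhere else.
    """
    if column == 0:
        raise ValueError("the first field is reserved for <CONT>")
    text = row[column]
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)] or [""]
    first = list(row)
    first[column] = chunks[0]
    rows = [first]
    for chunk in chunks[1:]:
        continuation = [""] * len(row)
        continuation[0] = "<CONT>"
        continuation[column] = chunk
        rows.append(continuation)
    return rows


rows = continuation_rows(["501", "1.2", "2.4", "ABCDEFGHIJ", "CLAY"], 3, 4)
for r in rows:
    print(r)
# ['501', '1.2', '2.4', 'ABCD', 'CLAY']
# ['<CONT>', '', '', 'EFGH', '']
# ['<CONT>', '', '', 'IJ', '']
```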
Note 5
Units
Details of the default units to be used for each of the Data Fields are
given in Appendices 2 and 3 of the Electronic
Transfer of Geotechnical Data from Ground Investigations. These are
the preferred units for each of the data dictionary definitions and
should be used wherever possible. They will either be the appropriate SI
units or the unit defined by the particular British Standard relating to
that specific item of data. It is recognised that situations will occur
where neither the SI unit nor the British Standard unit is being used.
Provision is made for these non-standard data units to be declared in
the data transfer file.
Rules for creating data files
The following rules must be used when creating a data interchange
file.
Rule 1
The data file shall be entirely composed of ASCII characters. The
extended character set may be used.
Rule 2
Each data file shall contain one or more data GROUPs. Each data GROUP
contains related data.
Rule 3
Within each GROUP, data items are contained in data FIELDs. Each data
FIELD contains a single data VARIABLE. Each line of the data interchange
file can contain several data FIELDs.
Rule 4
The order of data FIELDs on each line within a GROUP is defined at the
head of each GROUP by a set of data HEADINGs.
Rule 5
Data HEADINGs and GROUP names must be taken from the approved Data
Dictionary.
Rule 6
The data HEADINGs fall into one of 3 categories: KEY / COMMON /
ADDITIONAL
KEY fields must appear in each GROUP, but may contain null data (see
Rule 15). These are necessary to uniquely define the data. *HOLE_ID
should always be the first field except in the "**PROJ" GROUP,
where "*PROJ_ID" should be the first field.
Rule 7
All data VARIABLEs can contain any alphanumeric data (ie. both text and
numbers). Numerical data should be in numerals. Eg. 10 not TEN. (See
also Note 2).
Note that all numerals must be presented as a text field.
Rule 8
Data GROUP names, data field HEADINGs and data VARIABLEs must be
enclosed in double quotes ("..."). Quotes ("), eg. for inches or
seconds, must not appear as part of the data variable.
Rule 9
The data field HEADINGs and data VARIABLEs on each line of the data file
should be separated by a comma (,).
Rule 10
Each GROUP name shall be preceded by 2 asterisks (**). Eg.
"**HOLE"
Rule 11
HEADINGs shall be preceded by 1 asterisk (*). Eg.
"*HOLE_ID"
Rule 12
No line of data HEADINGs or data VARIABLEs shall exceed 240 characters.
The character count should include delimiting quotes and commas. Eg.
"*HOLE_ID","*HOLE_NATE" = 23 characters
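The formatting of Rules 8 to 11 and the character count of Rule 12 can be sketched in Python as follows. The helper names (quote_join, group_line, headings_line, within_line_limit) are hypothetical and are introduced only for this illustration.

```python
MAX_LINE_LENGTH = 240  # limit stated in Rule 12


def quote_join(items):
    """Enclose each item in double quotes and separate the items with commas."""
    return ",".join('"' + item + '"' for item in items)


def group_line(name):
    """Format a GROUP name line: two leading asterisks, in quotes."""
    return '"**' + name + '"'


def headings_line(headings):
    """Format a HEADINGs line: one leading asterisk per heading."""
    return quote_join("*" + heading for heading in headings)


def within_line_limit(line):
    """Check the length of a line, counting quotes and commas."""
    return len(line) <= MAX_LINE_LENGTH


line = headings_line(["HOLE_ID", "HOLE_NATE"])
print(group_line("HOLE"))       # "**HOLE"
print(line)                     # "*HOLE_ID","*HOLE_NATE"
print(len(line))                # 23
print(within_line_limit(line))  # True
```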
Rule 13
A line of data HEADINGs exceeding 240 characters can be continued on
immediately following lines. A data HEADING must not itself be split
between lines. A comma must be placed at the end of a HEADINGs line that
is to be continued. Eg.
"*HOLE_ID","*SAMP_TOP","*SAMP_REF","*SPEC_REF",
"*CLSS_LL","*CLSS_PL","*CLSS_BDEN"
Rule 14
A line of data VARIABLEs exceeding 240 characters must be continued on
immediately following lines. Data VARIABLEs can be split between lines.
A VARIABLE continuation line shall begin with the special name
"<CONT>" in place of the first data VARIABLE (PROJ_ID
or HOLE_ID). The continued data is then placed in the correct field
order by inserting the appropriate number of Null data VARIABLEs before
it. Note that each line of data in a GROUP should contain the same
number of VARIABLEs. Eg.
"**GEOL"
"*HOLE_ID","*GEOL_TOP","*GEOL_BASE","*GEOL_DESC","*GEOL_LEG"
"501","1.2","2.4","Very stiff brown CLAY with",""
"<CONT>","","","extremely closely spaced fissures","CLAY"
(See also Note 4)
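When reading a file, continuation lines have to be folded back into the line they continue. The following Python sketch shows one way of doing so for the example above. The function name merge_continuations is hypothetical, and the separator argument is an assumption: the Rules do not state whether the pieces of a split VARIABLE are rejoined with or without a space, so the choice is left to the caller.

```python
import csv


def merge_continuations(rows, separator=""):
    """Fold "<CONT>" rows into the preceding data row, field by field."""
    merged = []
    for row in rows:
        if row and row[0] == "<CONT>":
            if not merged:
                raise ValueError("<CONT> line with no preceding data line")
            previous = merged[-1]
            # Skip field 0, which holds the <CONT> marker itself.
            for i in range(1, min(len(row), len(previous))):
                if not row[i]:
                    continue  # null field: nothing to append in this column
                if previous[i]:
                    previous[i] = previous[i] + separator + row[i]
                else:
                    previous[i] = row[i]
        else:
            merged.append(list(row))
    return merged


lines = [
    '"501","1.2","2.4","Very stiff brown CLAY with",""',
    '"<CONT>","","","extremely closely spaced fissures","CLAY"',
]
result = merge_continuations(list(csv.reader(lines)), separator=" ")
print(result[0][3])  # Very stiff brown CLAY with extremely closely spaced fissures
print(result[0][4])  # CLAY
```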
Rule 15
Null data VARIABLEs must be included as 2 consecutive double quotes
(""). Eg.
,"",
Rule 16
Data GROUPs can be repeated within a file with different HEADINGs.
(See also Note 2)
Rule 17
The number of data HEADINGs per GROUP shall not exceed 60.
Rule 18
If non-standard units are to be used for any data VARIABLES in a group
then a UNITS line must be placed immediately after the HEADINGS line. An
entry must be made for each data VARIABLE. Null entries ("")
must be used for data VARIABLES that are in standard units. The
non-standard units must be entered between " ". The line must
begin with the special name <UNITS> in place of the first data
variable (PROJ_ID or HOLE_ID). Eg.
"**GEOL"
"*HOLE_ID","*GEOL_TOP","*GEOL_BASE","*GEOL_DESC"
"<UNITS>","FEET","FATHOMS",""
(See also Note 5)
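A UNITS line of this kind can be built from the list of HEADINGs and a table of the non-standard units in use. The Python sketch below is an illustration only: the function name units_line and the dictionary argument are hypothetical.

```python
def units_line(headings, non_standard_units):
    """Build a UNITS line for one GROUP.

    headings: heading names in file order, without the leading asterisk.
    non_standard_units: mapping of heading name to its non-standard unit.
    Headings absent from the mapping receive a null entry ("").
    """
    entries = ["<UNITS>"]  # special name replaces the first data variable
    for heading in headings[1:]:
        entries.append(non_standard_units.get(heading, ""))
    return ",".join('"' + entry + '"' for entry in entries)


line = units_line(
    ["HOLE_ID", "GEOL_TOP", "GEOL_BASE", "GEOL_DESC"],
    {"GEOL_TOP": "FEET", "GEOL_BASE": "FATHOMS"},
)
print(line)  # "<UNITS>","FEET","FATHOMS",""
```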
Rule 19
Each data file shall contain the "**PROJ" GROUP.
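A simple check for this Rule is to list the GROUP names found in a file and confirm that PROJ is among them. The Python sketch below assumes, for illustration, that a GROUP name always stands alone on its line; the function names group_names and has_proj_group are hypothetical.

```python
import csv


def group_names(lines):
    """Return the GROUP names found in the given lines, in file order."""
    names = []
    for row in csv.reader(lines):
        # Assumption: a GROUP name is the only field on its line.
        if len(row) == 1 and row[0].startswith("**"):
            names.append(row[0][2:])
    return names


def has_proj_group(lines):
    """Check that the "**PROJ" GROUP is present."""
    return "PROJ" in group_names(lines)


sample = [
    '"**PROJ"',
    '"*PROJ_ID"',
    '"P1"',
    '"**HOLE"',
    '"*HOLE_ID"',
    '"501"',
]
print(group_names(sample))     # ['PROJ', 'HOLE']
print(has_proj_group(sample))  # True
```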