GIS Guide to Good Practice
3.10 Issues to consider when structuring and organising a flexible attribute database
When structuring and organising a flexible attribute database, the following factors are of critical importance. Each is considered in turn below.
3.10.1 Naming conventions
Try to keep field names descriptive rather than cryptic. The crib
sheet for decoding cryptic names may easily get lost, and your
fields are likely to be too numerous for you to remember
their contents easily.
Key fields are the most important fields in your attribute database
and are the fields that will be used for primary searching of
the database and/or for linking tables within your database.
It is essential that the same data definitions are used for all
instances of the key field in your database and that the same
codes are used in each.
Take care with character field definitions. Most databases require
character data to be stored in a fixed-length form, so every record
must reserve enough space for the largest expected value, even where
this is not required for the vast majority of records. As an example,
there is no point in defining a location name field large enough to
store the longest name in Monmouthshire, Llanvihangel-Ystern-Llewern,
if the name Monmouth happens to be the longest in the data set!
Store grid references in an appropriate notation for easy transition
to a GIS or conversion to an appropriate map projection. For example,
British National Grid references are commonly held as alphanumeric
attributes in a single column, which requires some processing before
points can be mapped in a GIS. A more appropriate notation would be
two numeric columns, e.g. 456344 / 267833 for SP 5634467833.
Get in the habit of ensuring that the data entered into any field
in your attribute database makes sense. For example, check that
you haven't typed the letter 'O' instead of '0' (zero). Another
tip is to check that numeric values are within range - for example
that a slip of the old typing fingers hasn't moved your Norman
site from 1066 to 2066. It's often helpful to have someone else
validate data that you have entered as typos are more easily detected
by a fresh pair of eyes. If your data input tools allow you to define
validation checks, use them, but remember that - like spelling checkers -
they cannot catch all possible input errors.
It is best to use numeric field types rather than text fields
if you have numeric data. This can have three benefits. First,
confusing characters -- such as that familiar O (letter) instead
of 0 (zero) problem -- cannot be stored in the wrong field type.
Second, in many computer-based databases numeric information is
stored more efficiently than text and occupies less space. This
means that your GIS data set will be leaner and meaner. Third,
when data is held in numeric form the data can more readily be manipulated
with the arithmetic operators. If you are using numeric data,
also ensure that you use the most appropriate numeric type - integer
or floating point. Integer types are used for storing whole numbers
and floating point numbers are used for storing numbers which
have, or may have, a fractional part.
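As a rough illustration of the points above on grid references, validation and numeric types, the following Python sketch converts an alphanumeric British National Grid reference into two integer columns. The function name `parse_grid_reference` and the constant `GRID_LETTERS` are hypothetical, and the letter arithmetic assumes the usual National Grid lettering scheme (5 x 5 blocks of letters with I omitted), which is not described in this guide. The sketch does not check that the lettered square actually lies within Great Britain.

```python
import re

# The 25 letters used to label grid squares; the letter I is not used.
GRID_LETTERS = "ABCDEFGHJKLMNOPQRSTUVWXYZ"


def parse_grid_reference(ref: str) -> tuple[int, int]:
    """Convert e.g. 'SP 5634467833' to (easting, northing) in metres."""
    compact = ref.replace(" ", "").upper()
    # Two grid letters followed only by digits: a letter O typed in place
    # of a zero fails here instead of being stored silently.
    match = re.fullmatch(r"([A-HJ-Z]{2})(\d*)", compact)
    if match is None:
        raise ValueError(f"not a valid grid reference: {ref!r}")
    letters, digits = match.groups()
    if len(digits) % 2 != 0 or len(digits) > 10:
        raise ValueError(f"expected an even number of digits (at most 10): {ref!r}")

    first = GRID_LETTERS.index(letters[0])
    second = GRID_LETTERS.index(letters[1])
    # Position of the 100 km square, counted in 100 km steps from the origin.
    east_100km = ((first - 2) % 5) * 5 + (second % 5)
    north_100km = (19 - (first // 5) * 5) - (second // 5)

    half = len(digits) // 2
    # Pad shorter references (e.g. six-figure) with zeros to reach metres.
    easting = east_100km * 100_000 + int(digits[:half].ljust(5, "0"))
    northing = north_100km * 100_000 + int(digits[half:].ljust(5, "0"))
    return easting, northing


print(parse_grid_reference("SP 5634467833"))  # (456344, 267833)
```

Returning integers rather than text means the two values can go straight into numeric columns, and the strict pattern catches the O-for-zero slip at the point of entry.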
Where possible the fields should be set up to use dictionaries
or thesauri to ensure that typing errors are kept to a minimum
and restricted to free text fields, and that terms used to describe
real world objects are used accurately and consistently. Adhere
to established, appropriate project data standards (e.g. the RCHME/English
Heritage Urban Archaeological Database Data Standards). If no
project standards exist, adhere to the data standards of the digital
archive for your data, whether that be the Sites and Monuments Record (SMR)
or the Archaeology Data Service (ADS).
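One way to picture dictionary-controlled data entry is a check against an agreed term list, sketched below in Python. The term list `PERIOD_TERMS` and the function `check_term` are hypothetical and purely illustrative; a real project would load the terms from whichever thesaurus or data standard it has adopted.

```python
from difflib import get_close_matches

# Hypothetical term list, for illustration only.
PERIOD_TERMS = {"roman", "early medieval", "medieval", "post medieval"}


def check_term(value: str, allowed: set[str]) -> str:
    """Return the tidied term if it is in the list, otherwise raise."""
    # Collapse repeated spaces and ignore case before comparing.
    term = " ".join(value.split()).lower()
    if term in allowed:
        return term
    # Offer near matches so that a typo is easy to correct at entry time.
    suggestions = get_close_matches(term, sorted(allowed), n=3)
    raise ValueError(
        f"{value!r} is not a recognised term; close matches: {suggestions}"
    )


print(check_term("  Medieval ", PERIOD_TERMS))  # medieval
```

Rejecting an unknown term at entry time, rather than correcting it later, keeps typing errors out of the controlled fields altogether.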
Remember that your data will need a home if it is to remain a
useful and accessible resource in the future, and it is your responsibility
to ensure its compatibility with other data sets of a similar
spatial or temporal resolution.
Confidence values indicate the level of certainty that is associated with
an entry in the attribute database. For example, they can record how
certain you are that the location, identification, dating, etc. of the
object is accurate. It is very good practice to maintain this information
at all times.
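A minimal Python sketch of how such certainty might sit alongside the attributes it qualifies is given below. The three-level scale, the field names and the helper `well_located` are all assumptions made for illustration; this guide does not prescribe a particular scale.

```python
from dataclasses import dataclass
from enum import IntEnum


class Confidence(IntEnum):
    """Hypothetical three-level scale; higher means more certain."""
    UNCERTAIN = 1
    PROBABLE = 2
    CERTAIN = 3


@dataclass
class SiteRecord:
    site_id: int
    easting: int
    northing: int
    location_confidence: Confidence
    period: str
    period_confidence: Confidence


def well_located(records: list[SiteRecord],
                 minimum: Confidence = Confidence.PROBABLE) -> list[SiteRecord]:
    """Keep only records whose location is at least `minimum` certain."""
    return [r for r in records if r.location_confidence >= minimum]
```

Storing the certainty as its own coded field, one per attribute it qualifies, lets later users filter on it instead of digging it out of free-text notes.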
Try to ensure that the codes used to record your attribute data
are consistent. Ensuring consistency is especially difficult when
data entry is performed by more than one person, or if data entry
is carried out incrementally over time. The use of thesauri and
documentation standards can be helpful in ensuring consistency within
your database and between your database and others. See Appendix 2 for
a list of standards that may be appropriate.
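Where data has already been entered by several people, a quick audit can reveal codes that differ only in case, spacing or punctuation. The Python sketch below is one hedged way of doing this; the function name `find_variant_codes` and the sample codes are hypothetical.

```python
from collections import defaultdict


def find_variant_codes(values: list[str]) -> dict[str, list[str]]:
    """Group codes that differ only in case, spacing or punctuation."""
    groups: dict[str, set[str]] = defaultdict(set)
    for value in values:
        # Reduce each code to lower-case letters and digits for comparison.
        key = "".join(ch for ch in value.lower() if ch.isalnum())
        groups[key].add(value)
    # Report only those codes that appear in more than one spelling.
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


codes = ["RB", "rb", "R-B", "MED", "MED"]
print(find_variant_codes(codes))  # {'rb': ['R-B', 'RB', 'rb']}
```

Such a check only finds mechanical variants; it cannot tell whether two different codes were intended to mean the same thing, which is where an agreed thesaurus is still needed.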
Calendar dates should be recorded in a date field type rather
than a character field type to avoid the loss of crucial data when
transferring into different software packages. Be aware that some software
will not warn you if you are about to lose data due to incompatible
field types.
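The difference between a date type and a character string can be sketched in Python as follows. The function name `parse_record_date` is hypothetical, and the year-month-day input format is an assumption chosen for the example, not a requirement of this guide.

```python
from datetime import date, datetime


def parse_record_date(text: str) -> date:
    """Convert 'YYYY-MM-DD' text to a date, raising ValueError if invalid."""
    # strptime rejects impossible dates (e.g. 30 February) and other layouts.
    return datetime.strptime(text.strip(), "%Y-%m-%d").date()


start = parse_record_date("1997-03-14")
end = parse_record_date("1997-04-02")
# Real dates support arithmetic, which text strings do not.
print((end - start).days)  # 19
```

A text field would happily store `1997-02-30` or `14/03/97`; the conversion above raises an error for both, so the problem surfaces before the data is transferred to other software.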
The most important thing of all is to document the way you have
organised your database and entered information into it! All of
Section 5 is devoted to this topic.
It is essential that source-specific information is recorded
as and when data is generated, as this task becomes increasingly
difficult retrospectively. Where did the source data originate,
at what scale was it prepared, where can any earlier work it is
based on be found, and what copyright restrictions apply to its
use by a third party? What levels of accuracy were accepted and
what errors were recorded during digitization? What data standards
were adhered to (dated if possible, as revisions will occur) and
what naming conventions have been adopted?
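The questions above can be turned into a simple checklist structure so that gaps are noticed while the data is being created. The Python sketch below is only one possible layout; the class `SourceMetadata`, its field names and the helper `missing_fields` are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass
class SourceMetadata:
    """One record of source information; field names are illustrative."""
    origin: str
    source_scale: str
    derived_from: str
    copyright_restrictions: str
    accepted_accuracy: str
    digitising_errors: str
    data_standard: str
    data_standard_date: str
    naming_conventions: str


def missing_fields(meta: SourceMetadata) -> list[str]:
    """List the names of any fields that have been left blank."""
    return [name for name, value in asdict(meta).items() if not value.strip()]
```

Running `missing_fields` on each record as it is created shows at once which questions are still unanswered, while the person who knows the answers is still available.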
The right of Mark Gillings, Peter Halls, Gary Lock, Paul Miller, Greg Phillips, Nick Ryan, David Wheatley, and Alicia Wise to be identified as the Authors of this Work has been asserted by them in accordance with the Copyright, Designs and Patents Act 1988. All material supplied via the Arts and Humanities Data Service is protected by copyright, and duplication or sale of all or part of any of it is not permitted, except that material may be duplicated by you for your personal research use or educational purposes in electronic or print form. Permission for any other use must be obtained from the Arts and Humanities Data Service (info@ahds.ac.uk). Electronic or print copies may not be offered, whether for sale or otherwise, to any third party.