MANAGE AND SHARE DATA


DATA STORAGE

The storage of any data within research projects should be based on two principles, both of which are key to the UKDA's own preservation policy:
  • digital storage media are inherently unreliable unless they are stored appropriately
  • all file formats and physical storage media will ultimately become obsolete

For most researchers these principles may appear to overstate the case. For the UKDA, however, where our business is the long-term preservation of digital data, these principles do not seem hyperbolic, and we base our recommendations to researchers on them.

Data storage should be taken seriously from the outset of a research project, so that valuable data resources are safely stored during the project and beyond. This is essential if data are to be preserved successfully in the long term. Those creating, storing and curating data should also consider the non-digital aspects of data storage.

Data/file formats

The UKDA produces a guide to acceptable data formats for preservation. These formats are highly recommended for general use in any data storage environment, although it is not necessary to adhere rigidly to them in a non-archival environment. The UKDA recommends that, wherever possible, data should be stored in formats which meet long-term readability requirements. In general this means non-proprietary formats or formats based on open standards. Some proprietary formats, such as Microsoft's Rich Text Format and Excel, are nevertheless widely used and likely to remain accessible for a reasonable time after any version has become obsolete.

A file extension does not necessarily reflect the file type, so a file with the extension .doc was not necessarily created with MS Word software, nor will it necessarily open successfully in that software.
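One way to see what a file really contains is to look at its first few bytes rather than its name. The following is a minimal sketch in Python, not a UKDA tool: the function name sniff_format and the short signature table are our own illustrative assumptions, and a real format-identification tool would recognise far more formats.

```python
# A few well-known file signatures ("magic numbers").
# This table is illustrative only and is far from complete.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"{\\rtf", "Rich Text Format document"),
    (b"PK\x03\x04", "ZIP container (e.g. .docx, .xlsx, .odt)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 container (e.g. legacy .doc, .xls)"),
]


def sniff_format(path):
    """Guess a file's format from its leading bytes, ignoring its extension."""
    with open(path, "rb") as handle:
        header = handle.read(8)  # the longest signature above is 8 bytes
    for magic, description in SIGNATURES:
        if header.startswith(magic):
            return description
    return "unknown"
```

A file named notes.doc whose contents begin with {\rtf would be reported as a Rich Text Format document, which illustrates why the extension alone should not be trusted.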

Documentation

Comprehensive and accurate documentation is essential for informed and accurate use (and re-use) of the data. See data documentation and metadata. Making digital versions of paper documentation in PDF/A format is recommended for long-term security.

Storage media

The accessibility of any data is dependent on the quality of the storage media on which they are stored and the availability of the relevant data-reading equipment. An Amstrad floppy disc may still work perfectly 20 years after it was made, but the paucity of working machines means that the data on this disc may not be easily recoverable.

All optical media (CDs and DVDs) are subject to physical degradation. Best practice suggests that, wherever possible, data files should be copied to a new CD or DVD between two and five years after the original disc was created. It is also good practice to check the data files on these discs at regular intervals.

Magnetic media, like hard drives or tapes, are also subject to physical degradation. Best practice suggests a migration strategy similar to that described above for optical media.
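When files are migrated to a new medium, it is worth confirming that each copy is identical to its original before the old medium is retired. The sketch below shows one possible approach using only the Python standard library. The function name migrate_file and the idea of passing a destination folder are illustrative assumptions, not part of any UKDA procedure.

```python
import filecmp
import shutil
from pathlib import Path


def migrate_file(source, destination_dir):
    """Copy source into destination_dir and confirm the copy is identical."""
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / Path(source).name

    # copy2 also tries to preserve timestamps and other file metadata.
    shutil.copy2(source, target)

    # shallow=False forces a byte-by-byte comparison of the two files
    # instead of relying on file size and modification time alone.
    if not filecmp.cmp(source, target, shallow=False):
        raise OSError(f"Copy of {source} does not match the original")
    return target
```

The original should only be discarded once the function has returned without raising an error.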

The National Preservation Office has published guidelines on Caring for CDs and DVDs.

We recommend that any storage strategy, even for a short-term project, should involve at least two different forms of storage and that, whichever forms are chosen, data integrity should be checked periodically.
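A common way to check data integrity periodically is to record a checksum for every file when the data are first stored, and to recompute and compare the checksums later. The following Python sketch illustrates the idea under stated assumptions: the function names are our own, SHA-256 is simply one reasonable choice of checksum, and the manifest is assumed to be kept somewhere other than the folder it describes.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file's path (relative to data_dir) to its checksum."""
    data_dir = Path(data_dir)
    return {
        path.relative_to(data_dir).as_posix(): file_digest(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def check_against_manifest(data_dir, manifest):
    """Return (missing, changed) lists of file names from the manifest."""
    data_dir = Path(data_dir)
    missing, changed = [], []
    for relative_name, expected in manifest.items():
        path = data_dir / relative_name
        if not path.is_file():
            missing.append(relative_name)
        elif file_digest(path) != expected:
            changed.append(relative_name)
    return missing, changed
```

The same manifest can be used to check each of the two storage copies, so that a file damaged on one medium can be restored from the other.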

Physical conditions

Most researchers do not need to maintain data storage systems that rely on industry-standard operating systems and adhere to international information security standards. However, areas and rooms designated for storage of digital or non-digital data should be suitable for that purpose. The conditions under which data are stored significantly affect their longevity. Optical media, like CDs and DVDs, are vulnerable to: poor handling; changes in temperature; changes in relative humidity; air quality; and lighting conditions.

Magnetic media, like hard drives, are equally sensitive to their physical environment. A personal computer is more likely to suffer from a fatal crash in a stiflingly hot office than in a temperature-controlled environment.

Printed materials and photographs are subject to degradation from sunlight and acid (e.g. from sweat on skin and in some kinds of paper). High quality media should be used for preparing paper-based materials from the outset, or for copies of originals, for example, using acid-free paper, folders and boxes and non-rust paperclips, rather than staples.

Data should be well-organised, easily located and physically accessible. Storage rooms should be structurally sound, free from the risk of flood and, as far as possible, free from the risk of fire.

Data security

Data security is the protection of any data from unauthorised access, use, change, disclosure and destruction. Further details relating to data security are discussed on the security and controlled access to data pages.

Confidentiality

Storage of data may raise issues of confidentiality and consent. The risks of identification of personal information are typically managed through the anonymisation of data and the provision of access through a dedicated rights management framework. See the consent, confidentiality and ethics in data sharing section. It is important, however, to be aware of the risks of storing personal data.

Legally, data which contain personal information must be treated with more care than data which do not. From mid-2008 financial penalties can be enforced for the wilful circulation of personal data.

Signed consent forms, which usually contain identifying information, should be stored separately from the data, although an anonymous ID system can help link the two sets of materials together if required (e.g. for re-contacting purposes).
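An anonymous ID system of this kind can be as simple as a random identifier assigned to each participant, with the table linking identifiers to names stored alongside the consent forms rather than with the data. The Python sketch below is one possible illustration. The function names, the "P" prefix and the CSV layout are assumptions made for the example, and it assumes that each participant name appears only once.

```python
import csv
import secrets


def assign_study_ids(participant_names):
    """Give each participant a random ID that reveals nothing about them."""
    linking_table = {}
    used_ids = set()
    for name in participant_names:
        # Random IDs avoid leaking information such as recruitment order.
        study_id = "P" + secrets.token_hex(4)
        while study_id in used_ids:  # extremely unlikely, but be safe
            study_id = "P" + secrets.token_hex(4)
        used_ids.add(study_id)
        linking_table[name] = study_id
    return linking_table


def write_linking_table(linking_table, path):
    """Save the name-to-ID table; keep this file apart from the data."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["participant_name", "study_id"])
        writer.writerows(linking_table.items())
```

Only the study ID would then appear in the data files, while the linking table is kept with the signed consent forms under the same access restrictions.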

Managing and Sharing Data
a best practice guide for researchers

PDF of Managing and Sharing booklet

Printed copies of the brochure are available on request from publicity enquiries.