CREATE & MANAGE DATA
STORING YOUR DATA
A data storage strategy is important because digital storage media are inherently unreliable and all file formats and physical storage media will ultimately become obsolete.
Media currently available for storing data files are optical media (CDs and DVDs) and magnetic media (hard drives and tapes).
At the Archive, where our business is the long-term preservation of digital data, we recommend that researchers follow the same storage principles we apply ourselves. Storage of data, both digital and non-digital, from research projects should be taken seriously from the start of research.
What data or file formats should I use to store data?
Our guide to preferred file formats for data preservation, which meet long-term readability requirements, gives advice on best file formats for various data types. Best formats are generally non-proprietary formats or formats based on open standards.
However, some proprietary formats, like Microsoft's Rich Text Format and Excel, are widely used and likely to be accessible for a reasonable time after any version has become obsolete. These formats can be acceptable where an open alternative is impractical, but non-proprietary formats remain the better choice for long-term storage.
A file extension does not necessarily refer to the file type, so a file with the extension .doc was not necessarily created with MS Word software, nor will it necessarily open successfully in that software.
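One way to see what a file really is, regardless of its extension, is to look at its first few bytes. The sketch below is illustrative only: the function name sniff_format and the short signature table are assumptions made for this example, not part of any Archive tool, and the table covers just a handful of common formats.

```python
# Illustrative sketch: guess a file's format from its leading bytes
# rather than trusting its extension. The signature list is deliberately
# short; real identification tools know many more formats.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"{\\rtf", "Rich Text Format"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 container (e.g. legacy .doc or .xls)"),
    (b"PK\x03\x04", "ZIP container (e.g. .docx, .xlsx, .odt)"),
]


def sniff_format(path):
    """Return a label for the first matching signature, or 'unknown'."""
    with open(path, "rb") as handle:
        head = handle.read(8)  # the longest signature above is 8 bytes
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    return "unknown"
```

A file saved as report.doc that actually contains Rich Text Format would be reported as "Rich Text Format" by this check, which is a hint that the extension alone should not be relied on when planning storage.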
What kind of documentation or metadata do I need to accompany stored data?
Comprehensive and accurate documentation is essential for informed and accurate use (and re-use) of data at any time in the future. Our section on data documentation and metadata provides detailed advice. Making digital versions of paper documentation in PDF/A format is recommended for long-term preservation.
What storage media should I choose?
The accessibility of any data is dependent on the quality of the storage media on which they are stored and the availability of the relevant data-reading equipment. An Amstrad floppy disc may still work perfectly 20 years after it was made, but the paucity of working machines means that the data on this disc may not be easily recoverable.
All optical media are vulnerable to poor handling, changes in temperature, relative humidity, air quality and lighting conditions. Data files should therefore be copied to new media between two and five years after they were first created. Additionally, it is good practice to check, at regular intervals, the data files on these discs. The National Preservation Office has published guidelines on Caring for CDs and DVDs.
Magnetic media, like hard drives or tapes, are also subject to physical degradation and should be regularly migrated to fresh media.
We recommend that any storage strategy, even for a short-term project, should involve at least two different forms of storage, e.g. on hard drive and on CD. Whichever form is chosen, the data integrity should be checked periodically.
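A periodic integrity check can be as simple as recording a checksum for every file when the data are first stored and comparing against it later. The following is a minimal sketch, assuming SHA-256 checksums and a manifest held in memory; the function names are hypothetical and the choice of checksum is ours, not an Archive requirement.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of one file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (by relative path) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List files from the manifest that are now missing or altered."""
    current = build_manifest(root)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

In practice the manifest would be saved alongside each copy of the data (for example as a text file) so that both the hard drive copy and the CD copy can be checked against the same record.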
What physical conditions do I need to provide for storing data?
Areas and rooms designated for storage of digital or non-digital data should be suitable for the purpose for which they are being used. Data should be well-organised, clearly labelled, easily located and physically accessible. The storage rooms should be structurally sound and free from the risk of flood and as far as possible from the risk of fire.
Most researchers do not need to maintain data storage systems that rely on industry-standard operating systems and adhere to international information security standards. However, the conditions under which data are stored significantly affect their longevity.
Optical media are vulnerable to poor handling, changes in temperature, changes in relative humidity, air quality and lighting conditions. Magnetic media, like hard drives, are equally sensitive to their physical environment. A personal computer is more likely to suffer from a fatal crash in a stiflingly hot office than in a temperature-controlled environment.
Printed materials and photographs are subject to degradation from sunlight and acid, e.g. from sweat on skin or from some kinds of paper. High quality media should be used for preparing paper-based materials for storage, or for copies of originals. Examples include acid-free paper, folders and boxes, and non-rust paperclips rather than staples.
How do I ensure my data are stored under adequate security?
Data security is the protection of any data from unauthorised access, use, change, disclosure and destruction.
How should I store confidential data?
Storage of data that are considered confidential or sensitive may need to be addressed during consent procedures, to inform the people to whom the data belong how and why the data will be stored. The risks of identification of personal information are typically managed through the anonymisation of data and the regulation of access through a dedicated rights management framework.
It is important to be aware of the risks of storing personal data. Legally, data which contain personal information must be treated with more care than data which do not. From mid-2008 financial penalties can be enforced for the wilful circulation of personal data. Personal information can be removed from data files and stored separately under more stringent security measures.
Signed consent forms or other non-digital records may contain identifying information and should be stored separately from data files, although an anonymous ID system can help link the two sets of materials together if required (e.g. for re-contacting purposes).
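The idea of separating personal information from data files while keeping an anonymous ID to link them can be sketched as follows. This is a hedged illustration only: the function name split_identifiers, the field name anon_id and the use of random hexadecimal IDs are all assumptions made for the example, and removing direct identifiers in this way does not by itself guarantee that the remaining data are anonymous.

```python
import secrets


def split_identifiers(records, identifying_fields):
    """Split each record into a data row and a separately stored key row.

    Both rows share a random anon_id, so they can be re-linked later
    (e.g. for re-contacting), but the data rows no longer carry the
    identifying fields themselves.
    """
    data_rows, key_rows = [], []
    for record in records:
        anon_id = secrets.token_hex(8)  # random ID, not derived from the person
        data_row = {"anon_id": anon_id}
        key_row = {"anon_id": anon_id}
        for field, value in record.items():
            target = key_row if field in identifying_fields else data_row
            target[field] = value
        data_rows.append(data_row)
        key_rows.append(key_row)
    return data_rows, key_rows
```

The data rows can then be stored with the research files, while the key rows are kept in a separate location under more stringent security, in the same way as signed consent forms.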