CATALOGUE METADATA
Metadata, or data about data, are a subset of core data documentation which provide standardised structured information about a dataset. Metadata are typically used:
- for resource discovery, providing searchable information that helps users to find existing data
- as a bibliographic record for citation
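Metadata for data catalogues are often structured according to international standards or schemes such as:
- Dublin Core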
- General International Standard Archival Description (ISAD(G))
- Metadata Encoding and Transmission Standard (METS)
- ISO 19115 for geographic information
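As an illustration of what a standardised, structured record looks like, the following is a minimal sketch of a Dublin Core record built with Python's standard library. The element names (title, creator, subject) and the namespace URI come from the Dublin Core element set; the wrapper element `record`, the helper names and any sample values are assumptions made for this example and do not reflect UKDA practice.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build a simple XML record from a mapping of Dublin Core element
    names to lists of values. The 'record' wrapper is hypothetical."""
    root = ET.Element("record")
    for name, values in fields.items():
        for value in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            element.text = value
    return root


def field_contains(record, name, term):
    """Field-specific search: is `term` present (case-insensitive)
    in any element called `name`?"""
    term = term.lower()
    return any(
        term in (element.text or "").lower()
        for element in record.findall(f"{{{DC_NS}}}{name}")
    )
```

Calling `build_dc_record({"title": ["Example Survey"], "subject": ["health", "housing"]})` and then `field_contains(record, "subject", "health")` shows how structured fields allow a search to be restricted to keywords rather than matching any text in the record.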
The use of standardised documentation in XML format brings key data documentation together into a single document, creating rich and structured content about the data. Metadata can be viewed with web browsers, can be used by extraction and analysis engines, and can enable field-specific searching. Disparate catalogues can be shared and interactive browsing tools can be used. Metadata can be harvested for data sharing in line with the Open Archives Initiative (OAI). UKDA currently uses DDI2 (version 2 of the Data Documentation Initiative standard) and all metadata are Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) compliant.
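To make the idea of harvesting more concrete, the sketch below builds an OAI-PMH `ListRecords` request and reads record identifiers from a response. The verb, the `metadataPrefix`, `from` and `set` parameters, the `oai_dc` prefix and the response namespace are defined by the OAI-PMH specification; the endpoint URL, the identifiers and the sample response are invented for illustration and are not UKDA's actual service.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://example.org/oai"

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix, from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    if set_spec:
        params["set"] = set_spec    # restrict the harvest to one set
    return base_url + "?" + urlencode(params)


def extract_identifiers(response_xml):
    """Return the record identifiers found in an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


# Invented, heavily shortened response used to exercise the parser.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:study-0001</identifier></header></record>
    <record><header><identifier>oai:example.org:study-0002</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""
```

A harvester would fetch the URL returned by `build_list_records_url(BASE_URL, "oai_dc", from_date="2008-01-01")` and pass the response body to `extract_identifiers`. A real response also carries the metadata itself and, for large result sets, a resumption token, both of which are omitted here.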
A DDI metadata record includes:
- study description - information about the context of the created dataset such as bibliographic citation of the study and data, scope of the study (topics, geography, time), methodology of data collection, sampling and processing, data access information, and information on accompanying materials
- data file description - information on data format, file type, file structure, missing data, weighting variables and software
- variable descriptions
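The three parts listed above can be pictured as a nested data structure. The following is a rough sketch only: the class and field names are chosen for readability and are not DDI element names, and the `empty_study_fields` helper is a hypothetical convenience for spotting study-level information that has not yet been supplied.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class StudyDescription:
    """Context of the dataset (illustrative field names, not DDI tags)."""
    title: str = ""
    citation: str = ""
    topics: List[str] = field(default_factory=list)
    geography: str = ""
    time_period: str = ""
    methodology: str = ""
    access_conditions: str = ""


@dataclass
class FileDescription:
    """Format and structure of one data file."""
    file_name: str
    data_format: str = ""
    missing_data_codes: List[str] = field(default_factory=list)
    weighting_variables: List[str] = field(default_factory=list)
    software: str = ""


@dataclass
class VariableDescription:
    """Name and label of a single variable."""
    name: str
    label: str = ""


@dataclass
class CatalogueRecord:
    study: StudyDescription
    files: List[FileDescription] = field(default_factory=list)
    variables: List[VariableDescription] = field(default_factory=list)

    def empty_study_fields(self) -> List[str]:
        """List study-level fields that are still empty."""
        return [f.name for f in fields(self.study) if not getattr(self.study, f.name)]
```

For a record created with only a title, topics and geography, `empty_study_fields()` would report the citation, time period, methodology and access conditions as still missing.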
UKDA DDI catalogue record
A DDI catalogue record is created by the UKDA based on information submitted by data depositors at the time of data deposit. Such information is collected via the data collection deposit form and is also enhanced by information from accompanying documentation. UKDA staff edit this into an informative and rich metadata record.
Data depositors should therefore take care to provide detailed and meaningful information when documenting data and when completing the data collection deposit form, even if it means cutting and pasting from their technical or end of award reports (such documents might not always be available to the secondary user). Care should be taken to create meaningful titles, descriptions and keywords for datasets. The better the information provided, the better the resulting metadata and therefore the more useful the archived data become. Keywords are created for the catalogue record by UKDA staff, using controlled vocabulary from UKDA's thesaurus - Humanities and Social Science Electronic Thesaurus (HASSET).
Depositors are notified when the metadata record is available in the online UKDA Data Catalogue, providing the depositor with an opportunity to check the entry and provide any new or additional information.
Publications - primary and secondary
Data depositors are encouraged to provide information about original and subsequent reports and publications based on the data archived at UKDA so that these can be added as documentation for the dataset. Such publications provide useful context information for data re-users. UKDA also collates publications based on re-use of archived data.
Examples of outputs to submit or supply with bibliographic reference are:
- end of award reports
- technical reports
- journal articles
- books and book chapters
- briefing papers
- press articles
- web sites
Bibliographic citation
All datasets archived at UKDA are accompanied by a bibliographic citation, which data users are required to include in research outputs in order to reference and acknowledge the data source accurately. A citation gives credit to the source and distributor and also includes copyright information. A recommended citation for each UKDA study is included in the 'Study Information' file that documents a dataset in the UKDA data catalogue and can be found in a file called 'UKDA_Study_XXXX_Information.htm' in the documentation table for every catalogue record. The catalogue record for the Health Survey for England, 2005 provides an example.
By treating archived datasets as bibliographic entities, publishing them in a data centre or archive catalogue and requiring them to be properly cited when used, the UKDA plays a major role in extending research and scholarship. The creation of a dataset which is properly documented and usable by other researchers deserves the same recognition and acknowledgement as published research outputs.
Bibliographic citation of data used in research identifies sources for validation and further research. Failure to cite datasets means that valuable data sources will not be indexed by bibliographical services such as social science citation indexes, and, more importantly, that other researchers who would like to analyse such data may not have sufficient information to acquire them.
















