MANAGE AND SHARE DATA


CATALOGUE METADATA

Metadata, or data about data, are a subset of core data documentation which provide standardised structured information about a dataset. Metadata are typically used:
  • for resource discovery, providing searchable information that helps users to find existing data
  • as a bibliographic record for citation
Metadata for online data catalogues or discovery portals are often structured to common standards or schemes. At the UKDA, these types of metadata are created in-house as a catalogue record for a data collection, following the international standard Data Documentation Initiative (DDI). The DDI is an XML-based descriptive metadata standard for social science data, used by almost every social science data archive in the world.

The use of standardised documentation in XML format brings key data documentation together into a single document, creating rich and structured content about the data. Metadata can be viewed with web browsers, can be used by extraction and analysis engines, and can enable field-specific searching. Disparate catalogues can be shared and interactive browsing tools can be used. Metadata can be harvested for data sharing in line with the Open Archives Initiative (OAI). UKDA currently uses DDI2 and all metadata are Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) compliant.
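
As an illustration of what OAI-PMH harvesting involves, the following minimal Python sketch builds a ListRecords request and reads record identifiers from a response. The base URL, the sample response and the function names are hypothetical and are not taken from any UKDA service; `oai_dc` is used as the metadata prefix because every OAI-PMH repository is required to support it.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (base_url is an assumption)."""
    if resumption_token:
        # A resumption token is an exclusive argument: it replaces
        # every other argument except the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def record_identifiers(response_xml):
    """Return the identifier found in each record header of a response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


# Hand-written sample response (not real archive output), trimmed to headers.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:study-1</identifier></header></record>
    <record><header><identifier>oai:example.org:study-2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://example.org/oai"))
print(record_identifiers(SAMPLE))
```

A real harvester would fetch the URL, then repeat the request with each resumption token returned until the repository signals that the list is complete.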

A DDI catalogue record contains mandatory and optional metadata elements on:
  • study description - information about the context of the created dataset such as bibliographic citation of the study and data, scope of the study (topics, geography, time), methodology of data collection, sampling and processing, data access information, and information on accompanying materials
  • data file description - information on data format, file type, file structure, missing data, weighting variables and software
  • variable descriptions
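
The three parts of a record listed above can be sketched as a small XML document. The Python example below is a simplified illustration only: the element names are assumed from the DDI Codebook (DDI 2) vocabulary, the function name and sample values are hypothetical, and the output omits namespaces and mandatory elements, so it is not validated against the DDI schema.

```python
import xml.etree.ElementTree as ET


def build_record(title, keywords, file_name, variables):
    """Build a simplified, DDI-like catalogue record (illustrative only)."""
    code_book = ET.Element("codeBook")

    # Study description: citation title and subject keywords.
    stdy = ET.SubElement(code_book, "stdyDscr")
    titl_stmt = ET.SubElement(ET.SubElement(stdy, "citation"), "titlStmt")
    ET.SubElement(titl_stmt, "titl").text = title
    subject = ET.SubElement(ET.SubElement(stdy, "stdyInfo"), "subject")
    for keyword in keywords:
        ET.SubElement(subject, "keyword").text = keyword

    # Data file description: here only the file name.
    file_txt = ET.SubElement(ET.SubElement(code_book, "fileDscr"), "fileTxt")
    ET.SubElement(file_txt, "fileName").text = file_name

    # Variable descriptions: one element per variable, with a label.
    data_dscr = ET.SubElement(code_book, "dataDscr")
    for name, label in variables.items():
        var = ET.SubElement(data_dscr, "var", name=name)
        ET.SubElement(var, "labl").text = label

    return code_book


# Placeholder values, not a real study.
record = build_record(
    title="Example Survey",
    keywords=["health", "households"],
    file_name="example_survey.tab",
    variables={"age": "Age of respondent", "sex": "Sex of respondent"},
)
print(ET.tostring(record, encoding="unicode"))
```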

UKDA DDI catalogue record

A DDI catalogue record is created by the UKDA based on information submitted by data depositors at the time of data deposit. Such information is collected via the data collection deposit form and is also enhanced by information from accompanying documentation. UKDA staff edit this into an informative and rich metadata record.

Data depositors should therefore take care to provide detailed and meaningful information when documenting data and when completing the data collection deposit form, even if it means cutting and pasting from their technical or end of award reports (such documents might not always be available to the secondary user). Care should be taken to create meaningful titles, descriptions and keywords for datasets. The better the information provided, the better the resulting metadata and therefore the more useful the archived data become. Keywords are created for the catalogue record by UKDA staff, using controlled vocabulary from UKDA's thesaurus - Humanities and Social Science Electronic Thesaurus (HASSET).

Depositors are notified when the metadata record is available in the online UKDA Data Catalogue, providing the depositor with an opportunity to check the entry and provide any new or additional information.

Publications - primary and secondary

Data depositors are encouraged to provide information about original and subsequent reports and publications based on the data archived at UKDA so that these can be added as documentation for the dataset. Such publications provide useful context information for data re-users. UKDA also collates publications based on re-use of archived data.

Examples of outputs to submit, or to supply a bibliographic reference for, are:
  • end of award reports
  • technical reports
  • journal articles
  • books and book chapters
  • briefing papers
  • press articles
  • web sites
ESRC award holders should submit all research outputs (publications, reports, etc.) to the research outputs repository at ESRC Society Today for long-term preservation. UKDA will provide a link from the Data Catalogue record to the relevant award entry on ESRC Society Today.

Bibliographic citation

All datasets archived at UKDA are accompanied by a bibliographic citation, which data users are required to include in research outputs so that the data source is referenced and acknowledged accurately. A citation gives credit to the source and distributor and also includes copyright information. A recommended citation for each UKDA study is included in the 'Study Information' file that documents a dataset in the UKDA Data Catalogue; it can be found in a file called 'UKDA_Study_XXXX_Information.htm' in the documentation table for every catalogue record. An example is available in the catalogue record for the Health Survey for England, 2005.
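
To show how the elements of a data citation (source, distributor, date and study number) fit together, the Python sketch below assembles a citation string from structured fields. The field names, the layout of the string and all values are illustrative assumptions; the authoritative wording for any study is the recommended citation in its Study Information file.

```python
from dataclasses import dataclass


@dataclass
class StudyCitation:
    """Fields for a dataset citation (names are hypothetical)."""
    creator: str
    title: str
    distributor: str
    release_date: str
    study_number: int

    def format(self):
        # Assumed layout: creator, title, distributor, date, study number.
        return (
            f"{self.creator}, {self.title} [computer file]. "
            f"{self.distributor} [distributor], {self.release_date}. "
            f"SN: {self.study_number}."
        )


# Placeholder values, not a real catalogue entry.
citation = StudyCitation(
    creator="Example Research Team",
    title="Example Survey, 2005",
    distributor="Example Data Archive",
    release_date="July 2006",
    study_number=1234,
)
print(citation.format())
```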

By treating archived datasets as bibliographic entities, publishing them in a data centre or archive catalogue and requiring them to be properly cited when used, the UKDA plays a major role in extending research and scholarship. The creation of a dataset which is properly documented and usable by other researchers deserves the same recognition and acknowledgement as published research outputs.

Bibliographic citation of data used in research identifies sources for validation and further research. Failure to cite datasets means that valuable data sources will not be indexed by bibliographical services such as social science citation indexes, and, more importantly, that other researchers who would like to analyse such data may not have sufficient information to acquire them.
