Our FAQs are split into different sections reflecting the main areas of our website:
About the UK Data Archive
- What are data archives?
Data archives are resource centres for analysts who use data for research and teaching. Their functions usually include:
- ensuring that data are preserved against technological obsolescence and physical damage
- checking, validating and preparing data and accompanying user documentation
- cataloguing their technical and substantive properties for information and retrieval
- supplying them in an appropriate form to secondary users
- supporting users in using the data
The social science data archiving movement began in the 1960s within a number of key social science departments in the United States who stored original coded interview data deriving from academic surveys. The movement spread across Europe and in 1967 the UK Data Archive (the Archive) was established by the UK Social Science Research Council (now the Economic and Social Research Council (ESRC)). In the late 1970s many national archives joined wider professional organisations to foster co-operation on key archival strategies, procedures and technologies; encourage the exchange of data and technology across national boundaries; and promote the acquisition, archiving and distribution of electronic data for social science teaching and research.
- What services does the UK Data Archive provide?
The UK Data Archive is curator of the largest collection of digital data in the social sciences and humanities in the United Kingdom. With several thousand datasets relating to society, both historical and contemporary, our Archive is a vital resource for researchers, teachers and learners. We acquire high quality data from the academic, public, and commercial sectors, providing continuous access to these data while we also support existing and emerging communities of data users.
We are an acknowledged centre of expertise in the areas of acquiring, curating and providing access to data. Since 2005 our archive has been designated a Place of Deposit by the National Archives allowing us to curate public records.
We direct and manage the UK Data Service, which is the UK's flagship portal for research resources, where we host key national and international survey data, databanks, census data and qualitative data. More recently we have been providing secure access to disclosive and more sensitive data. We have been engaged in a number of data management initiatives supported by ESRC and Jisc, including running the Rural Economy and Land Use Programme (Relu) Data Support Service, which focused on providing access to key environmental data. We also provide curation of data for other organisations at a cost.
We are proud to be actively engaged in research and development, and throughout our 40+ year history, we have remained at the cutting edge to become an international leader in curating and sharing data.
- Where do your data come from?
Data collections are deposited from a variety of sources, including academic researchers, government departments, intergovernmental organisations, independent research institutes, and market research organisations.
Academic research funded by the Economic and Social Research Council (ESRC) is an important source of data, as the council operates a mandatory Datasets Policy whereby all award holders are expected to offer data to the UK Data Archive for archiving. Examples of large-scale ESRC datasets are the British Social Attitudes Survey (BSAS), the British Election Studies (BES) and the British Household Panel Survey (BHPS).
Censuses and large surveys carried out by governments for their own policy purposes are particularly rich sources of data for further exploration. Central government, and in particular the Office for National Statistics (ONS), is a major and regular supplier of data series, including the General Household Survey (GHS), the Labour Force Survey (LFS), and the Health Survey for England (HSE).
Some data collections may not have been collected specifically for research purposes. Administrative databases, such as the National Health Service Patient Re-registrations, although collected for a very different purpose, can provide valuable and timely information for researchers.
- What is the UK Data Service?
The UK Data Service provides a unified point of access to the extensive range of high quality social and economic data, including Census data, government funded surveys, longitudinal studies, international macrodata, qualitative data and business microdata. It is designed to provide seamless access and support to meet the current and future research demands of both academic and non-academic users, and to help them maximise the impact of their work.
The UK Data Service is a new service which replaces the earlier ESRC investments of the Economic and Social Data Service (ESDS), the Secure Data Service (SDS), the Survey Question Bank and elements of the ESRC Census Programme. It also incorporates outputs from the cross-council Relu Data Support Service (Relu-DSS).
- What happened to the Relu Data Support Service?
The Rural Economy and Land Use Programme Data Support Service (Relu-DSS) was a dedicated support service funded by the ESRC, NERC and the BBSRC to provide information and guidance to researchers and project managers from the programme on data management, data sharing and preservation. Outputs from the portal are available.
- What happened to the ESDS, SDS and SQB?
The Economic and Social Data Service (ESDS) is now integrated into the UK Data Service, along with the former Secure Data Service (SDS), Survey Question Bank (SQB), elements of the ESRC Census Programme and outputs from the Relu Data Support Service.
- What services do you provide for guidance on research data management?
The Research Data Management Support Services team at the UK Data Archive is actively involved in developing knowledge, guidance and capacity building in managing and sharing data, primarily for the social sciences. Our up-to-date guidance can be found on the UK Data Service website, together with links to our book published in 2014 by Sage Publications.
We run training events on various aspects of managing and sharing research data.
Our team undertakes a range of R&D projects on research data management support and training for researchers and research centres, through grants from the ESRC, Jisc and the EU.
Create & manage data
- Do you provide guidance on how to manage data?
Our most up-to-date guidance on how to manage and share data can be found on the UK Data Service website. This provides data creators, data managers and data curators with best practice strategies and methods for creating, preparing and storing shareable datasets.
Preparing data for archiving
- How do I prepare my dataset for secondary research?
Data should be prepared in such a way as to enable other researchers to use them, and to enable the data archive to create accurate catalogue records. Researchers are encouraged to document data appropriately (see our guidance on data documentation and metadata), to include research procedures and fieldwork methods and to ensure that data are held in an organised manner. Documentation is invaluable in enabling secondary users to contextualise data and conduct better informed re-use of the material. Any consent and confidentiality concerns which may inhibit archiving data should be resolved. See our guidance on consent and confidentiality.
- What are the costs of preparing the data and documentation for archiving?
There should be no additional costs for archiving data, other than researchers' time to prepare data and documentation for deposit. This time should be costed into the application. As a rule of thumb, it is recommended that two to three weeks are costed into an average two-year research grant application to prepare and collate materials for deposit. However, owing to the disparate nature of research and data creation, we cannot provide advice on exact costs likely to be incurred in data preparation.
Various activities typically associated with preparing data are outlined below, for which you can work out appropriate costs in terms of people's time and equipment/software needed. Much data preparation can be carried out as part of the research process during data entry and transcription, therefore significantly reducing the cost of preparing data for archiving.
For quantitative/numerical data, allow time to add appropriate variable, value and code labelling to data, to create SPSS set up files (if relevant), to supply the syntax or logical statements for derived variables etc. as part of data-level documentation.
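As an illustration of what variable and value labelling involves, the sketch below keeps a small codebook alongside the data and reports any variable or code that has no label yet. It is only a minimal example: the variable names, codes and labels are invented, and the `codebook` layout and the `find_unlabelled` helper are our own assumptions rather than a format required for deposit.

```python
# Hypothetical codebook: one entry per variable, with a variable label
# and, for coded variables, a label for every code used in the data.
codebook = {
    "sex": {"label": "Sex of respondent", "values": {1: "Male", 2: "Female", -9: "Refused"}},
    "age": {"label": "Age in years at interview", "values": {}},  # continuous, no value labels
}


def find_unlabelled(records, codebook):
    """Return (row number, variable, problem) for anything the codebook does not document."""
    problems = []
    for row_number, record in enumerate(records, start=1):
        for variable, code in record.items():
            if variable not in codebook:
                problems.append((row_number, variable, "no variable label"))
            elif codebook[variable]["values"] and code not in codebook[variable]["values"]:
                problems.append((row_number, variable, f"no value label for code {code!r}"))
    return problems


# Invented example records; the second row uses an undocumented code and variable.
records = [
    {"sex": 1, "age": 34},
    {"sex": 3, "age": 51, "region": 2},
]
problems = find_unlabelled(records, codebook)
for problem in problems:
    print(problem)
```

Running a check of this kind during data entry, rather than at the end of the project, is one way of spreading the labelling effort across the research process.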
For qualitative data, provision should be made for the full transcription of interviews, focus group discussions and so on, where the budget will allow. Transcription cost should be included in the overall research budget. If full transcription is really not feasible, interviews or focus groups should be fully summarised. For transcripts in non-English languages, English summaries should be prepared for archiving, so costs for translation might be appropriate. A data listing giving details of each interview should be created.
Consent and data confidentiality also impact on costs for archiving. Consent for data archiving should be arranged during research when consent for participation and data use is obtained, or arranged afterwards. Allow time to anonymise data, where required. Ideally anonymisation should be undertaken during the project but will need to be checked before archiving the data. The time involved should not be underestimated as anonymisation appropriate for archiving may require the use of pseudonyms, or the preparation of an anonymisation key. The anonymised document should be meaningful and usable by other researchers. See anonymising research data guidelines and techniques for detailed guidance.
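To give a feel for the work involved, the sketch below replaces names in a transcript with pseudonyms and builds an anonymisation key at the same time. It is a minimal illustration under our own assumptions: the `pseudonymise` function, the `PersonNN` pseudonym pattern and the example names are hypothetical, and we take an anonymisation key to mean a table of original names and their replacements that is kept separately from the shared data. Simple name replacement will not catch indirect identifiers, so the result still needs to be checked by hand.

```python
import re


def pseudonymise(text, names):
    """Replace each listed name with a bracketed pseudonym; return the new text and the key."""
    # The key maps every original name to its pseudonym.
    key = {name: f"Person{number:02d}" for number, name in enumerate(names, start=1)}
    # Replace longer names first so a full name is handled before any shorter name inside it.
    for name in sorted(key, key=len, reverse=True):
        pattern = rf"\b{re.escape(name)}\b"
        # Square brackets mark the replacement so secondary users can see the text was altered.
        text = re.sub(pattern, f"[{key[name]}]", text)
    return text, key


# Invented example text and names.
transcript = "Ann Smith said that Bob Jones visited her. Bob Jones agreed."
anonymised, anonymisation_key = pseudonymise(transcript, ["Ann Smith", "Bob Jones"])
print(anonymised)
print(anonymisation_key)
```

Using the same pseudonym for every occurrence of a name keeps the text meaningful to other researchers, because they can still follow who is being talked about.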
The cost of digitisation of non-digital sources, e.g. if the research involves work on a paper-based collection, can usually be included in the overall research budget.
Additional suggestions and requirements on preparing data and documentation for archiving can be found in the 'Create and Manage' pages.
- How can other researchers understand and therefore use my data?
Data can only be understood and used to their full potential by other researchers if they are adequately documented (see documenting your data). Any potential re-user should understand exactly how the research was carried out and what the data mean. The data creator should provide sufficient information on the objectives and methodology of the research; explain the data collection methods used; and explicitly describe the meanings of variables and codes used and any derivation, transformations or data cleaning carried out.
- Must I transcribe interviews if I want to archive them?
It is recommended that transcriptions of interviews are made. Full transcriptions significantly extend the potential for analysis and re-use of a research collection, both by the original researchers and by secondary users. Transcription should be seen as a step within the analytical process of research, rather than as a mechanical conversion of data. If interviews are not transcribed, then recorded interviews could be archived alongside summaries. See guidance on transcription.
- How do I transcribe interviews?
Audio-visually recorded interviews are usually transcribed manually (see guidance on transcription). A standard transcription structure is recommended if transcripts are to be archived or if Computer Assisted Qualitative Data Analysis (CAQDAS) software is to be used to analyse the data.
Transcriptions should possess a unique identifier, adopt a uniform layout, make use of speaker tags clearly indicating the question/answer sequence, carry line breaks, be page numbered and carry a document header giving brief details of the interview: date, place, interviewer name, interviewee details, etc.
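A uniform layout of this kind can be produced from a simple template. The sketch below is one possible way of doing so, not a prescribed format: the `format_transcript` function, the header field names, the `I`/`R` speaker tags and all the example values are our own illustrative assumptions. It covers the identifier, header, speaker tags and line breaks; page numbering is left to the word processor.

```python
def format_transcript(interview_id, details, turns):
    """Return a plain-text transcript with a document header and speaker-tagged turns."""
    lines = [f"Interview ID: {interview_id}"]  # unique identifier comes first
    lines += [f"{field}: {value}" for field, value in details.items()]
    lines.append("")  # blank line between header and body
    for speaker, speech in turns:
        lines.append(f"{speaker}: {speech}")  # speaker tag shows the question/answer sequence
        lines.append("")  # line break after every turn
    return "\n".join(lines)


# Invented example interview.
sample = format_transcript(
    "INT001",
    {
        "Date": "2014-03-12",
        "Place": "Colchester",
        "Interviewer": "Interviewer A",
        "Interviewee": "Female, aged 40-49, teacher",
    },
    [("I", "How long have you lived here?"), ("R", "About ten years.")],
)
print(sample)
```

Generating every transcript in a collection from the same template keeps the layout consistent, which helps both archiving and import into analysis software.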
- Who do I contact if I need further advice on preparing data for archiving?
If after reviewing the information on preparing and documenting data for sharing and archiving, any query or question remains unresolved, get in touch with the Producer Relations team through the UK Data Service.
Definitions and legal aspects
- What is, and what is not, personal information in research?
Personal data/information, according to the Data Protection Act, are data that relate to a living individual and from which an individual can be identified (from the data alone or from the data in combination with other accessible information). It also includes any expression of opinion about the individual and any indication of the intentions of the data controller or any other person in respect of the individual. See further details on Data protection and sharing research data.
Personal information may include photographs, email messages and data recorded by closed-circuit television (CCTV), if a person can be identified from this. It also includes data identified by reference numbers, where a separate list can be used to match the reference numbers to named individuals.
This does NOT mean, however, that all information provided by a person during research (e.g. during interviews) is personal information. If a person cannot be identified from the information, then the information is not defined as personal information.
- What is defined as sensitive personal data?
Sensitive personal data are defined in the Data Protection Act 1998 as data on a person's race, ethnic origin, political opinion, religious or similar beliefs, trade union membership, physical or mental health or condition, sexual life, commission or alleged commission of an offence, proceedings for an offence (alleged to have been) committed, disposal of such proceedings or the sentence of any court in such proceedings. All potentially 'sensitive' data should be carefully handled.
- Does the Statistics and Registration Service Act have any relevance to data sharing?
The Statistics and Registration Service Act applies only to data designated as Official Statistics. Data access is an express statutory function of the Statistics Authority and the Act defines the legal gateways under which 'personal information' can be disclosed. The Act permits disclosure of personal information to an Approved Researcher, i.e. an individual to whom the Statistics Authority has granted access, for the purposes of statistical research, to personal information held by it. The criteria for access require the Statistics Authority to consider whether the individual is a fit and proper person, and whether the purpose for which access is requested is valid. The Act also states that disclosure of personal information outside of the legal gateways is a criminal offence.
The Act does not apply to individual researchers who are managing confidential research data that are not designated as Official Statistics. Further information on the Act is available at legislation relevant to data sharing.
- Can a person use the Data Protection Act to request personal information that has been deposited in an archive?
A typical question presented to the Archive is whether a person can request the release of personal information held about him/her in an interview that had been given by a third party, the research participant. The holder of personal information (the archive) does not have to comply with the request if the third party (who provided the information) has not consented for the information to be disclosed. If the information given by the third party was given to the researcher, and then to the archive, with the expectation it would be kept confidential, then the archive is not obliged to release the information. The UK Information Commissioner's Office (ICO) technical guidance note recommends that "in most cases where a clear duty of confidentiality does exist, it will usually be reasonable to withhold third party information unless you have the consent of the third party individual to disclose it." See Dealing with subject access requests involving other people's information.
- Can a person use the Freedom of Information (FoI) Act to request research data that has been deposited in an archive?
If research data were given in confidence to an archive and the release of such data would breach this confidence, then a FoI request to disclose such research data can be refused.
Such research data could be data that under the licence agreement are only available to researchers, or may even have been placed under more restrictive access conditions due to the data being confidential. In addition, if a 'record' (e.g. research data) contains personal data (under the definition of the DPA) or Personal Information (under the definition of the Statistics and Registration Service Act) that would allow a person to be identified from the data, then such data cannot be released under the FoI Act.
In general, a FoI request entitles access to the content of information held, not necessarily to an exact original document (e.g. an interview transcript or dataset).
- Is there a recommended period for which data should be stored by a researcher?
There is no overarching recommendation on length of time data should be kept, but some disciplines have firmer requirements than others. For example, in the UK:
- medical records must be held for anything from 6 years to forever according to the British Medical Association
- clinical trials master data files need to be held for 5 years after the trial has finished
- research data funded by the Engineering and Physical Sciences Research Council must be securely preserved for a minimum of 10 years from the date that any researcher 'privileged access' period expires
However, for most social science data there is no recommended retention period for data, unless you have gained third party data under licence which actively requires you to destroy it after use.
Retention needs to be balanced against data protection requirements, which state that personal records (names, addresses and so on) should not be held for any period longer than necessary. This does not apply to most research data, but does apply to administrative data associated with a research project, unless you have sought explicit permission to keep it for a particular defined purpose.
Consent and ethics
- Can a participant give consent for data archiving when the data contain sensitive information about other individuals?
It is likely that a participant may have no appreciation of the Data Protection Act, duty of confidentiality or privacy laws, in other words, what information should and should not be shared. However, a participant giving consent for the data they have provided to be shared should be able to make his/her own judgement. If, as a researcher, you are uncertain as to whether your data contain any possibly damaging or incriminating information, we would recommend that you remove or alter those sections before data are shared (ensuring any removal is indicated in the data). We can advise on this on a case-by-case basis. Get in touch with the Producer Relations team through the UK Data Service.
- At what stage in the research process should I seek consent for data sharing?
There are a number of options dependent on the type of research being conducted. Consent should ideally be sought whilst conducting the research, e.g. at the time of an interview or survey. However, at times it may be more logical to obtain consent for data sharing at a later stage, when the participants have a better understanding of the data in question. It is also possible to return to participants to seek consent for data uses (such as data sharing) possibly not discussed at the time of fieldwork. See also gaining consent for information on one-off or process consent.
- Do I need consent for data sharing for textual and audio-visual material separately?
Not unless the data will be re-used and shared for different purposes, e.g. if only transcripts will be archived and audio recordings will not. In that case consent for data sharing is needed for transcripts, but not for audio. However, it is recommended that researchers obtain consent that applies equally to all the materials resulting from the research.
If data need to be anonymised, there may be a problem for sharing audio-visual data, as they are not easily anonymised and the damage to data integrity may be higher. In such cases separate consent for data sharing may need to be sought for textual and audio-visual data, whereby anonymised textual data can be shared whilst non-anonymised audio-visual may be preserved differently. See information on consent for using or sharing audio-visual material.
- If research is carried out in a public environment and data are obtained from multiple people (e.g. video footage, sound recordings), how do I obtain consent?
At a minimum, the law requires notification of the recording through clear signage. For example, a researcher video recording interaction and conversation in a shopping centre could not brief everyone in that location face to face. It would be more appropriate to have information sheets/signs about the project and the recording displayed in the location.
- Do I need to gain parental consent when doing research amongst children?
Based on a legal ruling, young people aged 16 years and above can give their own consent. For younger children, a judgement must be made about their ability to understand what is being asked of them. They should have clear and intelligible information about the project, suited to their level of understanding. They should be asked for their individual, voluntary consent, in addition to that of a parent/guardian and/or head teacher. See detailed information on gaining consent in research with children.
- My research takes place in a developing country. How do I explain data archiving and secondary use of data to participants?
Explain in a way that is appropriate to the cultural environment that the information they give you may be used by other like-minded researchers. If you have made certain agreements, e.g. that data will not be seen by government officials or the press, make it clear these promises will still apply to archived data.
- Does the UK Data Archive ensure that ethical guidelines were followed by researchers for the collection of data that are archived by the Archive?
The UK Data Archive assesses data that are offered for archiving for potential confidential, sensitive or personal information. We also seek information on whether or not researchers have obtained consent for data sharing. Where we have concerns, we discuss with depositors options such as further anonymisation of data or renegotiating consent - where this is not possible, data may be rejected from the collection. In this way, we ensure the ethical re-use of our archived data. Note, however, that the Archive is not an ethics review committee - it focuses only on the ethics of data sharing. In addition, we advise researchers on issues of informed consent, anonymisation and research ethics and encourage researchers to adhere to appropriate ethical guidelines when collecting data.
- Are there special considerations for gaining consent for data sharing from children or vulnerable or elderly populations?
Yes. Because of the nature of the sample group, special considerations are needed for gaining consent for data sharing, just as they are for obtaining consent for research participation from these populations. See Special cases of consent.
Consent forms
- Must I always use consent forms when gathering data from people?
No, the use of written consent forms in research is not mandatory. Obtaining informed consent for people's participation in research and for the use of the data gathered for various research purposes is an ethical requirement for most research and is typically required by Research Ethics Committees. Whether such consent is obtained verbally or in writing, using a brief informative statement or a detailed consent form, depends on the nature of the research, the kind of data gathered and how the data will be used.
Although obtaining consent in writing is recommended where possible, as it reduces the uncertainty over what was agreed between researcher and participant, it may be too formal for some research with people. It is the responsibility of the researcher to address this problem.
For non-sensitive data gathered during quantitative surveys (questionnaires) or informal interviews, where no personal data are gathered or where personal identifiers are removed from the data, obtaining written consent may not be required. At minimum an information sheet should be provided to participants detailing the nature and scope of the study, the identity of the researcher(s) and what will happen to the data collected (including any data sharing or archiving if applicable). If, however, a survey extends beyond asking questions to include activities such as a nurse visit, taking samples, making physical measurements, etc. then written consent is usually required.
If data are collected verbally through audio-recordings, verbal consent agreements can be audio-recorded together with the data.
If personal data, sensitive data or confidential data are gathered during the research, the use of written consent forms is recommended to assure compliance with the Data Protection Act and with ethical requirements. See Data protection and sharing research data.
For personal and/or sensitive data to be processed, archived and disseminated by the Archive, explicit consent is needed from each participant. Ideally this should be in writing.
Further information is available on the options of written or oral consent, on the use of consent forms and on consent in surveys under the consent pages.
- Can you help me draft my consent form that will take into consideration the sharing/archiving of research data?
The UK Data Archive Producer Relations team can comment on draft consent forms. Detailed examples for various types of research that take into account data sharing are available, see consent forms. Information specific to data archiving, that explains the procedures, benefits and risks of archiving and sharing data, is available for researchers, see why share data and informing research participants.
- How and where should I store consent forms?
If consent forms contain personal information on the participants that could result in the disclosure of the participant's identity from the data, then they should be stored separately from the research data (according to the Data Protection Act). The length of time that consent forms should be stored will be informed by institutional or research ethics committee requirements. A blank sample consent form can be archived at the UK Data Archive alongside research data as part of the documentation providing background information to the data.
- Do I submit all participants' consent forms to UK Data Archive when I deposit data for archiving?
No, since consent forms contain personal information, they are not archived at the Archive alongside research data. It is the researcher's or the research institution's responsibility to store consent forms safely and to decide how long they should be kept. A blank consent form and information sheet can be archived at the Archive alongside research data as documentation providing background information on data.
There are two reasons for asking research participants for copyright to be transferred to the researcher. First, it gives the researcher the right to publish extracts based on the words of a participant (e.g. interview or diary extract) without needing to return to the participant each time to obtain permission for publishing. Second, it allows the researcher to authorise a third party such as the Archive to make copies of digital materials for the purpose of digital preservation.
Disclosure
- What happens if a data re-user discloses confidential or identifying information from data archived at the Archive?
Restrictions on the use of archived data obtained by users from the UK Data Archive are outlined in the End User Licence (EUL). All users agree to this when registering prior to downloading data. In particular there is a fundamental restriction concerning the confidentiality of data. Users should not attempt to use the data to deliberately compromise the confidentiality of individuals, households or organisations and are required to abide by the current Data Protection Act. The EUL also covers requirements for citation of publications and safeguarding of data. The University of Essex may refer any breach of the EUL for legal action under relevant legislation.
In addition, data re-users have the same ethical and legal obligations as primary data users and researchers in general to not disclose confidential or identifiable information from research data.
- If a researcher obtains information on illegal or criminal activities (e.g. child abuse), is there a legal obligation or moral duty to disclose this to the relevant authorities?
Exceptions to the duty of confidentiality occur where there is a legal compulsion, for example, the information may be subpoenaed by relevant police investigations or court proceedings, or where there is a disclosure of the information made 'in the public interest', as defined by the courts. There are no mandatory reporting laws in the UK but guidance issued by professional bodies and local safeguarding children boards emphasises the need to make a referral where there is a reasonable belief that a child is at risk of significant harm. There are thus ethical obligations on researchers working with children to make provision for the required actions to be taken in cases of disclosure of e.g. child abuse. Under the Children Act 1989 (England and Wales), the Children (Scotland) Act 1995 and the Children (Northern Ireland) Order 1995, the local authority has a duty to make enquiries about any allegation of abuse (i.e. that a child is suffering, or is likely to suffer, significant harm). Additionally, some researchers are members of professional groups such as teachers and social workers who have a legal duty to report suspected child abuse.
- In an increasingly surveyed society what access do government departments have to data held at the UK Data Archive and to personal data of participants, e.g. through researchers working for government departments?
Researchers working in or for government departments access data held at the UK Data Archive under the same restrictions as HE researchers. All researchers agree to an End User Licence (EUL) before data can be accessed. This EUL poses restrictions on how data can be used. All researchers wishing to access data held under a special licence must apply for access through the Approved Researcher route.
Data sharing and confidentiality
- My Research Ethics Committee advises me not to share research data or requires me to destroy them. What should I do?
It is important to distinguish between personal data collected in research, and research data in general. In the case of personal data, those should not be disclosed (unless consent has been given that they can be disclosed). A Research Ethics Committee may indeed request that personal data collected during research, i.e. data that can identify participants, are destroyed after a certain time period to avoid possible disclosure (for example if data would be left unattended on an old PC). Identifiable information could also be excluded from data sharing. A Research Ethics Committee should not, however, ask you to destroy research data in general.
If research data contain sensitive or confidential information, then the sharing of such data should be considered carefully, but should not be dismissed as being impossible. If researchers need advice on how to address the sharing of research data as part of their ethical review, or if there exist conflicts between the need to archive data and a Research Ethics Committee's guidelines on data management, they can consult the information on working with RECS or they can contact the Producer Relations team.
- Would I be breaching confidentiality towards my participants if I archived my data?
A researcher does have a duty of confidentiality towards informants with regard to information obtained from them. An exception to this duty of confidentiality is when the informants have consented to the information being used in specific ways and for agreed purposes. For the purposes of sharing or archiving data, researchers should make clear to informants that information will be shared with other academic researchers under strict terms and conditions, and should indicate how data may be anonymised where necessary.
It is important to demonstrate the agreement on confidentiality and data sharing between researchers and participants by obtaining consent from informants for the use of the information obtained for the purposes of research, publication, and data sharing. Ideally consent is obtained in writing.
Confidentiality of data therefore does not prohibit the archiving of data, as long as informed consent is obtained from informants to archive and share data, or where data are anonymised. Detailed information on this topic can be found on the pages on consent, confidentiality and ethics in data sharing.
All users of data archived at the UK Data Archive are registered users - data are therefore not in the public domain - and users sign an End User Licence that legally binds them to maintaining appropriate confidentiality of data.
- I believe my data are confidential and I cannot submit them for sharing. What should I do?
Not all research collects data that are confidential or even sensitive. Even if yours does, this does not automatically prohibit data sharing. It is common practice for researchers to obtain informed consent to use data for their research and publication purposes. In the same way, consent can be obtained for archiving purposes to allow secondary use of the data for research. Where necessary data can be anonymised, or access and usage can be restricted, to safeguard sensitive information. Detailed information on this topic can be found in the section on consent, confidentiality and ethics in data sharing.
Any issues anticipated regarding confidentiality or sensitivity of data should be addressed at the start of research so strategies to overcome such problems and to enable archiving of the data can be developed in time.
- Do you have information that I can give to participants on how a data archive protects the confidentiality of interview data?
Data archives value the data deposited with them and take very seriously their duty to ensure the materials are used only in ethical and appropriate ways. Detailed information for participants, explaining what researchers and data archives jointly do to protect the confidentiality of interview data, whilst enabling data archiving and sharing for research purposes, is available at informing research participants.
- I did not ask for consent from my informants to use data beyond my own research. Can they be archived?
In the case of quantitative data that are adequately anonymised, there is strictly speaking no need to obtain separate consent for archiving in order to enable their use by the wider research community (although it is ethically recommended to do so).
If substantial descriptive (qualitative) information obtained from informants is to be archived and informants were not asked for their consent to archive this information, they can still be re-contacted to obtain their consent. However, it is possible to share qualitative material that possesses no disclosure risk. If consent forms were presented and informants chose not to give consent for the archiving and re-use of qualitative data, then the data cannot be archived.
- Are there alternatives to standard access to research data held at the UK Data Archive, in the case of confidential data?
For especially confidential research data, additional access restrictions may be imposed beyond the standard licensed access. Data access authorisation may be required from the data owner prior to release of the data; or confidential data may be placed under embargo for a given period of time. This is decided on a case-by-case basis in dialogue between ourselves and the data owner.
- Are there research data which cannot be shared?
Personal data or sensitive data may not be suitable for sharing with other researchers, depending on the informed consent that has been obtained from participants. Also data for which partial copyright lies with parties other than the researcher cannot be shared unless permission for data sharing has been given by all copyright holders. The Archive asks for specific information on such circumstances when data are being offered for archiving so data can be assessed by us to ensure that they can be shared with other researchers in an ethical and legal way.
Copyright
- Do archived data become the property of the data centre or archive?
No, archived data remain the property of the original data creator(s). The data centre or archive preserves, stores and disseminates the data for you, but does not own the data or hold any rights in the collection, unless added-value work such as transcription has been undertaken as part of processing in-house.
- Who holds copyright of data?
Copyright, an Intellectual Property Right reflecting the output of human intellect, applies to creative and artistic original work including written work, spoken word, photographs, databases, research data, etc. Copyright is automatically assigned and does not need to be applied for.
Usually copyright is retained by the author of the original work; this could be an individual, organisation or institution. If a piece of work is completed as part of employment, the employer will retain copyright of the work. Anyone who is commissioned to create a piece of work on behalf of someone else will retain copyright of that work. See detailed information on copyright.
When data have been created from a variety of sources or if the research has been funded by a number of organisations, there is shared copyright for all involved parties. In these cases permission to archive data must be sought from all interested parties and a covering letter confirming agreement should accompany the materials when deposited.
- Who holds copyright of in-depth interviews?
The speaker holds the copyright in the spoken word. Transcription of the words on paper or computer is protected by copyright and is owned by the person making the transcription. If the transcription is a substantial reproduction of the words spoken, the speaker will own copyright in the words and a separate copyright will apply to the transcription. This is of particular relevance to the recording of in-depth interviews. This also applies to a recording on tape or video. The person making the recording will own the copyright in the recording and the interviewee will own the copyright in the words.
Copyright can only be transferred by a written document signed by the person making the transfer. This document is called an assignment. If researchers wish to publish large extracts from an interview, it is advisable to obtain a transfer of copyright from interviewees.
- If I use existing data and combine them with data I have collected myself, do I then have copyright of the new material?
Yes, but not sole copyright to the new material. The creator (author) of the existing data used for the research will still retain copyright in that material. For the purpose of data archiving, permission is needed from the person/organisation holding copyright of existing data to archive the new data.
- How are the rights of the original copyright holder protected when data archived at the UK Data Archive are re-used by other researchers?
Access conditions for the dissemination of all materials deposited at the UK Data Archive are agreed between the Archive and the data depositor at the time of deposit, using the licence agreement. This agreement is a contractually binding legal document. Similarly, when materials are requested from the Archive all users agree to an End User Licence whereby the user undertakes to abide by all conditions stipulated in the agreement. This must be completed before any materials are supplied to any user. It is the combination of these contractual agreements that ensures copyright infringement does not occur.
- If I pay for data to be created, but not by an employee, can I retain copyright?
Yes. You will need to formulate an agreement with the person commissioned to create the data, stating that copyright is to be assigned to you.
- Who holds the rights in a database structure and content?
Database rights were introduced in 1996 (Directive 96/9/EC on the legal protection of databases and Copyright and Rights in Databases Regulations 1997) exclusively to protect databases. If the directive applies, the owner can prevent unauthorised extraction or reutilisation of all or a substantial part of the contents. The rights last for 15 years and can be renewed. This may pose problems for those wishing to reuse data for research purposes but the 'fair dealing' exception (which is restricted to extraction) by a 'lawful' user allows illustration for non-commercial teaching or research. The term 'substantial' is also not clearly defined. If in doubt about sharing your database, contact the Producer Relations team in the first instance.
- Some journals ask authors to make available the data used for a publication. How do I comply with this and copyright laws?
Some journals require authors to submit data alongside a publication so that the published results can be replicated by others. Note that data obtained from the UK Data Archive, including subsets and derived data, cannot be submitted to journals alongside publications, as this would be a breach of the End User Licence (EUL) that users agree to when they register. However, in most cases it is sufficient for the author of the publication to supply information to the journal about the data and their location, using a proper citation.
In addition, for derived data there are a number of options available including:
- the author can supply the syntax used to the journal
- the author can offer the data to the Archive - it is a requirement of the EUL that any derived data be offered for deposit - see Deposit data
- the author can request that anyone wishing to replicate the results should apply to the author; before passing on the data, it is essential for the author to first check with the UKDA/ESDS that the applicant is a registered user and is entitled to access the data
Formats
- The data resulting from my research are in proprietary software (e.g. transcripts in N-Vivo). How do I archive them?
The Archive has a preservation strategy which ideally archives data in a non-proprietary open format, so they are software and hardware independent. This enables wider use, easier access and guarantees long-term preservation of archived data. If your data cannot be converted to a standard non-proprietary format, we cannot guarantee their long-term accessibility. An example might be if you have N-Vivo or Atlas-ti files, which are held within a proprietary fixed format which is not totally exportable. These project-based files would be acceptable as long as you also kept the raw data files in MS-Word, RTF or plain text formats. See data formats and software for examples of acceptable data formats.
- The data resulting from my research are audio-visual recordings of interviews and a collection of photographs and artefacts. In what format can these be archived?
We judge each data offer on a case-by-case basis. Whilst it is preferable for research purposes that recorded interviews/discussions are transcribed (as it makes re-use of such data much easier), at times audio-visual materials are archived too. Paper-based artefacts, such as photos, postcards, family trees, could be digitised. We can discuss formats with you if you are unclear. Contact the Producer Relations team.
Deposit data
- What do you mean by data?
In the context of data archives, data means digital data.
Data can consist of different types of qualitative or quantitative materials, for example: numeric data files; survey databases; interview transcripts; diaries; field notes; digitised materials; audio recordings; photographs; and modelling scripts.
A data collection can result from primary data collection or can be derived from existing sources of data.
Quantitative (numeric) data can be either microdata or macrodata. Microdata consist of the coded responses to survey questions (for example a 'yes' response could be coded as '1' and a 'no' response as '0') where each row of data corresponds to an individual, household, family, or organisation and each column corresponds to a survey question. Microdata are usually made available in SPSS, Stata, Excel and tab-delimited formats. Macrodata consist of aggregate figures (for example country-level economic indicators) and can usually be viewed and analysed using MS Excel. Data can also be a database containing survey data, numeric data files, input data and scripts used to model scenarios.
Qualitative data include in-depth interviews, diaries, anthropological field notes, images, audio recordings and the complete answers to survey questions. Qualitative text material is typically available as word-processed documents or databases.
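As a rough illustration of the difference between microdata and macrodata, the following Python sketch builds a tiny set of coded survey responses and aggregates them into region-level figures. The variable names (respondent_id, region, owns_home), the helper function and the values are invented for this example and do not come from any dataset held by the Archive.

```python
from collections import defaultdict

# Hypothetical microdata: one row per respondent, one column per survey question.
# Responses are coded as described above: 1 = 'yes', 0 = 'no'.
microdata = [
    {"respondent_id": 1, "region": "South East", "owns_home": 1},
    {"respondent_id": 2, "region": "South East", "owns_home": 0},
    {"respondent_id": 3, "region": "North West", "owns_home": 1},
    {"respondent_id": 4, "region": "North West", "owns_home": 1},
]


def aggregate_share(rows, group_key, coded_key):
    """Return the share of 'yes' (coded 1) responses within each group."""
    totals = defaultdict(int)
    yes_counts = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += 1
        yes_counts[row[group_key]] += row[coded_key]
    return {group: yes_counts[group] / totals[group] for group in totals}


# Macrodata: aggregate figures derived from the individual-level rows.
macrodata = aggregate_share(microdata, "region", "owns_home")
print(macrodata)  # {'South East': 0.5, 'North West': 1.0}
```

In practice this kind of aggregation would normally be done in a statistical package such as SPSS or Stata; the sketch only shows how individual-level rows relate to aggregate figures.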
To analyse the data, users usually need to have access to the appropriate software, although some of our services have online data browsing tools.
- Where do your data come from?
Datasets are deposited from a variety of sources, including academic researchers, government departments, intergovernmental organisations, independent research institutes, and market research organisations.
Academic research funded by the Economic and Social Research Council (ESRC) is an important source of data, as the Council operates a mandatory Datasets Policy whereby all award holders are expected to offer data to the UK Data Archive for archiving.
Examples of large-scale ESRC datasets are the British Social Attitudes Survey (BSAS), the British Election Studies (BES) and the British Household Panel Study (BHPS). There is also a large selection of qualitative research data collections and numerous smaller data collections resulting from academic research projects.
Censuses and large surveys carried out by governments for their own policy purposes are particularly rich sources of data for further exploration. Central government, and in particular, the Office for National Statistics (ONS) is a major and regular supplier of data series, including the General Household Survey (GHS), the Labour Force Survey (LFS), and the Health Survey for England (HSE). Some datasets may not have been collected specifically for research purposes. Administrative databases, such as the National Health Service Patient Re-registrations, although collected for a very different purpose, can provide valuable and timely information for researchers.
We are keen to acquire more data from government departments and commercial organisations. If you have data to offer, please contact us for advice.
- Why should I share my research data?
The value of any data lies in their use and reuse. In addition, publicly funded research data are produced in the public interest and therefore may need to be shared more widely. When data are managed well they can be shared and reused for scientific and educational purposes. Researchers, funding agencies and the public benefit from data sharing. Data sharing:
- adds value to public investment
- facilitates diverse analyses often beyond the scope of the initial research
- encourages scientific inquiry
- avoids replicating data collection
- supports research into data collection methods
- provides resources for education and training
- facilitates the creation of new data through combination of existing datasets
- What are the benefits of depositing data with an archive or data centre?
Data archiving has great benefits for data owners, data users and researchers. Depositing data ensures their safe-keeping in the long term, with control maintained by the data centre on behalf of the data owner. This can include informing the owner of applications for use and maintaining registers of users and usage. The ability to demonstrate continued usage of the data after the original research is completed can influence funders to provide further research money.
The use of archived data by other researchers may lead to collaborations with the data owner and to co-authorship of publications based on reuse of the data.
Depositing data allows data owners to avoid the administrative tasks associated with external users and their queries. At the same time the data holders can foster a fruitful dialogue between original and secondary researchers by running user groups and data-use workshops while shielding the original researchers from the more tedious aspects of dissemination.
It is also an essential part of the scholarly research process to be able to identify information sources. Bibliographic control of books, papers, journals and other printed sources is taken for granted. They are identifiable in library and publishers' catalogues and, when used as source material in scholarly publications, are fully referenced. The depositing of data enables datasets to be as fully identifiable and easy to find as printed materials by ensuring that:
- datasets are fully documented with all bibliographical details (title, date, author)
- datasets are fully catalogued in a data catalogue
- users of the data are aware of the need to acknowledge the data sources in publications, through proper citation
- Scientific excellence is measured through publications. Why would I archive research data rather than publish results?
The archiving of research data by no means replaces scientific publications. Archiving data resulting from a research project provides an additional output besides many other research outputs. Archived data can complement publications and provide the baseline data used for publications. Some datasets can be significant in their own right and may qualify as part of a researcher's research portfolio.
- Will my archived data be in the public domain?
No. All users of data archived at the UK Data Archive are registered users who sign an End User Licence that legally binds them to maintaining appropriate confidentiality of data, so the data are not in the public domain.
- Who are likely to be the potential users of an archived dataset?
Most potential users will be within the higher education and further education research communities. Archived datasets are also frequently used for teaching purposes to study research methodologies and how researchers approach studying certain topics.
- Will data be peer-reviewed as a quality assurance?
Data offered for archiving are not subjected to peer review. The data creator has the responsibility to ensure high quality of data, both at the stage of data collection and at the stage of data entry or transcription. The Archive will carry out certain quality reviews of deposited data during processing to ensure that variables and values are accurate according to the documentation supplied and are well labelled; to check for missing or erroneous values; to check that confidentiality is not breached; etc.
- Do you have information on the purpose of data archiving, so I can explain it to research participants?
Easy-to-understand information that can be given to participants about what archiving their data means can be found on the Informing research participants page.
- Are there research data which cannot be shared?
Personal data or sensitive data may not be suitable for sharing with other researchers, depending on the informed consent that has been obtained from participants.
Data which include multiple copyright layers or rights owners cannot be shared unless permission for data sharing has been given by all copyright/rights holders. Depositors must ensure all rights have been cleared and provide information to confirm that any legal and ethical issues have been resolved. This enables the Archive to legitimately preserve and make the data available for analysis.
If there is any doubt about rights ownership, please consult your institutional legal team and refer to the Copyright section on the Manage and share web pages.
- Should researchers or institutions keep a copy of their data after they have been archived at the Archive?
Once data are archived at the UK Data Archive, the data owner can access those data at all times. The Archive will safeguard and preserve the data in the long term, so they remain accessible. Researchers may keep a copy of their data at their own discretion.
- I have an ESRC grant and have been asked to complete questions about archiving my data. Can you help?
The ESRC is keen to ensure that grant applications do not propose collecting data in areas of research where suitable data may already exist. ESRC also expects award holders to share their data at the end of their award, so they can be made available to other researchers. For help with answering questions on data collection and data sharing on the ESRC application form and the Data Management Plan, please refer to the Data management planning for ESRC researchers page on the UK Data Service site.
- My research was carried out abroad and my data are not in English. Can they be archived at the Archive?
The Archive does not exclusively archive English language data. However we cannot guarantee to verify and quality control these data to the same standards as English language data. The accompanying documentation will make this clear to data users. Ideally the data would be accompanied by an English summary and English documentation and metadata explaining the material in detail.
- Data I collected are summary notes of focus group discussions. Are these worth archiving?
Summary data are less likely to be accepted by the Archive because of their more limited reuse potential. If you are funded by the ESRC then all data must be offered to the UK Data Service for deposit, regardless of what format they are in.
- The data resulting from my research are audio-visual recordings of interviews and focus groups and a collection of artefacts. Can these be archived?
The Archive judges each data offer on a case-by-case basis. Whilst it is preferable for research purposes that interviews and focus group discussions are transcribed (as it makes reuse of such data much easier), at times audio-visual materials are archived too. Paper-based artefacts, such as photos, postcards, family trees, could possibly be digitised. The Archive will discuss this with you when you offer the data and you should not assume a priori that there will be data that we cannot accept.
- How do I prepare my data for archiving and reuse?
Data should be prepared in such a way that they can be easily understood and used by other researchers and are well organised.
Data should be clearly labelled and documented, which means that research procedures, fieldwork methods and the context of the research are explained and that all variables, codes and fields used are self-explanatory. Data documentation can be produced as information embedded within a dataset itself. Important contextual and methods documentation may be found in a final report of a research project, in publications, working papers and lab books. See documenting your data for detailed guidance.
When data result from research with people as participants, attention may need to be given to the possible confidentiality of such data. Informed consent needs to be obtained for the data to be shared with the wider research community and data may need to be anonymised.
- Are there guidelines on creating, managing and preparing my data for sharing?
Further information is available from the UK Data Service web pages on creating and managing data. Also see our detailed create and manage FAQ covering formats, storage, consent, access conditions and copyright.
- Can you tell me more about your self-deposit repository, ReShare?
ReShare is a self-deposit system hosted by the UK Data Service. It replaced ESRC Data Store in April 2014. Its focus is the storage and sharing of primary research data from the social and behavioural sciences, and it currently takes in data routinely from ESRC award holders.
All forms of digital data can be deposited in and accessed via ReShare, including statistical data, databases, Word documents and audio-visual materials. Contributors are required to register in order to contribute and upload materials and can assign permissions to individuals and/or groups to enable access to their materials. At the moment, only researchers holding an ESRC grant can contribute research data to ReShare.
ReShare uses an open-source repository system based on EPrints, using the ReCollect app for data collections. We check all uploaded data to ensure they are virus free, readable and free from rights and disclosure problems.
How we curate data
There are currently no FAQs in this section.
Find data
- How do I find a particular data collection?
Data collections can be found through Discover, the UK Data Service's search and browse interface.
Discover allows you to find and retrieve information about: the UK Data Service's data collections; case studies which show how some of these data collections have been used; support guides describing how some of the data collections may be used; and publications and outputs associated with them.
Each data collection record provides a link to a login prompt, from where you may gain access to the data themselves or to more information about them. Each case study or support guide record provides a link to the full text/video of that record.
Other data catalogues that you can search include: ReShare (our self-archive), the Relu data portal and the European CESSDA catalogue.
- Can I get hold of any international data?
Yes. Through the UK Data Service we can help you locate and acquire data from other data archives around the world through reciprocal agreements with a network of social science data archives. These include the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan in the USA, and members of the Council of European Social Science Data Archives (CESSDA).
Data for several key international series can be found and requested via the UK Data Service Discover Catalogue. However, some of these data collections are restricted to users at UK institutions of higher or further education. You can also search for data at individual archives via the clickable maps at Other data archives. Additionally you can search a number of European social science data archives via the CESSDA catalogue.
- Who can obtain data?
Researchers, students and teachers from any field, organisation or country may register with the UK Data Service and obtain data.
However, some data collections have restrictions on access. For example:
- census data (from 1971 onwards) are only made available to users from UK higher or further education institutions
- commercial usage may be restricted
- permission may be required from the depositor
- publications may need to be vetted by the sponsoring organisation
Details are available in the individual records of the Discover catalogue.
- In what formats are data available?
Most survey datasets are available to download in SPSS, Stata and tab-delimited (suitable for use in MS Excel) formats and can also be requested in other formats such as SAS. Other UK Data Service systems, including the Nesstar Catalogue, provide these plus additional data formats, such as Statistica and Dbase. International multi-nation aggregate databanks are made available online via the UKDS-stat software, and tables can be downloaded in MS Excel and comma-separated formats. Qualitative data formats include MS Excel, MS Word and RTF.
- How do I gain access to the data?
Access to data requires registration and uses federated access management (Shibboleth) user authentication. You will need to have a username and password to register, and further details are available from the UK Data Service - How to access data.
- How much does it cost to access data?
Data required for non-commercial purposes can be downloaded at no cost. If data are requested on portable media e.g. CD, handling and postage and packing fees will apply. See Charges for further details.
- Can I check the contents of a data collection before I obtain it?
The Discover catalogue contains a full study description for each data collection and also provides access to online documentation and variable lists. Online documentation includes user guides that contain information on how to use the data, how the data were collected, and usually the original questionnaires or topic guides. Online variable lists provide the variable names and variable and value labels.
The Nesstar Catalogue provides details for all the variables within datasets available from the Nesstar system, including the full question text, frequency counts and other summary statistics.
Access to both catalogues does not require registration. However, registration is required to conduct online data analysis or to download data.
- Can I obtain guides to using the data and questionnaires?
User guides accompanying each data collection contain information on how to use the data, how the data were collected and usually the original questionnaires. These are freely available via the Discover Catalogue and, where available, are supplied with orders/downloads. The UK Data Service Discover Variables provides access to questionnaires from a range of major UK and cross-national social surveys. A search of the questionnaires displays questions in their original context helping with questionnaire design and methods research and teaching.
- Are there any restrictions on the use of the data?
Restrictions on the use of the data are outlined in the End User Licence (EUL) that all users agree to when registering. In particular there is a fundamental restriction concerning the confidentiality of data. Users should not attempt to use the data to deliberately compromise the confidentiality of individuals, households or organisations and are required to abide by the current Data Protection Act. The EUL also covers requirements for citation of publications and safeguarding of data. Some data have additional special conditions attached to them that users must abide by. Other more sensitive data require a Special Licence to be acquired.
The sharing of data you obtain from us with other researchers or students and the reuse of data for a new purpose is restricted by the terms and conditions outlined in the EUL.
Certain data collections or types of use may also require depositor permission; details are available in the 'Access' section of each catalogue record in the UK Data Service Discover Catalogue.
- What is the most detailed geographical level I can analyse the data at?
Most survey datasets contain one or more geographical variables, e.g. place of residence, place of work. In many data collections the most detailed geographical variable available is a Government Office Region (GOR) variable which allows researchers to identify broad regions, for example 'South East', 'South West', 'North East', 'North West'. See Government Office Regions on the National Statistics website for more information.
Most survey participants are informed that their responses will only be passed on to researchers under certain conditions and that the data will be fully anonymised. Including more detailed geographical variables in a data collection, although still anonymised, can increase the risk of data disclosure.
However, it is recognised that some researchers need access to more detailed data, and we do have some data collections that are detailed, yet anonymised. Since these data pose a higher risk of disclosure, they have additional special conditions attached to them that take the form of a Special Licence (SL).
To find out which geographical variables are available in a particular data collection, users should consult the relevant catalogue record in the Data Catalogue and scroll down to the 'Spatial units' field within the 'Coverage' section. Variable and value labels, or further information on standard codings and where to find them, are usually available in the associated documentation (freely downloadable via the catalogue record). A 'Variables' search, for a key range of data, is available from the Data Catalogue or via the Nesstar Catalogue. Users may also find the Beginner's Guide to UK Geography on the Office for National Statistics website helpful.
- Can I obtain publications arising from data collections or re-use of data?
We are unable to supply copies of publications, other than User Guides accompanying the data. However, references to publications and journal articles produced by the data creators, as well as those produced by secondary analysts, are available in the Publications section of the Discover record for each data collection. There are also a number of searchable databases of publications which cite particular data collections - see Publications citing Government Surveys and International data and Longitudinal data.
- Can I obtain statistics?
The survey datasets we supply are usually computer-readable data files that require specialist software, such as SPSS or Stata, to analyse. A number of survey datasets are available to most registered users to analyse and subset online via the Nesstar Catalogue, where basic frequency counts are freely available to all users. Links to sources of ready-made statistics can be found on the UK Data Service Ready-made UK statistics page.
- Do you have any historical data?
Yes. The UK Data Service has over 650 data collections covering a wide range of historical topics from the fifth century to the mid-twentieth century. The primary focus of the collection is on the United Kingdom, although it also includes a significant body of cross-national and international data collections. The majority of the data are statistical, relating to nineteenth- and twentieth-century economic and social history, particularly census data.
Alongside the data collection there are also a number of special collections, several of which are freely available online. The most important of these is histpop - the Online Historical Population Reports Website, which provides online access to the complete British population reports for Britain and Ireland from 1801 to 1937.
- Can I use the data to identify individuals, households or organisations or for tracing family histories?
Data are anonymised unless respondents have given their permission or the data are in the public domain.
When registering, users agree to preserve at all times the confidentiality of information pertaining to individuals and/or households in the data collections (where the information is not in the public domain). They also agree not to use the data to attempt to obtain or derive information relating specifically to an identifiable individual or household, nor to claim to have obtained or derived such information. In addition, they agree to preserve the confidentiality of information about, or supplied by, organisations recorded in the data collections.
Some historical data collections in the Discover Catalogue that are in the public domain may be of interest to family historians/genealogists. However, the UK Data Service is funded to preserve and disseminate electronic data created by or for historians, and genealogical research is not part of its remit.
- Can I gain access to UK census data?
You can gain access to 1971-2011 census data from the UK Data Service Census Support using your UK Data Service registration. Some data collections require extra licences to be signed. Only those studying or working within UK higher and further education, and some associated institutions and Research Council staff, are eligible to use the data.
- I have heard about Nesstar, could you tell me more about it?
Nesstar is our online data browsing tool which allows users to create simple cross-tabulations of variables, export these to Excel, view the metadata, and download subsets of data if required. Further information can be found on the Nesstar page.
- What is the CESSDA catalogue?
The CESSDA catalogue allows you to search seamlessly across a number of European social science data archives to locate data collections and variables. Data can also be browsed by topic, keyword or data publisher.
- Why do I need to cite data when I’ve used it?
Citation identifies sources for validation and further research by different researchers.
By signing our End User Licence, all users agree to cite data they have used. By establishing data collections as bibliographic entities and 'publishing' them as such, and by offering advice on citation, the UK Data Archive plays a major role in extending research and scholarship. The creation of a data collection which is properly documented and usable by other researchers deserves equivalent recognition and acknowledgement to a printed work of scholarship.
Failure to cite data means that valuable data sources will not be indexed by bibliographical services such as social science citation indexes, and, more importantly, other researchers who would like to analyse these data may not have sufficient information to acquire them.
- How do I acknowledge and cite data?
Details of the citation and acknowledgement that should be used are set out in the 'Study information and citation' file, available for every data collection from the online documentation table via the relevant catalogue record within the Discover Catalogue.
A citation should include enough information so that the exact version of the data being cited can be located, but does not include information on the sponsor or copyright. A Digital Object Identifier (DOI) is included in the citation. This ensures that even if the location of the data changes, the DOI will always link to the data that was used. Each data collection used must have a separate citation.
See our page on citing data.
- Some journals ask authors to make available the data used for a publication. How do I comply with this?
Some journals require authors to submit data alongside a publication so that the published results can be replicated by others.
Data obtained from our collection, including subsets and derived data, cannot be submitted to journals alongside publications as this would be a breach of the End User Licence (EUL) that users agree to when they register. However, in most cases it is sufficient for the author of the publication to supply information to the journal about how an individual can register and access the data.
In addition, for derived data there are a number of options available including:
- the author can supply the syntax used to the journal
- the author can offer their data to the UK Data Archive - it is a requirement of the EUL that any derived data be offered for deposit
- the author can request that anyone wishing to replicate the results should apply to the author; before passing on the data, it is essential for the author to first check with us that any applicant is a registered user and is also entitled to access the data
News & Events ↑ TOP
There are currently no FAQs in this section.