The information below relates to data available to download using the Download/order links in the data catalogue or
via Usage details in your account. (See Online guides for help with downloading data in other ways, e.g. from the ESDS International
multi-nation aggregate databanks or the ESDS Nesstar Catalogue.)
Downloading your chosen format
What is a ZIP file?
Saving the ZIP file
Opening the ZIP file
Contents of the unzipped file
List of commonly occurring file extensions
Format availability and advice on format - quantitative data
Format availability and advice on format - qualitative data
Downloading your chosen format
Under Usage details, click on the title of the relevant usage, then click on the relevant Download button. Then click on the
appropriate format button.
Please note that if you get the message "Access Denied - Referral Block", this may be due to a particular type of
firewall at your end. You should try to download the data from another computer that does not have the
same firewall or temporarily deactivate the firewall (you are advised to consult with any local computing support
before doing this).
If Internet Explorer blocks a download with a 'no entry' sign and displays a notification in the Information Bar, you can simply click the
Information Bar and select the option to allow the download. Alternatively, hold down the Control button on your keyboard
when you try to download.
What is a ZIP file?
A ZIP file allows several files to be downloaded as one file. The files are compressed so that the ZIP file is smaller than the size of the
uncompressed files, resulting in a faster download.
Saving the ZIP file
In the 'File Download' dialog box choose to save this file. Do
NOT choose to open the file as this will only create a temporary copy on your hard drive.
In the 'Save As' dialog box, next to 'Save in', choose the directory/folder where you want to save your data, make a note of the file name
and click on 'Save'.
Please note that if a dialog box does not appear, you may need to change your security settings to Medium in your
browser. In Internet Explorer, you can do this using Tools, Internet Options, Security. If this does not work, please contact
your local computer support.
If you are using Internet Explorer on a Macintosh Operating System (OS) and a dialog box does not appear, you should try using
one of the following web browsers instead -
OS9 - Mozilla (version 5.1.7 or higher) or OSX - Safari (OS supplied browser).
Opening the ZIP file
Locate the directory/folder where you saved the ZIP file and double click on the file to uncompress/open it,
using decompression software such as WinZip, PKUNZIP or StuffIt Expander.
Get StuffIt Expander
Extract the files to a directory/folder and opt to keep the folder names/directory structure.
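For users who prefer to script this step, the extraction described above can be sketched in Python using only the standard library. This is a minimal illustration, not an ESDS-supplied tool: the function name extract_bundle and the paths passed to it are assumptions chosen for the example.

```python
import zipfile
from pathlib import Path


def extract_bundle(zip_path, target_dir):
    """Extract every file in a ZIP download, keeping its folder structure.

    extract_bundle is a hypothetical helper name used for illustration.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as bundle:
        # extractall() recreates the directories stored in the archive,
        # which corresponds to opting to keep the folder names.
        bundle.extractall(target)
        return sorted(bundle.namelist())
```

Calling extract_bundle with the saved ZIP file and an empty destination folder returns the list of extracted member names, which can be compared with the file information list supplied in the download.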
Contents of the unzipped file
The top level folder is usually named UKDA[study number]-[format] (e.g. UKDA4651-spss). In this top level directory
there will usually be 2 folders, one containing the data and named according to format (e.g. SPSS, Stata, tab, rtf) and one
containing documentation and called mrdoc (short for machine readable documentation). Occasionally, there will also be a
folder called code, which will contain command files that create derived variables or aid analysis in some way.
At this top level, there may also be the following files:
- read[study number].txt (or .htm) and rd[series number].txt (or .htm) files - these contain information about processing
levels, and information to aid use of the data additional to the user guides supplied by the data depositor and the text of
any additional agreements on conditions of use. The rd[series number].txt (or .htm) file provides additional information
that relates to the entire series (e.g. every quarter of the Labour Force Survey).
- [study number]_file_information.rtf - this contains a list of the files included in the download and usually file labels to
help identify the file contents.
The mrdoc folder contains a number of other folders which contain the user guides supplied with the data by the data
depositor - these are usually in PDF format. In addition, for studies processed from April 2004 onwards, users are
supplied with a UK Data Archive data dictionary called [filename]_UKDA_Data_Dictionary.rtf. Users are also supplied with a
file containing information on how to cite and acknowledge the data in publications and a shortened version of the
information available in the online catalogue record - this file is called UKDA_Study_[study number]_Information.htm
(in some downloads it may be called cite[study number].txt).
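The layout described above can be checked with a short Python sketch. It assumes the usual naming pattern UKDA[study number]-[format] with a numeric study number; because the guide says the folder is only "usually" named this way, the pattern and the helper name describe_download are assumptions, not a guaranteed rule.

```python
import re
from pathlib import Path

# Assumed pattern for the usual top level folder name, e.g. UKDA4651-spss.
TOP_LEVEL = re.compile(r"^UKDA(?P<study>\d+)-(?P<format>\w+)$")


def describe_download(top_level_dir):
    """Summarise an unzipped download: study number, format and subfolders."""
    top = Path(top_level_dir)
    match = TOP_LEVEL.match(top.name)
    if match is None:
        raise ValueError(f"{top.name!r} does not follow the usual naming pattern")
    subfolders = sorted(p.name for p in top.iterdir() if p.is_dir())
    return {
        "study_number": match.group("study"),
        "format": match.group("format"),
        "has_mrdoc": "mrdoc" in subfolders,   # documentation folder
        "has_code": "code" in subfolders,     # occasional command files
        "data_folders": [n for n in subfolders if n not in ("mrdoc", "code")],
    }
```

A download without an mrdoc folder, or with a top level name that raises ValueError here, is not necessarily faulty; it may simply be one of the less common layouts.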
List of commonly occurring file extensions
- txt - plain text format can be opened using any text editor or word processing software (such as Word)
- rtf - rich text format can be opened using any word processing software
- pdf - portable document files can be opened using Adobe Acrobat Reader
- htm - web/html files can be opened using a web browser or Word
- por - SPSS portable files
- sav - SPSS system files
- dta - Stata files
- tab - tab delimited text files
If you do not have Adobe Acrobat Reader, you will need to install it:
Get Adobe Acrobat Reader
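The extension list above can also be expressed as a simple lookup, which may be convenient when sorting through a large download. The following Python sketch is illustrative only; the names FILE_TYPES and describe_file are hypothetical, and the descriptions are taken from the list above.

```python
from pathlib import Path

# Extensions and descriptions copied from the list in this guide.
FILE_TYPES = {
    ".txt": "plain text",
    ".rtf": "rich text format",
    ".pdf": "portable document format",
    ".htm": "web/HTML file",
    ".por": "SPSS portable file",
    ".sav": "SPSS system file",
    ".dta": "Stata file",
    ".tab": "tab-delimited text file",
}


def describe_file(filename):
    """Return a description of a file based on its extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    return FILE_TYPES.get(suffix, "unknown")
```

For example, describe_file("4651_file_information.rtf") returns "rich text format", and any extension not in the list returns "unknown".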
Format availability and advice on format
Quantitative data
SPSS
This is the most popular dissemination format. The files supplied by the UK Data Archive (UKDA) are
in SPSS portable (.por) format for older studies, and SPSS system (.sav) format for newer studies (processed from October
2005 onwards). SPSS portable files open in all versions of SPSS, and SPSS system files open in all recent versions of
SPSS on all platforms. Where .sav files include variable names that have more than eight characters,
a file of the name [study number]_SPSS_varnameinfo.txt is also supplied. This file contains a lookup table of extended and
abbreviated SPSS variable names. This allows users of SPSS version 11.5 or earlier (which abbreviates
long variable names) to match the abbreviated variable names shown in SPSS to the full variable names.
Studies processed from April 2004 onwards are also supplied with a
UKDA Data Dictionary file, which is named [data file name]_UKDA_Data_Dictionary.rtf and should be more easily
readable than SPSS data dictionaries or Stata codebooks. For an example,
see UK Data Archive Data Dictionary.
Stata
This format is increasing in popularity, and is recommended for surveys that require weighting and other survey design
effects to be incorporated into their analysis. Studies are made available in Stata 6 format, and, for studies processed
from October 2005 onwards, in Stata 8 format. If one or more data files have more than 2,047 variables (the limit for
version 6 and the 'intercooled' versions 7 and 8), the data files are (from April 2004 onwards) made available in
Stata version 8 Special Edition format.
Versions are indicated in the names of the zipped download bundle i.e.
[study number]stata6 (version 6), [study number]stata8 (version 8), and [study number]stata8se (version 8 special
edition).
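As a rough illustration of the rules above, the following Python sketch checks a set of variable counts against the 2,047-variable limit and reads the Stata version from a bundle name. The helper names are hypothetical, and the optional ".zip" ending on the bundle name is an assumption.

```python
# Limit quoted above for Stata 6 and the 'intercooled' versions 7 and 8.
STATA_STANDARD_MAX_VARIABLES = 2047

# Bundle name endings described above, mapped to the version they indicate.
BUNDLE_SUFFIXES = (
    ("stata8se", "8 Special Edition"),
    ("stata8", "8"),
    ("stata6", "6"),
)


def needs_special_edition(variable_counts):
    """Return True if any data file has more than 2,047 variables."""
    return any(count > STATA_STANDARD_MAX_VARIABLES for count in variable_counts)


def stata_version_from_bundle(bundle_name):
    """Return the Stata version indicated by a bundle name, or None."""
    name = bundle_name.lower()
    if name.endswith(".zip"):  # assumption: the name may include the extension
        name = name[:-4]
    for suffix, version in BUNDLE_SUFFIXES:
        if name.endswith(suffix):
            return version
    return None
```

For instance, a study whose files hold 120 and 2,048 variables would need the Special Edition format, and a bundle named [study number]stata8se would be reported as version "8 Special Edition".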
Stata data handling limits are generally slightly less generous than those of SPSS, so some loss or truncation of information,
such as variable and value label loss or truncation and loss of user missing value definitions, is inevitable. The UKDA has
developed its own scripts to guarantee optimal translation of data between SPSS and Stata - for more information, see
Data Management on the ESDS web site. For studies
processed from April 2004 onwards, the file [Study Number]_SPSS_to_STATA_conversion.rtf will be supplied with the data.
This provides a log of any information that has been lost or truncated upon translation.
For an example, see UK Data Archive SPSS to Stata Conversion
Information File.
Users can then locate the full label and user missing value information in the UKDA's data dictionary files,
named [data file name]_UKDA_Data_Dictionary.rtf. For an example,
see UK Data Archive Data Dictionary.
Tab-delimited text
This is an entirely generic format that stores just the variable names and the rectangular matrix of data (there is no
information on variable formats, label information or missing value definitions). The character set is normally ASCII but
may be Unicode.
ESDS recommends ordering data in tab-delimited format where this is the most effective means of reading the data into a specialist
analysis package. When data are supplied in tab-delimited format, data dictionary or database structure information will also be provided.
Depending on the application from which the tab-delimited data were created, this file will be named either:
[data file name]_variableinformation.rtf or [data file name]_UKDA_Data_Dictionary.rtf
Although tab-delimited format is suitable for use in Excel 2003, the maximum number of columns (variables) is 256 and
the maximum number of rows (cases) is 65,536. However, Excel 2007 supports 16,384 columns (variables) and 1,048,576 rows (cases).
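Before opening a tab-delimited file in Excel, it may help to count its variables and cases and compare them with the limits quoted above. The Python sketch below does this with the standard csv module. It assumes the first line of the file holds the variable names and that this header line occupies one worksheet row; the names EXCEL_LIMITS, table_shape and fits_in are hypothetical.

```python
import csv
import io

# Worksheet limits quoted in this guide.
EXCEL_LIMITS = {
    "Excel 2003": {"columns": 256, "rows": 65536},
    "Excel 2007": {"columns": 16384, "rows": 1048576},
}


def table_shape(tab_file):
    """Return (variables, cases) for an open tab-delimited text file."""
    reader = csv.reader(tab_file, delimiter="\t")
    header = next(reader)            # assumed: first line holds variable names
    cases = sum(1 for _ in reader)   # every remaining line is one case
    return len(header), cases


def fits_in(version, columns, cases):
    """Check whether a table fits within the limits of an Excel version."""
    limits = EXCEL_LIMITS[version]
    # Assumption: the header line takes up one worksheet row as well.
    return columns <= limits["columns"] and cases + 1 <= limits["rows"]


# Small in-memory example standing in for a real .tab file.
sample = io.StringIO("id\tage\tsex\n1\t34\t2\n2\t51\t1\n")
columns, cases = table_shape(sample)
print(columns, cases, fits_in("Excel 2003", columns, cases))
```

With a real download, the io.StringIO object would be replaced by open(path, newline="") on the .tab file.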
SAS
Due to limited demand, data are not routinely made available in SAS format by the UKDA. However, the UKDA will create SAS
formats upon request. The standard method of delivery of SAS datasets is as a fixed-width ASCII file with a .sas command
file to read the data into SAS and create the formats library. This means that a SAS dataset can be created in almost
any version of SAS running on any operating system. Users can also decide whether to preserve the SPSS 'user missing' codes
or collapse them into the SAS system missing code, since this is supplied as a discrete block of commands in the .sas file,
under the heading /* User Missing Value Specifications */
R
This open source variant of S-Plus is gaining in popularity as it offers advanced functionality not present in SPSS or in
some instances even Stata. However, the ESDS does not make data available in R format, since R will read both SPSS and
Stata formats (using the 'read.spss' and 'read.dta' commands).
Other formats
Occasionally, datasets are not suitable for SPSS and Stata. This occurs when, for example, unstructured or
semi-structured interviews record literal textual responses of greater than 255 characters. While packages such as Excel
and Access can store these long strings, statistical packages (like SPSS prior to version 13 and Stata) cannot. In these
cases, the data are made available in format(s) that do not truncate these long strings. This will typically be a choice of
a proprietary format (e.g. MS Excel or Access) and tab-delimited text. MS Access 'data documenter' information is provided
for each table when data are extracted from an MS Access database.
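To see whether a tab-delimited file contains text responses longer than the 255-character limit mentioned above, the columns can be scanned with a short Python sketch. This is an illustration under stated assumptions: the first line holds the variable names, fields are not enclosed in quotation marks, and no response contains an embedded line break. The function name columns_with_long_strings is hypothetical.

```python
import csv
import io

MAX_STRING_LENGTH = 255  # string limit quoted above


def columns_with_long_strings(tab_file, limit=MAX_STRING_LENGTH):
    """Return {variable name: longest length} for values longer than limit."""
    # QUOTE_NONE treats quotation marks as ordinary characters (assumption:
    # the file does not use quoting to protect tabs inside a value).
    reader = csv.DictReader(tab_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    longest = {}
    for row in reader:
        for name, value in row.items():
            if name is None or value is None:
                continue  # ragged line with extra or missing fields
            if len(value) > limit:
                longest[name] = max(longest.get(name, 0), len(value))
    return longest


# Small in-memory example standing in for a real .tab file.
sample = io.StringIO("id\tanswer\n1\tshort reply\n2\t" + "x" * 300 + "\n")
print(columns_with_long_strings(sample))
```

Any variable reported here would be truncated by a package limited to 255-character strings, which is the situation in which the alternative formats described above are supplied.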
Qualitative data
Rich text format (.rtf)
Rich text format is used for the majority of qualitative studies. Rich text format files will open into most text editors
and almost all word processing packages.
Portable document format (.pdf)
PDF format is used when data were only available to the ESDS as hard copy (paper) and the level of ESDS data processing
did not permit Optical Character Recognition to convert this into text. In such instances the hard copy material is
scanned into image files (400 dpi TIFFs) and these are converted into PDF format.
CAQDAS format
Where qualitative data have been coded and analysed using Computer Assisted Qualitative Data Analysis Software (CAQDAS),
these files may also be supplied, in addition to the 'raw' transcripts in rich text format.