Smart Qualitative Data

Smart Qualitative Data (SQUAD) was a demonstrator project that explored methodological and technical solutions for exposing digital qualitative data so that they are fully shareable, exploitable and archivable for the longer term. It examined ways of defining and describing data context, XML standards, and text mining tools for anonymising data.

Initially, the project dealt with specifying and testing flexible means of storing and marking up, or annotating, qualitative data using universal standards and technologies, in particular eXtensible Mark-up Language (XML). Such tools are required to exploit fully the potential of qualitative data for adventurous collaborative research using web-based and e-science systems, for example by linking multiple data and information sources such as text, statistics and maps. A community standard, or schema, was proposed that would apply to most kinds of qualitative data and might also serve as a longer-term preservation format.
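As a rough illustration of what marking up qualitative data in XML can look like, the sketch below builds a small XML document for an interview transcript using Python's standard library. The element and attribute names (`transcript`, `context`, `item`, `body`, `turn`, `speaker`) are hypothetical and chosen only for this example; they are not the schema proposed by the project.

```python
import xml.etree.ElementTree as ET


def build_transcript(turns, context):
    """Build a small XML tree for an interview transcript.

    All element and attribute names here are hypothetical,
    not the SQUAD schema.
    """
    root = ET.Element("transcript")

    # Contextual details, e.g. the interview setting.
    ctx = ET.SubElement(root, "context")
    for key, value in context.items():
        ET.SubElement(ctx, "item", name=key).text = value

    # One <turn> element per speaker turn.
    body = ET.SubElement(root, "body")
    for speaker, text in turns:
        ET.SubElement(body, "turn", speaker=speaker).text = text

    return root


turns = [
    ("interviewer", "How long have you lived here?"),
    ("respondent", "About twenty years."),
]
context = {"setting": "respondent's home"}

root = build_transcript(turns, context)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Because the result is plain XML, it can be parsed again by any standard XML tool, which is the property that makes such a format a candidate for sharing and longer-term preservation.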

The second strand investigated optimal requirements for describing or 'contextualising' research data (e.g. interview setting or interviewer characteristics), aiming to develop standards for data documentation and ways of capturing this information.

The third strand experimented with natural language processing technologies to develop and implement user-friendly tools that semi-automate the preparation of marked-up qualitative data.
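To give a flavour of semi-automated mark-up for anonymisation, the sketch below wraps known identifying strings in an XML element that carries a replacement pseudonym, leaving the result for a human to review. It uses a simple lookup table and a regular expression rather than real natural language processing, and it is not the project's toolset; the function `mark_up_entities`, the `entity` element and the `pseudonym` attribute are all hypothetical.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def mark_up_entities(text, pseudonyms):
    """Wrap each known identifying string in a hypothetical <entity> element.

    `pseudonyms` maps an identifying string to its replacement label.
    This is a dictionary lookup, standing in for a real NLP component.
    """
    if not pseudonyms:
        return escape(text)

    # Try longer strings first so "Mary Smith" wins over "Mary".
    names = sorted(pseudonyms, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")

    parts = []
    last = 0
    for match in pattern.finditer(text):
        found = match.group(1)
        # Escape the untouched text between matches so the output stays well-formed.
        parts.append(escape(text[last:match.start()]))
        parts.append(
            "<entity pseudonym=%s>%s</entity>"
            % (quoteattr(pseudonyms[found]), escape(found))
        )
        last = match.end()
    parts.append(escape(text[last:]))
    return "".join(parts)


marked = mark_up_entities(
    "I met Mary Smith in Leeds.",
    {"Mary Smith": "Person A", "Leeds": "City 1"},
)
print(marked)
```

Keeping the original string inside the element, with the pseudonym as an attribute, means a reviewer can check each suggested replacement before an anonymised version is produced.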

The project also advanced tools for publishing and exploring enriched, marked-up qualitative data and associated research materials online.

Archive contribution

Staff from Qualidata at the UK Data Archive led the project. They provided the qualitative data mark-up schema, defined user needs, provided sample qualitative data and manually created XML marked-up text. Claire Grover's team at the Human Communication Research Centre in Edinburgh was responsible for developing the natural language processing toolsets, including automated XML mark-up and user-friendly Java interfaces to the mark-up tools. Both sites contributed to user testing, evaluation and documentation activities.

Principal Investigator: Louise Corti
Funder: ESRC
Dates: March 2005 - August 2006
Contact: Louise Corti