QualiBank update

It has been six months since we launched our beta version of QualiBank with 6 open collections. Since then we have had time to examine feedback and publish more data collections. We are happy that the system is fast and reliable. I would especially like to thank my colleague Darren Bell, who has been the technical lead and mastermind on this project, and who translated our early vision for this system into a reality. He has enabled the UK Data Service to embrace XML databases, improve our Solr indexing methods and adopt a fine-grained citation approach.

UK Federated authentication was added in August, which has enabled us to publish a number of collections for which users must log in with their local institution passwords (and be registered users of the UK Data Service).

As a reminder, some of the open data highlights include:

  • British Archive of Political History – a pilot project conducted in 1979-1980 aiming to collect a systematic oral archive of interviews with key figures from politics, the civil service and the armed forces. The interviewees selected were predominantly former Permanent Secretaries or former Cabinet Ministers, and were asked questions relating to the whole period of their official life, concentrating on the post-war years.
  • Morale and Home Intelligence Reports – government documents containing rich descriptions of attitudes and opinions of soldiers and the UK public during WWII, focusing on various topics including confidence in commanding officers, the government and the war effort, postal censorship, leisure and entertainment, and personal finances and leave.
  • The Edwardians – the much-loved life story collection that examined family life and work experience in early twentieth-century Britain and contains over 450 oral histories of individuals from this time period, drawn from a working class point of view.
  • School Leavers Study – a collection of 141 essays written by school children from the Isle of Sheppey in 1978 where they were asked to imagine that they were 60 and write a short, reflective account of their life.

The Digital Futures project, from which our QualiBank was developed, came to a formal close in April. There is some final work to do in streamlining the publishing process, as it currently involves steps that require developer intervention. We would like to automate the workflow as much as possible, though creating compliant XML documents typically involves some bespoke programmatic work, such as converting text formats. A technical paper on the system is being drafted, along with guidance for preparing and processing data into QualiBank, which will be added shortly to our UK Data Service qualitative data processing procedures. Finally, we are excited that our proposed special edition journal, comprising papers from users of our collections and those involved in the world of enhanced publications, is currently in progress, with an expected publication date of mid-2015.

UK QualiBank launched

In April 2014 we launched the first iteration of QualiBank – our qualitative data searching and browsing system that we have been developing under the UK Data Service’s Digital Futures project.  This contained 5 open collections. In June we launched a second version with collections behind our authentication system.

QualiBank, discover.ukdataservice.ac.uk/QualiBank, is the UK Data Service’s search and browse interface for qualitative data objects allowing searching of the content of text files, such as interviews, essays, open ended questions and reports. It also allows searching of metadata attached to these objects, such as a description of an image or of an audio recording, and it enables hyperlinking to related objects. A citation can be made for a whole object or interview extracts, and this is quite an exciting development that has been praised by the data citation community.  
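For readers curious how a combined content and metadata search of this kind might be expressed, here is a minimal sketch of building a Solr select request in Python. It is not the query QualiBank actually uses: the core URL and the field names `content`, `description` and `collection` are assumptions made purely for illustration. Only the standard Solr request parameters (`q`, `fq`, `rows`, `wt`, `hl`, `hl.fl`) are taken as given.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_url(base_url, terms, collection=None, rows=10):
    """Build a Solr /select URL that searches text content and metadata.

    The field names (content, description, collection) are hypothetical.
    """
    params = [
        # Look for the terms in the transcript text or in the object description.
        ("q", f"content:({terms}) OR description:({terms})"),
        ("rows", str(rows)),
        ("wt", "json"),
        # Ask Solr to return highlighted snippets from the text content.
        ("hl", "true"),
        ("hl.fl", "content"),
    ]
    if collection:
        # A filter query narrows results to one collection without affecting scoring.
        params.append(("fq", f'collection:"{collection}"'))
    return f"{base_url}/select?{urlencode(params)}"


url = build_search_url(
    "http://localhost:8983/solr/example_core",  # assumed local core, not a real endpoint
    "family life",
    collection="The Edwardians",
)
decoded = parse_qs(urlsplit(url).query)
print(url)
```

Keeping the collection restriction in a filter query rather than in the main query is a common Solr pattern, since it leaves relevance ranking to the search terms alone.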

We have a user guide coming, a technical paper on its development to follow and a streamlined data publishing system currently under construction.  In the coming months we will also be releasing various technical components of the system for use by others, including a metadata entry tool and a loading and validation package.  Please contact us if you would like further details, such as the metadata schemas (DDI, QuDEx and TEI) or the Solr queries we use.

Many thanks to those who contributed collections and to those who helped us get this project on the road!

Progress

We have now digitized 17 collections, amassed from 8 different institutions.  We are currently preparing the digital data to be fed through our new online system and entering metadata into the QuDEx system. We are still open to considering new collections for digitization.  If you have a use case, please get in touch at mahaak@essex.ac.uk.

…..

Louise Corti recently gave a presentation on the Digital Futures project at the University of Lausanne.  We also have a webinar coming up, called ‘Introducing the Digital Futures system: browsing and citing qualitative data online’.  This is expected to take place in December; registration details to come.

Posted by: Maureen Haaker

New Collections Added!

We’ve added new collections to the list!  We’re excited to be working with Mass Observation Archive on the digitisation and enhancement of the belonging directive as well as with the LSE Archives on the digitisation of select transcripts from the British Oral Archive of Political and Administrative History.

Keep checking back for more updates on the development of QuDEx and the online data delivery site.

Posted by: Maureen Haaker

Implementing QuDEx metadata for the Digital Futures project

In April we began testing the UK Data Archive’s qualitative XML metadata schema, QuDEx, using a number of test data collections. QuDEx is designed to work with the Data Documentation Initiative (DDI) and enables powerful markup of data at the object and sub-object level.  In addition to its rich descriptive capabilities, a further strength of QuDEx is that it enables relationships to be created between files, parts of files and annotations: for example, between an interview transcript, its audio recording, a text excerpt and a related code and memo. A webinar presentation on our use of QuDEx was given to a number of European archives in June 13. Thanks to my technical colleagues, Darren Bell at the UK Data Archive and Agustina Martinez at Liverpool John Moores, for contributing to this.
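To make the idea of relationships between files, parts of files and annotations more concrete, here is a small Python sketch. It is an illustration only and does not reproduce the QuDEx schema: the class names, identifiers and relationship labels are all invented for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """A labelled link between two identified things (hypothetical model)."""
    source: str  # identifier of a file, excerpt, code or memo
    target: str
    kind: str    # free-text label describing the relationship


@dataclass
class RelationStore:
    relations: list = field(default_factory=list)

    def link(self, source, target, kind):
        self.relations.append(Relation(source, target, kind))

    def related_to(self, identifier):
        """Return (kind, other_identifier) pairs linked in either direction."""
        found = []
        for rel in self.relations:
            if rel.source == identifier:
                found.append((rel.kind, rel.target))
            elif rel.target == identifier:
                found.append((rel.kind, rel.source))
        return found


store = RelationStore()
# All identifiers below are made up for the example.
store.link("int001.xml", "int001.mp3", "transcript-of")
store.link("int001.xml#excerpt1", "int001.xml", "excerpt-of")
store.link("code:family-life", "int001.xml#excerpt1", "codes")
store.link("memo:1", "int001.xml#excerpt1", "annotates")

links = store.related_to("int001.xml#excerpt1")
print(links)
```

Starting from the excerpt, the lookup reaches the parent transcript, the code and the memo, which is the kind of navigation between objects and sub-objects described above.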

We have developed a basic metadata entry tool for describing objects (mostly interview files, some images) in each collection and have agreed on a simple Text Encoding Initiative (TEI) template for producing textual transcripts. These are being stored in BaseX, an XML database. This has been a learning curve for us, as typically we work with relational databases here at the Archive.
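As a rough illustration of what a simple TEI transcript can look like, the sketch below builds one with Python's standard library. It is not our actual template: the choice of elements (a minimal `teiHeader` plus one `u` utterance per speaker turn), the speaker codes and the placeholder text are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialise TEI as the default namespace rather than with an ns0: prefix.
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name for a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_transcript(title, turns):
    """Build a minimal TEI document from (speaker, words) pairs."""
    root = ET.Element(tei("TEI"))

    # Minimal header: title, publication and source statements.
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    publication = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(publication, tei("p")).text = "Placeholder publication statement"
    source = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source, tei("p")).text = "Placeholder source description"

    # One <u> (utterance) element per speaker turn, numbered in order.
    body = ET.SubElement(ET.SubElement(root, tei("text")), tei("body"))
    for number, (speaker, words) in enumerate(turns, start=1):
        utterance = ET.SubElement(
            body, tei("u"), {"who": f"#{speaker}", "n": str(number)}
        )
        utterance.text = words
    return root


doc = build_transcript(
    "Example interview",  # invented title and turns, for illustration only
    [("INT", "Where did you grow up?"), ("RES", "Not far from here.")],
)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

A document produced this way is plain XML text, so it could then be loaded into an XML database and queried alongside its metadata.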

We are building the qualitative data browsing system on the existing .NET technology and infrastructure behind our UK Data Service Discovery portal. We have four test collections, comprising in-depth interview transcripts, open-ended survey questions, annotated questionnaires in PDF format and images.

We will be holding another webinar in late October/November to showcase the beta Digital Futures system. Come and find out more there!

New Collection Digitised

We are finishing up the digitisation of Allan Silver’s classic study of “Angels in Marble”, a study of Working Class Conservatives in Urban England and moving on to new collections! So far we have digitised Pahl’s School Leavers Study and reprocessed and prepared XML text for three interview collections:

Also in queue for digitisation and enhancement are the two interview studies:

Next week we will be visiting the National Archives to select a sample for digitisation from the array of treasury-tagged, WWII Morale Reports. These government reports are written in a narrative fashion and contain rich descriptions of attitudes and opinions of soldiers and civilians during the war. For more information about these reports, please visit TNA’s Discovery catalogue. We are also working with the Modern Records Centre on plans to digitise Charles Critcher’s “Coal and Community” and Richard Brown’s “Shipbuilding on Tyneside”. If you are interested in the re-use of any of this data, please get in touch with us at mahaak@essex.ac.uk.

We have also made huge progress on the development of the Digital Futures website.  We are reviewing the prototype browsing system and finalizing search facets. The underlying metadata schema, QuDEx, is proving to be powerful and highly adaptable. The new website is set to be launched in Spring of 2014.

Posted by Maureen Haaker

Welcome to the Digital Futures Blog

What is Digital Futures?

In order to access primary qualitative data now and in the future, we need data to be accurately, richly and contextually described. The current challenge is thus to make data open, accessible, citable and linkable, via the medium of the internet.

The Digital Futures project, funded by ESRC, encompasses two strands of work which cover requirements gathering, system design, implementation and testing, preparation of enhanced qualitative data to populate the system, and user outputs making use of the linked scholars.

In the first strand we are developing a user-friendly system for publishing and exploring qualitative data online, to be used by the UK Data Service, and potentially offering opportunities for publishing qualitative data for other data services internationally.

The second strand is enhancing existing data, currently in non-digital format, working within a user-driven framework guided by linked scholars and teachers who will be actively working on some of the data.  Enhanced data, such as digitized open-ended questions from surveys and transcribed interviews, will be published into the online delivery system developed in the first strand of work.

We will be updating this site periodically with progress of the digitization and data enhancement of the Digital Futures collection, as well as with updates on our metadata approach and technical developments.

If you want to get in touch with us regarding the reuse of data from this collection, please contact mahaak@essex.ac.uk.

More updates to come!

Posted by Maureen Haaker and Louise Corti