What are the benefits of opening up your data? And where do you start?

Thursday, 11 September 2014, 12:09.

The transparency benefits of opening up public sector data are clear: data on school results, hospital waiting times and government expenditure all drive a focus on improvement. To go further, Sir Tim Berners-Lee recommends publishing that data as linked data, which is both human readable and machine readable. He has set out a 5 star open data ranking in which one star represents making data available on the web under an open licence, and 5 star data is structured, in non-proprietary formats, uses URIs and is published as linked data.
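
The ranking is cumulative: each star assumes the ones before it. As a rough illustration only, the sketch below scores a dataset against the five criteria. The `Dataset` class and its field names are hypothetical, invented for this example, and are not part of any official tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical flags, one per star of the 5 star open data scheme.
    on_web_open_licence: bool      # 1 star: on the web under an open licence
    structured: bool               # 2 stars: structured, machine readable data
    non_proprietary_format: bool   # 3 stars: e.g. CSV rather than a proprietary format
    uses_uris: bool                # 4 stars: URIs identify the things described
    linked_to_other_data: bool     # 5 stars: published as linked data


def star_rating(dataset: Dataset) -> int:
    """Count stars in order, stopping at the first criterion that is not met."""
    criteria = [
        dataset.on_web_open_licence,
        dataset.structured,
        dataset.non_proprietary_format,
        dataset.uses_uris,
        dataset.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# A CSV file on the web under an open licence, but without URIs.
csv_release = Dataset(True, True, True, False, False)
print(star_rating(csv_release))
```

Because the loop stops at the first unmet criterion, a dataset that links to other data but is not openly licensed still scores zero.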

5 star open data makes information more useful by enabling relationships to be expressed, making it possible to move between data sets and allowing more user-friendly products to be created from that data, such as the police crime map and the Land Registry house price heat maps.

In the webinar, Pete Davis, TSO’s Enterprise Architect, talked about some of the techniques that help to turn data into 5 star data: using URIs, enriching data and transforming data into RDF. Clear, human readable URIs are important for pointing to data: they make it linkable and shareable and help to ensure that the information is persistently available. TSO’s Data Enrichment Service (DES), part of the OpenUp® platform, can identify entities such as names, dates and places within plain text, enabling you to find useful information and apply structure to unstructured content. It then links those entities to other sources of useful information and can transform that information into RDF linked data. You can try it for yourself at http://openup.tso.co.uk/des by pasting in any plain text and pressing submit to see what it identifies.
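
To give a feel for the enrich-then-transform idea, here is a minimal sketch that finds dates in plain text and writes them out as RDF in N-Triples syntax. It is not how the DES works internally; the regular expression, the `example.org` subject URI and the choice of the Dublin Core `date` property are all assumptions made for illustration.

```python
import re
from datetime import date

MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]

# Matches dates written as "11 September 2014".
DATE_PATTERN = re.compile(r"\b(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})\b")

# Vocabulary terms chosen for this sketch.
DCT_DATE = "http://purl.org/dc/terms/date"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def find_dates(text: str) -> list[str]:
    """Return each 'D Month YYYY' date found in the text as an ISO 8601 string."""
    found = []
    for day, month, year in DATE_PATTERN.findall(text):
        try:
            parsed = date(int(year), MONTHS.index(month) + 1, int(day))
        except ValueError:
            continue  # Skip impossible dates such as "31 February 2014".
        found.append(parsed.isoformat())
    return found


def to_ntriples(subject_uri: str, iso_dates: list[str]) -> list[str]:
    """Build one N-Triples statement per date, typed as xsd:date."""
    return [
        f'<{subject_uri}> <{DCT_DATE}> "{iso}"^^<{XSD_DATE}> .'
        for iso in iso_dates
    ]


notice_text = "The order takes effect on 1 October 2014."
# Hypothetical URI for the notice being described.
for triple in to_ntriples("http://example.org/id/notice/1", find_dates(notice_text)):
    print(triple)
```

A real enrichment service would recognise many more entity types and date formats, and would link each entity to an external source rather than only typing a literal.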

Holly Ellis talked about www.thegazette.co.uk, which is one example of where 5 star data is put to good use. The Gazette is an official public record used for legal purposes and contains hundreds of years of useful content. In 2013 TSO completely redeveloped the website and publishing platform to make it easier for people to place notices on the official public record, and for the people who use the data to choose how they access it. TSO’s DES technology makes it easy for people who don’t place notices frequently to submit plain text. The DES identifies the required notice elements within the plain text, such as the effective date, and tags that information so that it is searchable on the website. Publishing the content as data has made it much more searchable, re-usable and valuable, with the data in the notices linked to other useful sources such as Ordnance Survey maps and Companies House data. In a groundbreaking move, The Gazette publishes the provenance trail of the published data as linked data, providing complete transparency of every action that happens to the data on its journey from submission to publication, to guarantee authenticity.

Peter Camilleri talked about other examples of where 5 star data has been used. These include the www.legislation.gov.uk website, which was the first linked data statute book in the world and is now updated using linked data principles; the ONS linked data portal; the Environment Agency catchment data explorer; and the British Library’s British National Bibliography.

For those thinking about opening up data TSO offers the following advice:

■   Make your data structured from the start – capturing information in a structured way from the outset, introducing standardised creation techniques and validating that structure will help to make it more discoverable later on.

■   Find the information hidden away in unstructured content – enrich your data to enable useful information to be extracted and linked to other useful information sources.

■   Encourage re-use of your information – use a URI structure that makes it easy for people to find and share your information and aspire to 5 star open data, published as RDF under an open licence that will enable a wider audience to benefit from your valuable data.
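
As a small illustration of the last point, the sketch below mints readable URIs from human labels. The base address `http://data.example.org` and the `/id/{kind}/{label}` layout are assumptions made for this example, not a pattern prescribed by TSO.

```python
import re
import unicodedata


def slugify(label: str) -> str:
    """Reduce a label to lower-case ASCII words joined by hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", label)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def mint_uri(base: str, kind: str, label: str) -> str:
    """Combine a base address, a kind of thing and a label into one URI."""
    return f"{base.rstrip('/')}/id/{kind}/{slugify(label)}"


print(mint_uri("http://data.example.org", "school", "St. Mary's Primary School"))
# http://data.example.org/id/school/st-mary-s-primary-school
```

Labels are not guaranteed to be unique or stable, so a production scheme would usually pair a readable slug with a permanent identifier to keep URIs persistent.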

