Enrich

To make data open, linked and reusable, it must be unlocked from the usual human-readable forms of print and HTML and converted to linked, machine-readable data.
TSO uses text analysis frameworks such as GATE and UIMA to enrich content automatically and extract information from it. Enriching content provides the granularity of markup needed to convert it to linked data. This approach works for raw text, semi-structured data or structured data, and can be integrated with templates to improve automation.
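As a rough illustration of what enrichment means in practice, the sketch below wraps dates found in raw text in inline markup. It is not GATE or UIMA and does not represent TSO's pipeline: a single regular expression stands in for the text analysis framework, and the `<date>` element name and the `enrich` function are assumptions made for the example.

```python
import re
from html import escape

# Hypothetical pattern for dates such as "12 March 2010". A real pipeline would
# rely on a text analysis framework rather than one regular expression.
DATE = re.compile(
    r"\b(\d{1,2}) (January|February|March|April|May|June|July|August|"
    r"September|October|November|December) (\d{4})\b"
)


def enrich(text: str) -> str:
    """Return text with each recognised date wrapped in a <date> element.

    The element name is illustrative only. All other characters are escaped
    so the result can be embedded in XML or HTML.
    """
    parts = []
    last = 0
    for match in DATE.finditer(text):
        parts.append(escape(text[last:match.start()]))  # text before the date
        parts.append(f"<date>{escape(match.group(0))}</date>")
        last = match.end()
    parts.append(escape(text[last:]))  # trailing text after the last date
    return "".join(parts)


if __name__ == "__main__":
    print(enrich("Notice published on 12 March 2010 & filed."))
```

Once dates, names or places carry markup like this, each marked span can be mapped to a property in a linked data vocabulary.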
TSO has experts in converting data into the formats required to publish linked data and make it discoverable, including:
- RDF, the recommended format for linked data
- XML
- XHTML+RDFa
- Atom.
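To show what the RDF output can look like, here is a minimal sketch that serialises a few extracted values as N-Triples, one of the standard plain-text RDF syntaxes. The subject URI under `example.org` and the helper names are hypothetical, and the Dublin Core predicates are chosen purely for illustration. It is not a description of TSO's conversion tools.

```python
from typing import Mapping


def ntriples_literal(value: str) -> str:
    """Quote a string as an N-Triples literal, escaping special characters."""
    escaped = (
        value.replace("\\", "\\\\")  # backslash first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(subject: str, properties: Mapping[str, str]) -> str:
    """Emit one triple per property, with plain string literals as objects."""
    lines = [
        f"<{subject}> <{predicate}> {ntriples_literal(value)} ."
        for predicate, value in properties.items()
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical notice URI and values, for illustration only.
    print(
        to_ntriples(
            "http://example.org/notice/1",
            {
                "http://purl.org/dc/terms/title": "Example notice",
                "http://purl.org/dc/terms/issued": "2010-03-12",
            },
        ),
        end="",
    )
```

The same values could equally be embedded in a page as RDFa attributes or carried in an Atom feed. N-Triples is used here only because it is the simplest syntax to generate by hand.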
Read how TSO has opened up London Gazettes data.