Archive

Posts Tagged ‘linked data web’

Introducing RAGLD

December 21, 2011 1 comment

RAGLD (Rapid Assembly of Geo-centred Linked Data) is a project to develop a software component library that supports the rapid assembly of geo-centred linked data applications.

The advent of new standards and initiatives for data publication in the context of the World Wide Web (in particular the move to linked data formats) has resulted in the availability of rich sources of information about the changing economic, geographic and socio-cultural landscape of the United Kingdom, and many other countries around the world. In order to exploit the latent potential of these linked data assets, we need to provide access to tools and technologies that enable data consumers to easily select, filter, manipulate, visualize, transform and communicate data in ways that are suited to specific decision-making processes.

In this project, we will enable organizations to press maximum value from the UK’s growing portfolio of linked data assets. In particular, we will develop a suite of software components that enables diverse organizations to rapidly assemble ‘goal-oriented’ linked data applications and data processing pipelines in order to enhance their awareness and understanding of the UK’s geographic, economic and socio-cultural landscape.

A specific goal for the project will be to support comparative and multi-perspective region-based analysis of UK linked data assets (this refers to an ability to manipulate data with respect to various geographic region overlays), and as part of this activity we will incorporate the results of recent experimental efforts which seek to extend the kind of geo-centred regional overlays that can be used for both analytic and navigational purposes. The technical outcomes of this project will lead to significant improvements in our ability to exploit large-scale linked datasets for the purposes of strategic decision-making.

RAGLD is a collaborative research initiative between the Ordnance Survey, Seme4 Ltd and the University of Southampton, and is funded in part by the Technology Strategy Board’s “Harnessing Large and Diverse Sources of Data” programme. Commencing October 2011, the project runs for 18 months.

If you’d like to input into the requirements phase of the project I’d be very grateful if you could fill in one of these questionnaires. Many thanks in advance.

So what can I do with the new Ordnance Survey Linked Data?

October 25, 2010 7 comments

In a previous post I wrote up some of the features of the new Ordnance Survey Linked Data. In this blog post I want to run through a concrete example of the sort of thing you can build using this linked data.

A while ago Talis built their BIS Explorer. The aim of this application was to allow users to “identify centres of excellence at the click of a button” and more can be read about the application here. This data mash-up took different data sources about funded research projects and joined them together using linked data. In the original application you could, for example, look at funded research projects by European Region in Great Britain. This can be seen here. At the time this demo was created Ordnance Survey had yet to publish its postcode data as linked data, but if it had, it would have been very easy to get a more fine-grained view of research projects down at the county and district level. Here’s how…

The basic data model of the original BIS data was fairly straightforward. Universities and businesses have a link to the projects they worked on. For each university there is also postcode information. Things get interesting if, instead of (or as well as) linking to a string representation of a postcode, you link to the URI for said postcode. This can be done by using the property:

http://data.ordnancesurvey.co.uk/ontology/postcode/postcode

So say we wanted to do this for Imperial College. All we need is this triple (shown here in N-Triples format) in our data:

<http://education.data.gov.uk/id/institution/ImperialCollegeOfScienceTechnologyAndMedicine> <http://data.ordnancesurvey.co.uk/ontology/postcode/postcode> <http://data.ordnancesurvey.co.uk/id/postcodeunit/W68RF> .
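If you have many institutions to link, the triple above can be generated rather than typed. Here is a minimal Python sketch. It assumes that a postcode unit URI is simply the postcode with its whitespace removed and in upper case, which matches the W68RF example here but is my assumption rather than a documented rule. The function names are made up for illustration.

```python
POSTCODE_PROPERTY = "http://data.ordnancesurvey.co.uk/ontology/postcode/postcode"
POSTCODE_UNIT_BASE = "http://data.ordnancesurvey.co.uk/id/postcodeunit/"


def postcode_uri(postcode: str) -> str:
    """Turn a postcode string such as 'W6 8RF' into a postcode unit URI.

    Assumption: the URI uses the postcode with whitespace stripped, upper-cased.
    """
    return POSTCODE_UNIT_BASE + "".join(postcode.split()).upper()


def postcode_triple(subject_uri: str, postcode: str) -> str:
    """Return one N-Triples line linking subject_uri to its postcode URI."""
    return f"<{subject_uri}> <{POSTCODE_PROPERTY}> <{postcode_uri(postcode)}> ."


imperial = (
    "http://education.data.gov.uk/id/institution/"
    "ImperialCollegeOfScienceTechnologyAndMedicine"
)
print(postcode_triple(imperial, "W6 8RF"))
```

Each call produces a single line, which is what N-Triples expects, so the output can be appended straight to a file and loaded into a triple store.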

Now, by the power of linked data, connecting to a resource for the postcode means we can now enrich our university dataset with knowledge of the ward, county and district the university is in. Also, given that the university is connected to a project we have a link from project to region. Through the link from project to university to postcode to region we can now start to have a more finely grained view on which areas are getting more funding.

So how do we do this in practice? These are the steps I followed.

  1. Download the BIS data from here and load it into a triple store (linked data database) of your choice. There are plenty of good open source ones available e.g. Sesame or TDB to name two.
  2. I then added the links to postcode URIs as described above.
  3. Following that I loaded the data for the postcodes in a similar manner to that described here. A relatively simple script retrieved the RDF for the relevant postcodes and loaded the RDF into my store. The nice thing about linked data and RDF is that the stores are like a big bucket of data and you can keep throwing more and more in. Hopefully future linked data tools will make this step trivial, but for now some scripting was required.
  4. Job done. I now have links from research projects to regions.
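To make the "big bucket" idea concrete, here is a toy Python sketch of the steps above. It is not the store I used: the bucket is a plain Python set, every URI under example.org is made up, and the "workedOn" property is hypothetical because the real BIS vocabulary is not shown in this post. Only the two Ordnance Survey postcode properties are real.

```python
# All example.org URIs below are invented for illustration.
EX = "http://example.org/"
WORKED_ON = EX + "workedOn"  # hypothetical property: organisation -> project
POSTCODE = "http://data.ordnancesurvey.co.uk/ontology/postcode/postcode"
DISTRICT = "http://data.ordnancesurvey.co.uk/ontology/postcode/district"

# The "bucket": a set of (subject, predicate, object) triples.
store = set()

# Step 1: load the project data.
store.add((EX + "university/1", WORKED_ON, EX + "project/42"))
# Step 2: add the link from the university to a postcode URI.
store.add((EX + "university/1", POSTCODE, EX + "postcodeunit/AB12CD"))
# Step 3: throw in the data retrieved for that postcode.
store.add((EX + "postcodeunit/AB12CD", DISTRICT, EX + "district/7"))


def objects(subject, predicate):
    """Return every object of triples matching (subject, predicate, ?)."""
    return {o for s, p, o in store if s == subject and p == predicate}


def districts_for_project(project):
    """Follow project <- organisation -> postcode -> district."""
    found = set()
    for organisation, predicate, obj in store:
        if predicate == WORKED_ON and obj == project:
            for postcode in objects(organisation, POSTCODE):
                found |= objects(postcode, DISTRICT)
    return found


# Step 4: we now have a link from a project to a region.
print(districts_for_project(EX + "project/42"))
```

A real triple store does the same join for you through a SPARQL query; the point of the sketch is only that each step adds triples to the same pile and the links do the rest.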

Basically what I created from this was an aggregation of various datasets that you can now query. This is something that is made very easy using linked data and URIs to identify things like postcodes. As more publishers release data in linked data form there is more and more potential for building services and applications on top of aggregations of these datasets.  So that’s what I decided to do…

This application (I make the usual apology for my lack of web development skills and for the slowness, which some caching would no doubt sort out) builds a clickable map view of this data aggregation. The OS OpenSpace API makes it possible to retrieve the unit ID for selected polygons. I can then use this unit ID in a SPARQL query to find the projects funded in that region.

However, it would have been easier if there had been a RESTful API on top of the data aggregation that let me retrieve these results instead of writing SPARQL. So that is what I decided to build next using the Linked Data API. The Linked Data API basically lets you create RESTful shortcuts to relatively complex SPARQL queries. Due to my lack of PHP skills it was an initially bumpy ride getting it to work (see here), but I got there in the end and the result was an API that lets you return research projects by selecting regions either through their SNAC codes or Ordnance Survey IDs, e.g.:

http://www.johngoodwin.me.uk/bis/api/project/county/os/{unit id},

e.g. http://www.johngoodwin.me.uk/bis/api/project/county/os/17765

 

Results can be returned in different formats using content negotiation [1] or by simply adding the relevant .html or .json to the URI, e.g.:

http://www.johngoodwin.me.uk/bis/api/project/euro/os/41424.html

http://www.johngoodwin.me.uk/bis/api/project/euro/os/41424.json
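The URL pattern above is regular enough to build in code. A small Python sketch follows; the helper name is made up, and only the "county" and "euro" region types and the "html" and "json" suffixes appear in this post, so anything else passed in is an assumption.

```python
API_BASE = "http://www.johngoodwin.me.uk/bis/api/project"


def project_api_url(region_type, unit_id, fmt=None):
    """Build the API URL for projects in a region identified by OS unit ID.

    region_type: e.g. "county" or "euro" (the two shown in this post).
    fmt: optional suffix such as "html" or "json"; leave as None to rely
         on content negotiation instead.
    """
    url = f"{API_BASE}/{region_type}/os/{unit_id}"
    if fmt:
        url += "." + fmt
    return url


print(project_api_url("county", 17765))
print(project_api_url("euro", 41424, "json"))
```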

I hope this example shows how linked data can be useful in building applications on top of data aggregations. To summarise:

  1. Publishers release data in linked data format.
  2. Having data in a common format (RDF) with dereferenceable URIs makes it relatively easy to retrieve and aggregate from a number of resources, especially if data is linked to URIs for ‘things’ and not just ‘strings’.
  3. The linked data API makes it possible to build a RESTful service on top of a data aggregation so web developers need not be put off by complex SPARQL queries.
  4. Applications can then be built using these services.

[1] for some reason the HTML conneg only seems to work in Firefox.

/location /location /location – exploring Ordnance Survey Linked Data – Part 2

October 25, 2010 5 comments

Ordnance Survey have now released an update to their linked data, which can be seen here. The new data now includes postcode information as well as a few changes to the administrative geography data. In this post I’ll go through what’s in the data, and give a few sample SPARQL queries.

I spoke a bit about the administrative geography data in a previous blog post – but the data has changed a bit since then. Just to re-cap the administrative geography linked data contains information about administrative and voting geographic regions. These include unitary authorities, counties, wards, constituencies, Welsh Assembly regions and a whole lot more [1]. Here are some examples:

If you want to find a full list of the sorts of thing you can find in the data, simply go to the query interface (or SPARQL endpoint as it is known) and try the following query:

select distinct ?type
where { ?a a ?type . }
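If you would rather send queries like this from a script than paste them into the query form, the standard SPARQL protocol lets you pass the query text in a "query" parameter on a GET request. A minimal Python sketch follows. The endpoint address is a placeholder (this post only links to the real one), and the helper name is made up.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder: substitute the address of the SPARQL endpoint you are using.
ENDPOINT = "http://example.org/sparql"

LIST_TYPES = "select distinct ?type where { ?a a ?type . }"


def sparql_get_url(endpoint, query):
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, LIST_TYPES)
print(url)

# Round trip: decoding the URL gives back the original query text.
print(parse_qs(urlparse(url).query)["query"][0])
```

Fetching that URL with an Accept header of application/sparql-results+json should return the results as JSON on endpoints that support it.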

Now that you have the list of all the types of thing in the data, you can ask for lists of instances of those types.

For example, the following query will return all of the unitary authorities:

select ?a
where {
  ?a a <http://data.ordnancesurvey.co.uk/ontology/admingeo/UnitaryAuthority> .
}

The names of all the regions have now been modelled using the SKOS vocabulary. If you want to find the official names of all the unitary authorities you can simply issue a query like:

select ?a ?name
where {
  ?a a <http://data.ordnancesurvey.co.uk/ontology/admingeo/UnitaryAuthority> .
  ?a <http://www.w3.org/2004/02/skos/core#prefLabel> ?name .
}

Also included in the data are two attributes called Unit ID and Area Code. These values are useful if you want to produce a mashup using this data and display it by boundary.

So for example, for Southampton (http://data.ordnancesurvey.co.uk/id/7000000000037256) the area code is UTA (for unitary authority) and the unit ID is 37256. These values can be used as follows:

/* Here we set up our variable called 'boundaryLayer' with the strategies that we require. In this case, it is its ID and type, i.e. Unitary Authority. */
boundaryLayer = new OpenSpace.Layer.Boundary("Boundaries", {
    strategies: [new OpenSpace.Strategy.BBOX()],
    admin_unit_ids: ["37256"],
    area_code: ["UTA"]
});

// then we add the boundary to the map
osMap.addLayer(boundaryLayer);

// this effectively refreshes the map, so that the boundary is visible
osMap.setCenter(osMap.getCenter());

to display the Southampton boundary using the OS OpenSpace API. See http://openspace.ordnancesurvey.co.uk/openspace/support.html for more details.
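In the Southampton example the unit ID (37256) is the trailing part of the number in the URI (7000000000037256). If that pattern holds in general, the unit ID can be derived from a region URI as in the Python sketch below. This is an assumption inferred from the example, not a documented rule, so the Unit ID attribute in the data remains the safer source. The function name is made up.

```python
ID_PREFIX = "http://data.ordnancesurvey.co.uk/id/"
# Assumption: administrative region IDs are 7000000000000000 + unit ID.
ID_OFFSET = 7000000000000000


def unit_id_from_uri(uri):
    """Return the unit ID (as a string) encoded in a region URI."""
    if not uri.startswith(ID_PREFIX):
        raise ValueError("not an Ordnance Survey id URI: " + uri)
    number = int(uri[len(ID_PREFIX):])
    if number < ID_OFFSET:
        raise ValueError("unexpected identifier: " + uri)
    return str(number - ID_OFFSET)


print(unit_id_from_uri("http://data.ordnancesurvey.co.uk/id/7000000000037256"))
```

The string it returns is in the form expected by the admin_unit_ids option in the OpenSpace snippet above.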

Arguably the most useful information in this data is the set of qualitative spatial relationships between different regions. Regions are related to the regions they contain, the regions they are within and the regions they touch. In the case of the touching relationship, only regions of the same type have an explicit touching relationship. The exceptions are unitary authorities, counties, districts and metropolitan districts, which also have touching relationships with each other. The following simple query will return a list of all counties, districts and unitary authorities that border the City of Southampton, along with their names:

PREFIX spatialrelations: <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/>
select ?a ?name
where {
  ?a spatialrelations:touches <http://data.ordnancesurvey.co.uk/id/7000000000037256> .
  ?a <http://www.w3.org/2004/02/skos/core#prefLabel> ?name .
}

If you are only interested in the bordering counties you can add an extra line to your query:

PREFIX spatialrelations: <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/>
select ?a ?name
where {
  ?a spatialrelations:touches <http://data.ordnancesurvey.co.uk/id/7000000000037256> .
  ?a <http://www.w3.org/2004/02/skos/core#prefLabel> ?name .
  ?a a <http://data.ordnancesurvey.co.uk/ontology/admingeo/County> .
}
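The two "touches" queries differ only by the optional type line, so they are easy to generate from one template. A Python sketch follows; the function name and its parameters are made up, while the URIs are the ones used in the queries above.

```python
SPATIAL = "http://data.ordnancesurvey.co.uk/ontology/spatialrelations/"
ADMINGEO = "http://data.ordnancesurvey.co.uk/ontology/admingeo/"
PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"


def touching_query(region_uri, region_type=None):
    """Build a query for the named regions that touch region_uri.

    region_type: optional admingeo class name such as "County"; when given,
    results are restricted to regions of that type.
    """
    lines = [
        f"PREFIX spatialrelations: <{SPATIAL}>",
        "select ?a ?name",
        "where {",
        f"  ?a spatialrelations:touches <{region_uri}> .",
        f"  ?a <{PREF_LABEL}> ?name .",
    ]
    if region_type:
        lines.append(f"  ?a a <{ADMINGEO}{region_type}> .")
    lines.append("}")
    return "\n".join(lines)


SOUTHAMPTON = "http://data.ordnancesurvey.co.uk/id/7000000000037256"
print(touching_query(SOUTHAMPTON, "County"))
```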

Similarly, the following query returns all the county electoral divisions (and their names) within Hampshire:

PREFIX spatialrelations: <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/>
select ?a ?name
where {
  ?a spatialrelations:within <http://data.ordnancesurvey.co.uk/id/7000000000017765> .
  ?a <http://www.w3.org/2004/02/skos/core#prefLabel> ?name .
  ?a a <http://data.ordnancesurvey.co.uk/ontology/admingeo/CountyElectoralDivision> .
}

For convenience some shortcuts have been added to the data in this release. For certain nesting geographies, such as the county – district – parish or district – ward nestings, various new properties have been added. For example, the property ‘countyElectoralDivision’ relates all counties to their constituent county electoral divisions. The above query can now be done in a simpler way:

PREFIX admingeo: <http://data.ordnancesurvey.co.uk/ontology/admingeo/>
select ?a ?name
where {
  <http://data.ordnancesurvey.co.uk/id/7000000000017765> admingeo:countyElectoralDivision ?a .
  ?a <http://www.w3.org/2004/02/skos/core#prefLabel> ?name .
}

Similar predicates such as ‘county‘, ‘district‘, ‘ward‘, ‘constituency‘ etc. provide similar shortcuts. For example, the following returns all the Westminster constituencies in South East England.

PREFIX admingeo: <http://data.ordnancesurvey.co.uk/ontology/admingeo/>
select ?a ?name
where {
  <http://data.ordnancesurvey.co.uk/id/7000000000041421> admingeo:westminsterConstituency ?a .
  ?a <http://www.w3.org/2004/02/skos/core#prefLabel> ?name .
}

The most significant addition to this data is the inclusion of postcode information. The data now contains information about postcode units, postcode sectors, postcode districts and postcode areas. For each postcode unit an easting/northing coordinate value is given [2] along with the district, ward and county (where applicable) that contains said postcode unit. An example of this can be seen for the Ordnance Survey postcode SO16 4GU. Each postcode is also related to its containing postcode area, sector and district.

The properties ‘ward‘, ‘district‘ and ‘county‘ relate a postcode to the relevant regions. The simple query:

PREFIX postcode: <http://data.ordnancesurvey.co.uk/ontology/postcode/>
select ?district
where {
  <http://data.ordnancesurvey.co.uk/id/postcodeunit/SO164GU> postcode:district ?district .
}

returns the unitary authority that contains the postcode SO16 4GU.

This query:

PREFIX spatialrelations: <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/>
select ?postcode
where {
  ?postcode spatialrelations:within <http://data.ordnancesurvey.co.uk/id/postcodearea/SO> .
}

returns all the postcodes in the SO postcode area.

We can combine the above two queries to find the areas, along with their names, covered by the postcode area SO:

PREFIX spatialrelations: <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/>
PREFIX postcode: <http://data.ordnancesurvey.co.uk/ontology/postcode/>
select distinct ?district ?name
where {
  ?postcode spatialrelations:within <http://data.ordnancesurvey.co.uk/id/postcodearea/SO> .
  ?postcode postcode:district ?district .
  ?district <http://www.w3.org/2004/02/skos/core#prefLabel> ?name .
}

Hopefully these few examples will give you enough information to fully explore this new release of the Ordnance Survey linked data. For those of you who don’t like SPARQL watch this space – hopefully we will soon(ish) have an API built on top of this data to allow for even easier access.

[1] you’ll notice the ‘isDefinedBy’ link currently returns a 404 – not for long I hope :)

[2] lat/long to follow

Some quick linked data hacks

June 16, 2010 22 comments

In previous posts I discussed the work I’d been doing on my family tree linked data. I decided it might be interesting to plot places of birth for my ancestors on a map to get a true idea of where they all came from. The result, a faceted browser that lets me filter based on family name or birth place, can be seen here. This mashup was very easy to achieve using linked data and a tool called Exhibit. To quote: “Exhibit lets you easily create web pages with advanced text search and filtering functionalities, with interactive maps, timelines, and other visualizations…”.

As I explained in a previous post the places of birth for family members were recorded in my family tree linked data by linking to place resources in DBpedia, for example: http://www.johngoodwin.me.uk/family/event1917. In order to perform the mashup I need lat/long values for each place of birth. One option might have been to do some kind of geo-coding on the place name using an API. However, I didn’t relish the world of pain I’d get from retrieving data in some arbitrary XML format or the issues with ambiguities in place names. The easiest way to get that information was to enrich my family tree data by consuming the linked data I’d connected to. This is how I did it…

First I ran a simple SPARQL query to find all the places referenced:

select distinct ?place
where { ?a <http://purl.org/NET/c4dm/event.owl#place> ?place . }

(match on all triples of the form ?a <http://purl.org/NET/c4dm/event.owl#place> ?place, and then return all distinct values of ?place).

The results are URIs of the form http://dbpedia.org/resource/Luton. I then used cURL (a command line tool for transferring data with URL syntax) to retrieve the RDF/XML behind each of the URIs:

curl -H "Accept: application/rdf+xml" http://dbpedia.org/resource/Luton

This basically says give me back RDF/XML for the resource http://dbpedia.org/resource/Luton. It was then easy to insert this RDF/XML into my triplestore (RDF database). I can do this because my family tree data was in linked data format (RDF) and linked to existing resources that are also in RDF – so there was no problem with integrating data in different schemas/formats.
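The same request can be made from a script when there are many place URIs to fetch. Below is a rough Python equivalent of the cURL command using only the standard library. The function names are made up, and whether the server follows redirects or returns RDF/XML exactly as it did for cURL is something to check rather than assume.

```python
from urllib.request import Request, urlopen


def rdf_request(uri):
    """Build a request that asks for RDF/XML via the Accept header."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})


def fetch_rdf(uri):
    """Fetch the RDF/XML behind a URI and return it as bytes.

    This performs a network call, so it is defined here but not run.
    """
    with urlopen(rdf_request(uri)) as response:
        return response.read()


request = rdf_request("http://dbpedia.org/resource/Luton")
print(request.full_url, request.get_header("Accept"))
```

Looping fetch_rdf over the results of the place query and loading each response into the triplestore reproduces the enrichment step described above.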

Now all I had to do was retrieve the information I needed to do the mashup. This was done using a SPARQL query:

select ?a ?name ?familyname ?birthdate ?birthplacename ?latlong
where {
  ...
  FILTER langMatches( lang(?birthplace), "EN" )
}
ORDER BY ?birthdate

Given that Exhibit works really well with JSON I opted to return the results to the query in that format (SPARQL queries are typically returned as XML or JSON). It was then a simple matter of making the resultant JSON into a suitable form that Exhibit can process.
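As a rough idea of that last step, here is a Python sketch that reshapes standard SPARQL JSON results into the kind of structure Exhibit reads, namely an "items" list whose entries carry a "label" and a "type". The function name is made up, the sample row is invented rather than taken from my family tree, and the exact properties your Exhibit page needs may differ.

```python
import json


def sparql_json_to_exhibit(results, label_var, item_type):
    """Convert SPARQL JSON results into an Exhibit-style {"items": [...]} dict.

    Each result row becomes one item; the variable named by label_var is
    copied into the item's "label" field.
    """
    items = []
    for binding in results["results"]["bindings"]:
        item = {var: cell["value"] for var, cell in binding.items()}
        item["label"] = item.get(label_var, "")
        item["type"] = item_type
        items.append(item)
    return {"items": items}


# Invented sample in the standard SPARQL JSON results layout.
sample = {
    "head": {"vars": ["name", "birthplacename", "latlong"]},
    "results": {
        "bindings": [
            {
                "name": {"type": "literal", "value": "Example Person"},
                "birthplacename": {"type": "literal", "value": "Luton"},
                "latlong": {"type": "literal", "value": "51.88,-0.42"},
            }
        ]
    },
}

exhibit = sparql_json_to_exhibit(sample, label_var="name", item_type="Person")
print(json.dumps(exhibit, indent=2))
```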

I did another simple mashup using the BBC linked data here. This followed a similar process, except that the BBC had already enhanced their data by following links to DBpedia. This BBC mashup basically lets you find episodes of brands of radio show that play your favourite artists/genres. The BBC data contains links between artists and radio shows. There are ‘sameAs’ links from the BBC artist data to DBpedia. It is DBpedia that then provides the connection between artists and their genre(s).

Hopefully this shows the power of linked data in a simple way. There is a simple pattern to follow…

1) Make data, and make that data available in RDF. People can then link to you, and you can link to other people who have data in RDF. So I made family tree data in RDF, the BBC made music/programme data in RDF.

2) Link to linked data resources on the web (in this case we both linked to DBpedia).

3) Enhance your data by consuming the data behind those links – this is trivial because they are both in the linked data format RDF.

4) Make something cool/useful :)

In fact it will be even easier to build useful services once the linked data API is in use, as this will bypass the need for SPARQL in many cases. As more and more people provide linked data we will have an easy way to provide services built on top of combined data sources, and the linked data API will make it web 2.0 friendly for those (understandably?) put off by SPARQL.

Genealogy and the Semantic Web 2

April 18, 2009 2 comments

I’ve been busy converting my parents’ hard work on their family tree into RDF. I blogged about initial attempts here. It’s far from finished, but at around 500,000 triples already it looks like it’s going to be a lot of RDF!

You can view the RDF (as it is) here, but seeing as RDF is for machines a more human friendly version can be browsed here. So far I’ve been concentrating on linking places of death and birth to various other datasets including geonames, DBpedia, Freebase and Ordnance Survey (though there are still a fair few places to link).

To be done:

1) Finish connecting all the places.

2) Sort date formats out.

3) Turn into linked data with dereferenceable URIs and content negotiation.

A more detailed write up when it’s all finished…


SPARQL your way to a Stupor

February 15, 2009 4 comments

A few years back two intrepid explorers set out to survey all of the pubs in Southampton. Their adventure is documented here.

A while back I decided to turn their page into linked data and you can see the result here. Mapping is provided by OS OpenSpace as the cartography is far nicer than that of Google or Yahoo! (IMHO of course :)). As of today the site is linked up to Revyu (in both the RDF and HTML) where applicable. You can also now browse the RDF using the OpenLink Data Explorer, Zitgist or Tabulator. No SPARQL endpoint as yet, but maybe one day.


Web 3.0 and Social Networks

January 25, 2009 7 comments

It is probably fair to say that FOAF is where the social web meets the semantic web. FOAF, which has been around for a while now, basically creates a machine readable graph of the sort of information you might include on sites like facebook, myspace etc. Your FOAF file can include links to people you know, your interests and other personal information. It is probably also fair to say that FOAF files were, until now, the sole property of the geek. However, this has changed, and a number of social networking sites such as livejournal, identi.ca and friend feed build FOAF files from your profile information (are there others?). At least now you don’t need to know how to edit RDF in order to have your own FOAF file. Despite that, these profiles are limited by the features offered on the respective sites.

Recently though QDOS launched a new service that makes FOAF profiles extremely easy to build. This service allows users to create a FOAF profile generated from information contained in their last.fm, livejournal and flickr profiles, as well as importing existing FOAF files. You are then given the option to manually enter other information. Furthermore, you can create a public and private view of your FOAF file. I would not recommend including information like your address, phone number or date of birth in a public FOAF file. So what are you waiting for – go build yourself a FOAF file and join the linked data web. My FOAF profile can be found here (my original one is maintained here).

For any linked data geeks one other interesting thing about the QDOS FOAF builder is that it has started linking music data from last.fm to the new music linked data service from the BBC. Hopefully this will be just the beginning and we’ll see links to other linked data services from DBpedia, geonames and Ordnance Survey.


LODr

October 24, 2008 1 comment

LODr is a new semantic web application that allows you to convert your web 2.0 tags from various sites like Flickr to semantically enriched web 3.0 URIs. For example, say you took a photo of a place in Southampton, uploaded it to your Flickr account and gave it a tag “southampton”. LODr lets you connect the tag “southampton” to a URI on the semantic web that represents the entity Southampton. In this case I chose to link up to the Southampton represented in the Ordnance Survey administrative geography ontology. Other tags can be linked to URIs in DBpedia, geonames or if music is your thing the new BBC music beta (semantic) web pages.

I guess initially this will appeal mainly to semantic web geeks, but it will be interesting to see what sort of mash-ups this generates as more and more tags are connected to URIs.

My LODr page can be found here.

For more information on LODr see here.
