Home > linked data, Semantic Web > Federating SPARQL Queries Across Government Linked Data

Federating SPARQL Queries Across Government Linked Data


SPARQL 1.1 introduces the idea of federated SPARQL queries – this enables you to execute part of your SPARQL query against a remote SPARQL endpoint. I thought I’d provide some examples of using this feature in government linked open data.

The Environment Agency has published a number of its open data offerings as linked data which you can explore here. One of these datasets is the Bathing Water Quality Data, and you can explore this via their SPARQL endpoint. I won’t go into this data in too much detail as it is not my area of expertise. The Environment Agency has created 5-star open data by linking their data to both Ordnance Survey and Office of National Statistics linked data. Look at linked data for the Eastoke bathing water site and you’ll see it linked to Havant and Hampshire in the Ordnance Survey data. A relatively straight forward SPARQL query will get you  a list of bathing waters, their name and the district they are in:

select ?x ?name ?district
where {
?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://statistics.data.gov.uk/def/administrative-geography/district> ?district .}

Now suppose we just want a list of bathing water areas in South East England – how would we do that? This is where SPARQL federation comes in. The information about which European Regions districts are in is held in the Ordnance Survey linked data. If you hop over the the Ordnance Survey SPARQL endpoint explorer you can run the following query to find all districts in South East England along with their names (please see a previous blog post for information about simple spatial queries):

select ?district ?districtname
where
{?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>+
   <http://data.ordnancesurvey.co.uk/id/7000000000041421> .
  ?district <http://www.w3.org/2000/01/rdf-schema#label> ?districtname .}

Using the SERVICE keyword we can bring these two queries together to find all bathing waters in South East England, and the districts they are in:

select ?x ?name ?districtname
where {
?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://statistics.data.gov.uk/def/administrative-geography/district> ?district .
SERVICE <http://data.ordnancesurvey.co.uk/datasets/boundary-line/apis/sparql>
{ ?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>+
   <http://data.ordnancesurvey.co.uk/id/7000000000041421> .
   ?district <http://www.w3.org/2000/01/rdf-schema#label> ?districtname .}
}
order by ?districtname

Now supposed we want to know the sediment types of the bathing waters in Havant. We can find this with the following query:

select ?x ?name ?sediment
where {
?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://statistics.data.gov.uk/def/administrative-geography/district> <http://data.ordnancesurvey.co.uk/id/7000000000017297> .
?x <http://environment.data.gov.uk/def/bathing-water/sedimentTypesPresent> ?sediment .
}

We can again use the SPARQL federation to do something more interesting. The follow query returns both sediment types in bathing waters in Havant together with sediment types of bathing water in regions that touch Havant:

select ?x ?name ?sediment
where {
{
?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://statistics.data.gov.uk/def/administrative-geography/district> <http://data.ordnancesurvey.co.uk/id/7000000000017297> .
?x <http://environment.data.gov.uk/def/bathing-water/sedimentTypesPresent> ?sediment .
}
UNION
{
SERVICE <http://data.ordnancesurvey.co.uk/datasets/boundary-line/apis/sparql>
{ ?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches>
   <http://data.ordnancesurvey.co.uk/id/7000000000017297> .
}
?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://statistics.data.gov.uk/def/administrative-geography/district> ?district .
?x <http://environment.data.gov.uk/def/bathing-water/sedimentTypesPresent> ?sediment .
}
}

Another great government open data resource is the Open Data Communities site. They have a SPARQL endpoint here. This federated SPARQL query (analogous to those above) can be used, for example, to find the Index of Multiple Deprivation Environment rank for Havant and surrounding districts. This works are follows:

select ?s ?imdrank
where
{
{
?s <http://purl.org/linked-data/sdmx/2009/dimension#refArea> <http://data.ordnancesurvey.co.uk/id/7000000000017297> .
?s <http://opendatacommunities.org/def/IMD#IMD-environment-rank> ?imdrank .
}
UNION
{
SERVICE <http://data.ordnancesurvey.co.uk/datasets/boundary-line/apis/sparql>
{ ?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches>
<http://data.ordnancesurvey.co.uk/id/7000000000017297> .
}
?s <http://purl.org/linked-data/sdmx/2009/dimension#refArea> ?district .
?s <http://opendatacommunities.org/def/IMD#IMD-environment-rank> ?imdrank .
}
}

I will now leave it as an exercise to the reader to figure out how these all combine so you can ask for ‘all bathing waters in Havant and surrounding areas, and the IMD environment ranks of the areas containing those bathing waters’ – it is possible!

Please note that federated SPARQL can be slow…happy SPARQLing.

Advertisement
Categories: linked data, Semantic Web

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: