All roads lead to? Experiments with Gephi, Linked Data and Wikipedia

March 26, 2014

Gephi is “an interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs”. Tony Hirst did a great blog post a while back showing how you could use Gephi together with DBpedia (a linked data version of Wikipedia) to map an influence network in the world of philosophy. Gephi offers a semantic web plugin which allows you to work with the web of linked data. I recommend you read Tony’s blog to get started with using that plugin with Gephi. I was interested to experiment with this plugin, and to look at what sort of geospatial visualisations could be possible.

If you want to follow all the steps in this post you will need to:

Initially I was interested to see if there were any interesting networks we might visualise between places. Seeing how Wikipedia relates one place to another was a simple case of going to the DBpedia SPARQL endpoint and trying the following query:

select distinct ?p
where
{
?s a <http://schema.org/Place> .
?o a <http://schema.org/Place> .
?s ?p ?o .
}

In other words: where s and o are places, find the properties p that relate them. I noticed two properties, ‘http://dbpedia.org/ontology/routeStart’ and ‘http://dbpedia.org/ontology/routeEnd’, so I thought I would try to visualise how places around the world are linked by transport connections. To find places connected by a transport link, you want pairs ‘start’ and ‘end’ that are the route start and route end, respectively, of some transport link. You can do this with the following query:

select ?start ?end
where
{
?start a <http://schema.org/Place> .
?end a <http://schema.org/Place> .
?link <http://dbpedia.org/ontology/routeStart> ?start .
?link <http://dbpedia.org/ontology/routeEnd> ?end .
}

This gives a lot of data so I thought I would restrict the links to be only road links:

select ?start ?end
where
{
?start a <http://schema.org/Place> .
?end a <http://schema.org/Place> .
?link <http://dbpedia.org/ontology/routeStart> ?start .
?link <http://dbpedia.org/ontology/routeEnd> ?end .
?link a <http://dbpedia.org/ontology/Road> .
}
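The SELECT queries above can also be sent to the endpoint from a script rather than pasted into the web form. The following is a minimal sketch, assuming the endpoint accepts the standard SPARQL protocol `query` parameter over HTTP GET. The helper name `build_sparql_request` is made up for illustration, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named in this post

ROAD_QUERY = """
select ?start ?end
where {
  ?start a <http://schema.org/Place> .
  ?end a <http://schema.org/Place> .
  ?link <http://dbpedia.org/ontology/routeStart> ?start .
  ?link <http://dbpedia.org/ontology/routeEnd> ?end .
  ?link a <http://dbpedia.org/ontology/Road> .
}
"""


def build_sparql_request(endpoint, query):
    """Build (but do not send) a GET request for a SPARQL query.

    Hypothetical helper: the name and shape are assumptions for this sketch.
    """
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(DBPEDIA_ENDPOINT, ROAD_QUERY)
# Passing `request` to urllib.request.urlopen would actually run the query.
```

Keeping the query in one string makes it easy to swap in the broader transport-link query by dropping the final `Road` pattern.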

We are now ready to visualise this transport network in Gephi. Follow the steps in Tony’s blog to bring up the Semantic Web Importer. In the ‘driver’ tab make sure ‘Remote – REST endpoint’ is selected, and the EndPoint URL is http://dbpedia.org/sparql. As in Tony’s blog, we need to construct our graph so we can visualise it. To simply view the connections between places it would be enough to just add this query to the ‘Query’ tab:

construct {?start <http://foo.com/connectedTo> ?end}
where
{
?start a <http://schema.org/Place> .
?end a <http://schema.org/Place> .
?link <http://dbpedia.org/ontology/routeStart> ?start .
?link <http://dbpedia.org/ontology/routeEnd> ?end .
?link a <http://dbpedia.org/ontology/Road> .
}

However, as we want to visualise this in a geospatial context we need the lat and long of the start and end points so our construct query becomes a bit more complicated:

prefix gephi:<http://gephi.org/>
construct {
?start gephi:label ?labelstart .
?end gephi:label ?labelend .
?start gephi:lat ?minlat .
?start gephi:long ?minlong .
?end gephi:lat ?minlat2 .
?end gephi:long ?minlong2 .
?start <http://foo.com/connectedTo> ?end}
where
{
?start a <http://schema.org/Place> .
?end a <http://schema.org/Place> .
?link <http://dbpedia.org/ontology/routeStart> ?start .
?link <http://dbpedia.org/ontology/routeEnd> ?end .
?link a <http://dbpedia.org/ontology/Road> .
{select ?start (MIN(?lat) AS ?minlat) (MIN(?long) AS ?minlong) where {?start <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat . ?start <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .} group by ?start }
{select ?end (MIN(?lat2) AS ?minlat2) (MIN(?long2) AS ?minlong2) where {?end <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat2 . ?end <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long2 .} group by ?end }
?start <http://www.w3.org/2000/01/rdf-schema#label> ?labelstart .
?end <http://www.w3.org/2000/01/rdf-schema#label> ?labelend .
FILTER (lang(?labelstart) = 'en')
FILTER (lang(?labelend) = 'en')
}

Note that the query for the lat and long is a bit more complicated than it might be. This is because DBpedia data is quite messy, and many entities have more than one lat/long pair. I used a subquery in SPARQL to pull out the minimum lat and the minimum long across all the pairs retrieved for each place. I also retrieved the English labels for each of the start/end points.
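To make the effect of the MIN subqueries concrete, here is a small sketch of the same reduction done in Python. The rows are made up for illustration (they are not real DBpedia results), and `min_coordinates` is a hypothetical helper name.

```python
from collections import defaultdict

# Made-up rows standing in for (entity, lat, long) results; not real DBpedia data.
rows = [
    ("ex:PlaceA", 51.50, -0.12),
    ("ex:PlaceA", 51.51, -0.13),
    ("ex:PlaceB", 48.85, 2.35),
]


def min_coordinates(rows):
    """Keep one lat and one long per entity by taking the minimum of each."""
    lats, longs = defaultdict(list), defaultdict(list)
    for entity, lat, long_ in rows:
        lats[entity].append(lat)
        longs[entity].append(long_)
    # Like MIN(?lat) and MIN(?long) per entity: the two minima are taken
    # independently, so they need not come from the same original pair.
    return {entity: (min(lats[entity]), min(longs[entity])) for entity in lats}


coords = min_coordinates(rows)
```

For `ex:PlaceA` the result mixes the lat of the first row with the long of the second. For near-duplicate coordinates this makes little difference on a map, but it is worth knowing that the point is not guaranteed to be one of the original pairs.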

Now copy/paste this construct query into the ‘Query’ tab on the Semantic Web Importer:

[Screenshot: the construct query in the ‘Query’ tab of the Semantic Web Importer]

Now hit the run button and watch the data load.

To visualise the data we need to do a bit more work. In Gephi click on the ‘Data Laboratory’ and you should now see your data table. Unfortunately all of the lats and longs have been imported as strings and we need to recast them as decimals. To do this click on the ‘More actions’ pull down menu, look for ‘Recast column’ and click it. In the ‘Recast manipulator’ window go to ‘column’ and select ‘lat(Node Table)’ from the pull down menu. Under ‘Convert to’ select ‘Double’ and click recast. Do the same for ‘long’.

[Screenshot: the ‘Recast manipulator’ window]

When you are done click ‘ok’ and return to the ‘overview’ tab in Gephi. To see this data geospatially go to the layout panel and select ‘Geo Layout’. Change the latitude and longitude to your new recast variable names, and untick ‘center’ (my graph kept vanishing with it selected). Experiment with the scale value:

[Screenshot: the ‘Geo Layout’ settings]

You should now see something like this in your display panel:

[Screenshot: the road network laid out by latitude and longitude]
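To give a feel for what a geographic layout and its scale value do, here is a rough sketch of projecting lat/long onto flat x/y coordinates. This is not Gephi's code: the choice of a Mercator projection, the function name `mercator_xy` and the default scale are all assumptions made for illustration.

```python
import math


def mercator_xy(lat, long_, scale=1000.0):
    """Project degrees of latitude/longitude to flat x/y (Mercator projection)."""
    x = scale * math.radians(long_)
    y = scale * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# A point on the equator and prime meridian lands (numerically) at the origin.
origin = mercator_xy(0.0, 0.0)

# Doubling the scale doubles both coordinates, spreading the nodes further apart.
small = mercator_xy(57.0, -4.0, scale=1000.0)
large = mercator_xy(57.0, -4.0, scale=2000.0)
```

This also shows why the lats and longs have to be numbers rather than strings before a layout like this can use them.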

Given that this is supposed to be a road network you will find some oddities. This seems to be down to ‘European routes’ such as European route E15, which links Scotland to Spain.
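One way to spot such oddities is to flag links whose endpoints are very far apart. The sketch below uses the haversine great-circle distance. The link names, the rough coordinates and the 1000 km threshold are all illustrative assumptions, not values taken from DBpedia.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, long1, lat2, long2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(long2 - long1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def long_links(links, threshold_km=1000.0):
    """Return the names of links whose endpoints are further apart than the threshold."""
    return [
        name
        for name, (lat1, long1), (lat2, long2) in links
        if haversine_km(lat1, long1, lat2, long2) > threshold_km
    ]


# Rough, illustrative coordinates: northern Scotland to southern Spain,
# and two points in southern England.
links = [
    ("long link", (57.5, -4.2), (36.1, -5.5)),
    ("short link", (51.5, -0.1), (51.75, -1.25)),
]
flagged = long_links(links)
```

Links flagged this way can then be filtered out or inspected by hand before laying out the graph.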

A Crude BBC Places Linked Data mashup

January 20, 2011

Last night I did some more experimenting with the Python rdflib library. This time I did a crude (it’s not that pretty or polished yet) mashup of some of the BBC linked data and DBpedia linked data.

The Beeb have been in the linked data business for a while and their initial efforts were around programmes and music (but you should also check out the great linked data powered wildlife finder).

Recently they’ve started to experiment with tagging their programmes with relevant people, places and organisations.

I decided it might be quite nice to have a simple mashup showing TV and radio shows about different places. To this end I did a quick linked data mashup to produce some KML showing this information.

To do this I again used Python’s rdflib. Here it was a simple case of following links from a place to a TV/radio programme and loading the RDF into a graph. It was then a case of executing a simple SPARQL query over this graph to get a KML file containing programme details and a lat/long coordinate for plotting it on a map. The BBC place data did not contain lat/long for all the places, but luckily they did include a ‘sameAs’ to the place information in DBpedia. Here all we had to do was follow the ‘sameAs’ link and load in the DBpedia data.
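The last step, turning query results into KML, can be sketched with the standard library alone. This is not the original source code: `build_kml` is a hypothetical helper, and the single row is made up to stand in for the programme details and coordinates that the SPARQL query would return.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(placemarks):
    """Return a KML string with one Placemark per (name, description, lat, long)."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, description, lat, long_ in placemarks:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = description
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{long_},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Made-up example row; in the mashup these values would come from the SPARQL results.
kml_text = build_kml([("Example place", "Example programme title", 51.5, -0.1)])
```

The longitude-before-latitude ordering is the detail most likely to trip you up: swap the two and every placemark ends up in the wrong part of the world.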

I explained how to use rdflib to do this sort of thing in my last post, but meanwhile here is the source code and here is the KML. The KML can be used with a mapping API of your choice, but for a quick view drop the KML URL into the search box on Google Maps or view it in Google Earth.

At the moment this is a bit clunky, but it’s just a start…
