
Some quick linked data hacks


In previous posts I discussed the work I’d been doing on my family tree linked data. I decided it might be interesting to plot places of birth for my ancestors on a map to get a true idea of where they all came from. The result, a faceted browser that lets me filter based on family name or birth place, can be seen here. This mashup was very easy to achieve using linked data and a tool called Exhibit. To quote: “Exhibit lets you easily create web pages with advanced text search and filtering functionalities, with interactive maps, timelines, and other visualizations…”.

As I explained in a previous post, the places of birth for family members were recorded in my family tree linked data by linking to place resources in DBpedia, for example: http://www.johngoodwin.me.uk/family/event1917. In order to perform the mashup I needed lat/long values for each place of birth. One option might have been to do some kind of geo-coding on the place names using an API. However, I didn’t relish the world of pain I’d get from retrieving data in some arbitrary XML format, or the issues with ambiguities in place names. The easiest way to get that information was to enrich my family tree data by consuming the linked data I’d connected to. This is how I did it…

First I ran a simple SPARQL query to find all the places referenced:

select distinct ?place
where { ?a <http://purl.org/NET/c4dm/event.owl#place> ?place . }

(this matches all triples of the form ?a <http://purl.org/NET/c4dm/event.owl#place> ?place, and then returns all distinct values of ?place).
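A query like this can also be fired at a SPARQL endpoint over HTTP, since the SPARQL protocol just passes the query as a URL parameter. A quick Python sketch (the endpoint URL below is a placeholder, not my actual store):

```python
import urllib.parse
import urllib.request

PLACES_QUERY = """select distinct ?place
where { ?a <http://purl.org/NET/c4dm/event.owl#place> ?place . }"""

def sparql_request(endpoint, query):
    # SPARQL protocol: the query travels in the "query" URL parameter;
    # the Accept header asks for JSON results via content negotiation.
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"})

# To actually run it (network call):
# results = urllib.request.urlopen(
#     sparql_request("http://localhost:8890/sparql", PLACES_QUERY)).read()
```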

The results are URIs of the form http://dbpedia.org/resource/Luton. I then used curl (a command line tool for transferring data with URL syntax) to retrieve the RDF/XML behind each of the URIs:

curl -H "Accept: application/rdf+xml" http://dbpedia.org/resource/Luton

This basically says: give me back RDF/XML for the resource http://dbpedia.org/resource/Luton. It was then easy to insert this RDF/XML into my triplestore (RDF database). I could do this because my family tree data was in linked data format (RDF) and linked to existing resources that are also in RDF – so there was no problem integrating data in different schemas/formats.
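The curl step is easy to script. Here is a minimal Python equivalent; the loop at the end is illustrative – `places` and `store` stand in for your query results and whatever loading API your triplestore provides:

```python
import urllib.request

def rdf_request(uri):
    """Build a request that asks for RDF/XML via content negotiation,
    mirroring: curl -H "Accept: application/rdf+xml" <uri>"""
    return urllib.request.Request(
        uri, headers={"Accept": "application/rdf+xml"})

def fetch_rdf(uri):
    """Dereference a linked data URI and return the RDF/XML bytes."""
    with urllib.request.urlopen(rdf_request(uri)) as resp:  # network call
        return resp.read()

# for place in places:              # place URIs from the first query
#     store.load(fetch_rdf(place))  # 'store' = your triplestore's API
```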

Now all I had to do was retrieve the information needed for the mashup. This was done with a SPARQL query along the following lines (the exact predicates depend on the vocabularies your data uses; these are representative):

select ?a ?name ?familyname ?birthdate ?birthplacename ?latlong
where {
?a <http://xmlns.com/foaf/0.1/name> ?name ;
   <http://xmlns.com/foaf/0.1/surname> ?familyname ;
   <http://purl.org/vocab/bio/0.1/event> ?birth .
?birth <http://purl.org/vocab/bio/0.1/date> ?birthdate ;
       <http://purl.org/NET/c4dm/event.owl#place> ?birthplace .
?birthplace <http://www.w3.org/2000/01/rdf-schema#label> ?birthplacename ;
            <http://www.georss.org/georss/point> ?latlong .
FILTER langMatches( lang(?birthplacename), "EN" )
}
ORDER BY ?birthdate

Given that Exhibit works really well with JSON, I opted to have the query results returned in that format (SPARQL results are typically returned as XML or JSON). It was then a simple matter of massaging the resultant JSON into a form that Exhibit can process.
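The reshaping is mechanical: SPARQL JSON results come back as one binding object per row, whereas Exhibit wants a flat {"items": [...]} list. A sketch, assuming the variable names from my query and illustrative Exhibit property names:

```python
def sparql_json_to_exhibit(sparql_json):
    """Flatten SPARQL JSON results into Exhibit's {"items": [...]} shape.
    The item property names (label, latLng, ...) are illustrative --
    they just need to match what the Exhibit page is configured to show."""
    items = []
    for row in sparql_json["results"]["bindings"]:
        items.append({
            "label": row["name"]["value"],
            "familyName": row["familyname"]["value"],
            "birthDate": row["birthdate"]["value"],
            "birthPlace": row["birthplacename"]["value"],
            # a georss:point literal is "lat long"; Exhibit's map view
            # expects "lat,long"
            "latLng": row["latlong"]["value"].replace(" ", ","),
        })
    return {"items": items}
```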

I did another simple mashup using the BBC linked data here. This followed a similar process, except that the BBC had already enhanced their data by following links to DBpedia. This BBC mashup basically lets you find episodes of radio shows (brands) that play your favourite artists/genres. The BBC data contains links between artists and radio shows. There are ‘sameAs’ links from the BBC artist data to DBpedia. It is DBpedia that then provides the connection between artists and their genre(s).

Hopefully this shows the power of linked data in a simple way. There is a simple pattern to follow…

1) Make data, and make that data available in RDF. People can then link to you, and you can link to other people who have data in RDF. So I made family tree data in RDF, the BBC made music/programme data in RDF.

2) Link to linked data resources on the web (in this case we both linked to DBpedia).

3) Enhance your data by consuming the data behind those links – this is trivial because they are both in the linked data format RDF.

4) Make something cool/useful :)

In fact it will become even easier to build useful services when the linked data API is in use, as this will bypass the need for SPARQL in many cases. As more and more people provide linked data we will have an easy way to build services on top of combined data sources, and the linked data API will make it web 2.0 friendly for those (understandably?) put off by SPARQL.

  1. June 17, 2010 at 3:15 pm

    John. Quick question. If I understand this correctly, the data that you end up using in the exhibits is all “canned”, in that you pull all the stuff into your own store, augment with additional data from, e.g. dbpedia or bbc music. Then you suck it all out and display in an exhibit.

    Is that right?

    • john225
      June 17, 2010 at 3:25 pm

      That’s pretty much it! I wanted to do something quick and easy. For the family tree stuff it really was:

      Make data, link to DBpedia, pull in info from DBpedia to my store, SPARQL to get JSON, tidy JSON a bit and then put into Exhibit. I guess it would be possible to pass the JSON from the SPARQL queries straight into Exhibit.

  2. June 17, 2010 at 4:20 pm

    Supplementary (and possibly unfair) question then :-). Is this then truly a linked data app?

    • john225
      June 17, 2010 at 4:44 pm

      Hmmm – well it’s certainly an app built from linked data :) Linked data was published, links were made, data was consumed via links…and there was some SPARQL.

      So my question to you…what does it take to qualify to be a linked data application? :)

  3. June 17, 2010 at 4:51 pm

    Well, as I said, it’s probably an unfair question! I have in my mind something that is in some ways more *dynamic* than the application you have. But I’m not sure I can necessarily give a definition of what a linked data app is. I’ll know one when I see one!

    Note that this isn’t intended as a criticism of what you’ve done — I think it’s really nice, and the reason I was thinking about these questions is because I want to do something similar (for my scuba diving logs).

    • June 17, 2010 at 5:16 pm

      IMHO yes, of course it is – there is a curious notion that Linked Data Applications will somehow handle any and all data from anywhere on the web; that’s not the case. Applications are built with a specific task to do, working over a specific set or type of data – and that’s what your application does.

      The difference between a linked data app and any ‘normal’ app, is that a linked data app uses the RDFBus.

      Best,

      Nathan

      • john225
        June 17, 2010 at 5:52 pm

        thanks for your comment Nathan. Sean’s question was interesting. I would like my app to be a bit more dynamic…but I made the most of the tools at my disposal.

    • john225
      June 17, 2010 at 5:51 pm

      It was an interesting question. I did do another mashup that is arguably more dynamic:

      http://www.johngoodwin.me.uk/boundaries/meshup.html

      This queries numerous SPARQL endpoints and brings it all together.

      I’d really like to see functionality in triplestores that lets you pull in data from referenced URIs and then maybe check for updates periodically. I guess my app was a cached linked data app?

      • June 18, 2010 at 8:11 am

        I like “cached linked data app”. It captures just what I was thinking.

  4. Nathan
    June 17, 2010 at 5:59 pm

    ‘I’d really like to see functionality in triplestores that lets you pull in data from referenced URIs and then maybe check for updates periodically.’

    exactly! sparql on the client side (or edge), rdf delivered via HTTP, the triplestore is HTTP-caching aware so it tests whether its cached version is up to date on query, and goes from there. if you look at any of timbl’s diagrams you always see sparql on the edge, not on the server side..

    • john225
      June 17, 2010 at 6:05 pm

      so basically my app did that – except for the update part. I retrieved the RDF via curl, and then put that RDF into my triplestore. I did that via scripts, but built-in functionality (and a bit cleverer) would be great.

  5. June 17, 2010 at 8:10 pm

    Thanks very much for this post! I think there are far too few examples like John’s in the wild…

    On the question of whether John’s app is truly a “linked data application,” I think it falls on the Linked Data Application Spectrum. It has many of the desired attributes, such as using data from a variety of linked data sources, but as you’ve pointed out some of these sources are static. But guess what: a large number of the Data.gov datasets (for example) are static, too! Not only that, but many of those datasets have been converted from their “raw” form.

    Regardless of whether it uses dynamically sourced data, I would refer to John’s as a Linked Data Microapp, which is the name I’m giving to such “lightweight” but useful applications…

    • john225
      June 17, 2010 at 8:23 pm

      Thanks for the feedback John. I’d really like to see some more linked data clients that perhaps allow the sort of thing I did here with less static data.

  6. June 18, 2010 at 12:44 pm

    Our definition of Linked Data Application (from the Consuming Linked Data tutorial):

    Software system that makes use of data on the web from multiple data sets (up to now, this is a “web 2.0″ app) AND that benefits from links between the datasets

    http://www.slideshare.net/juansequeda/06-linkeddataapplications

    In the slides you can find characteristics of a Linked Data app. We generated these after our experience of the Linked Data-a-thon at ISWC2009

    • November 8, 2010 at 2:43 pm

      Juan,

      A Linked Data aware Application cannot be one that consumes data solely from the Web! A Linked Data aware application is one that’s capable of processing hypermedia based structured data over networks.

      Note:
      1. WWW == Hypermedia based Structured Docs + URIs (Internet i.e. TCP/IP over WAN)
      2. GGG == Hypermedia based Structured Data + URIs (Internet i.e. TCP/IP over WAN).

      Intranets and Extranets put URIs to use within LANs or LAN/WAN hybrids.

      What would you call a native application that consumed hypermedia based abstract entities (Customers, Suppliers, Employees, Orders, Invoices, Products, Competitors, Countries etc..) within an enterprise setting?
      We have to stop confining the concept of Linked Data to the Web or RDF, ASAP.

  7. June 19, 2010 at 8:37 pm

    Nice hacks! This is really great.

  8. November 8, 2010 at 2:37 pm

    A Linked Data aware Application cannot be one that consumes data solely from the Web! A Linked Data aware application is one that’s capable of processing network based hypermedia based structured data.

    What would you call a native application that consumed hypermedia based abstract entities (Customers, Suppliers, Employees, Orders, Invoices, Products, Competitors, Countries etc..) within an enterprise setting?

    We have to stop confining the concept of Linked Data to the Web or RDF, ASAP.

  9. November 8, 2010 at 2:41 pm

    Little tweak:

    A Linked Data aware Application cannot be one that consumes data solely from the Web! A Linked Data aware application is one that’s capable of processing hypermedia based structured data over networks.

    Note:

    1. WWW == Hypermedia based Structured Docs + URIs (Internet i.e. TCP/IP over WAN)
    2. GGG == Hypermedia based Structured Data + URIs (Internet i.e. TCP/IP over WAN)

    Intranets and Extranets put URIs to use within LANs or LAN/WAN hybrids.

    What would you call a native application that consumed hypermedia based abstract entities (Customers, Suppliers, Employees, Orders, Invoices, Products, Competitors, Countries etc..) within an enterprise setting?

    We have to stop confining the concept of Linked Data to the Web or RDF, ASAP.

