
Posts Tagged ‘Talis Platform’

Genealogy and Linked Data (revisited)

November 4, 2009

I now have a new, improved version of my family tree up as linked data here. To produce this family tree I converted the original family tree that my parents created, using a Perl script that converts GEDCOM to RDF. I then manually cleaned up the RDF to get the URIs in the form that I wanted.

This resulted in an RDF file giving information about parent/child, sibling and spouse relations for my family members. The vocabularies (or ontologies) used for this were FOAF, BIO and RELATIONSHIP.

I was interested in displaying more than just parent/child, sibling and spouse relationships, and decided a simple extension would be to add grandparent/grandchild and ancestor/descendant information. To compute this information I used the Protege 4 OWL 2 editor. To compute grandparent information I used a feature of OWL 2 called “property chains”. The property chain for computing grandchild relationships from child ones was straightforward:

childOf o childOf -> grandChildOf

(or for those who prefer rules: childOf(?x,?y) , childOf(?y,?z) -> grandChildOf(?x,?z) )

This simply states that “the child of the child of someone is a grandchild of that someone”.

The ancestor information was even more straightforward to compute. Here we just make the property parentOf a subproperty of ancestorOf and then make ancestorOf a transitive property.

Given the two axioms above we can then let the OWL reasoner in Protege 4 do all the hard work and compute the implicit relationships based on the explicitly stated ones. Anyone interested in using OWL to compute more family relations should read this paper by Robert Stevens.
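As a toy illustration, the two inference patterns above can be computed directly in plain Python (this is not the Protege/OWL reasoner, just the same logic spelled out; the names childOf, grandChildOf, parentOf and ancestorOf mirror the properties used above, and the example individuals are made up):

```python
# Toy illustration of the two OWL 2 inference patterns described above.
# Not a real reasoner - just the same logic in plain Python.

def grandchildren(child_of):
    """Property chain: childOf o childOf -> grandChildOf."""
    grandchild_of = set()
    for (x, y) in child_of:
        for (y2, z) in child_of:
            if y == y2:
                grandchild_of.add((x, z))
    return grandchild_of

def ancestors(parent_of):
    """parentOf is a subproperty of the transitive ancestorOf,
    so take parentOf and compute its transitive closure."""
    ancestor_of = set(parent_of)          # subproperty step
    changed = True
    while changed:                        # transitivity step
        changed = False
        for (x, y) in set(ancestor_of):
            for (y2, z) in set(ancestor_of):
                if y == y2 and (x, z) not in ancestor_of:
                    ancestor_of.add((x, z))
                    changed = True
    return ancestor_of

# childOf(John, Dad) and childOf(Dad, Grandad) imply grandChildOf(John, Grandad)
child_of = {("John", "Dad"), ("Dad", "Grandad")}
print(grandchildren(child_of))  # {('John', 'Grandad')}

parent_of = {("Grandad", "Dad"), ("Dad", "John")}
print(ancestors(parent_of))     # closure also contains ('Grandad', 'John')
```

The real reasoner does exactly this kind of forward chaining, only over the full rule set and the whole RDF graph.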

So I now have some RDF containing parent/child, ancestor/descendant, sibling and spouse relationships. Also in this data are notions of family groups and information about birth [1] and death events. These events contain information about the dates and places (given as text) of birth/death. Having this information as literals is not very interesting, as it means I then have to go and use Google (or similar) to find additional information about the dates/places. To get round this (and create some links in my linked data) I decided to connect the places of birth/death to the corresponding resource in DBpedia (an RDF version of Wikipedia) and do similarly for the dates [2]. An example of this can be seen here: http://www.johngoodwin.me.uk/family/event1917. This means I can now find additional information about a person's place of death/birth by following the links in the data, should I choose to do so. To link birth/death events to dates/places I used the event ontology.
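As a rough sketch of the place-linking step: DBpedia resource URIs follow Wikipedia article titles (spaces become underscores), so a first-guess URI can be generated mechanically; the `dbpedia_uri` helper below is hypothetical, and ambiguous place names still need checking by hand, which is why the process stayed partly manual:

```python
# Naive sketch of turning a place-name literal into a candidate DBpedia URI.
# DBpedia resource URIs mirror Wikipedia article titles (spaces become
# underscores). Ambiguous names still need manual disambiguation, which is
# why this step was only partly scriptable.
from urllib.parse import quote

def dbpedia_uri(place_name):
    title = place_name.strip().replace(" ", "_")
    return "http://dbpedia.org/resource/" + quote(title)

print(dbpedia_uri("Southampton"))  # http://dbpedia.org/resource/Southampton
```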

In order to host the data as linked data I used the Talis Platform and the Paget (2) PHP library.

There is a SPARQL endpoint for the data here. We can use this to query for my uncles as follows:

PREFIX rel: <http://purl.org/vocab/relationship/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX family: <http://www.johngoodwin.me.uk/family/>

SELECT ?uncle ?name
WHERE
{
  family:I0243 rel:childOf ?parent .   # finds my parents
  ?parent rel:siblingOf ?uncle .       # finds my parents' siblings
  ?uncle foaf:gender "male" .          # finds the male siblings
  ?uncle foaf:name ?name .             # returns their names
}
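A query like this can be sent to the endpoint over plain HTTP. Here is a minimal sketch using only the Python standard library; the endpoint URL is a placeholder, and the `output=json` parameter is an assumption (result-format parameters vary between SPARQL stores):

```python
# Minimal sketch of sending a SPARQL query over HTTP with the standard
# library only. The endpoint URL is a placeholder, and the "output"
# parameter name is an assumption - check your store's documentation.
from urllib.parse import urlencode
from urllib.request import Request

def build_sparql_request(endpoint, query):
    # SPARQL endpoints conventionally take the query in a "query" parameter
    params = urlencode({"query": query, "output": "json"})
    return Request(endpoint + "?" + params,
                   headers={"Accept": "application/sparql-results+json"})

req = build_sparql_request(
    "http://example.org/store/services/sparql",   # placeholder endpoint
    "SELECT ?uncle WHERE { ?uncle a ?type }")
print(req.full_url.split("?")[0])  # http://example.org/store/services/sparql
```

Passing the request to `urllib.request.urlopen` would then return the JSON result bindings.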

My next plan is to build some mash-ups using this data. Such a mash-up could use resources on the web of linked data to find famous people born in the same place/year as various family members, identify BBC programmes that are about said places etc. etc.

Now all I need to do is find a long lost relative who is also into genealogy and linked data so I can connect some nodes…what are the chances???

[1] – for obvious privacy reasons no birth information is given for people still living.

[2] – this was a fairly tedious, mostly manual process – but some scripting helped.


Mash-ups are so last year…

June 14, 2009

Mash-ups are cool – ever since Ordnance Survey, Google, Yahoo! and Microsoft launched their various mapping APIs we’ve seen quite a few of them. This weekend I’ve been experimenting with creating a map mesh-up. I’m not sure if there is any strict definition of a mesh-up, but Kingsley Idehen gave a pretty good account of mesh-up versus mash-up in this blog entry. I’ll leave it up to you, the reader, to decide if what I have done is truly a mesh-up, but I like to think I did the best I could given the current semantic web infrastructure.

Given my day job I thought it would be cool to do some kind of map mesh-up around regions in the UK (however, being a typical researcher, I’ve only done four locations so far just to prove the concept). The new version of Ordnance Survey’s mapping API (OS OpenSpace) provides easy API calls to let you display the boundaries of administrative regions in Great Britain (except for civil parishes and communities). This made OS OpenSpace a no-brainer for this mesh-up (and of course the superior cartography is an added bonus :)). In order to process the RDF I used the ARC PHP library.

I’ll now explain how I did each of the various mesh-ups, starting with the most straightforward one – the basic map with region information (e.g. Southampton). This basic map mesh-up was made using the Ordnance Survey RDF for administrative units in Great Britain. This is hosted as linked data on the rkbexplorer site and has a SPARQL endpoint. This RDF data contains topological relations and name information for the administrative regions in Great Britain. For example, take a look at Southampton. For a given region the ARC library was used to issue a SPARQL query to find the bordering regions, contained regions and containing regions, along with the area of the region. The result of these queries was then displayed in the map information pop-out. So to find the bordering regions for Southampton the query is very straightforward:

SELECT ?border
WHERE
{
  <http://os.rkbexplorer.com/id/osr7000000000037256> admingeo:borders ?border .
}

The family tree mesh-up was done in a similar way. I documented in a previous blog entry how I had started converting my family tree into RDF. In fact, since my last blog entry I now have that data available as linked data (this was done using Paget, for example: http://www.johngoodwin.me.uk/family/I0002). The data was stored on the Talis Platform and again ARC was used to do a SPARQL query. You may notice for the Birmingham family tree map I list members of my family who were born in Birmingham and died in Birmingham. I also list relatives who were born in areas bordering Birmingham. I was able to do this because my family tree data was connected to the Ordnance Survey boundaries RDF. So from the OS data I could find all areas bordering Birmingham, and then return all family members born in these areas from my family tree data. Because the data was linked over the web it was easy to do this in a very simple SPARQL query:

SELECT ?s ?name
WHERE
{
  ?place admingeo:borders <http://os.rkbexplorer.com/id/osr7000000000000018> .
  ?s dbpedia-owl:birthplace ?place .
  ?s foaf:name ?name .
}

The BBC mesh-ups are arguably more interesting. The BBC recently announced a SPARQL endpoint for its RDF data. Examples of the queries you can do are given here. The observant amongst you will notice that the BBC data does provide location information, but the URIs for the locations are currently taken from DBpedia and not from the Ordnance Survey data. To get round this I used a new service called sameas.org, which helps you find co-references between different data sets. You can use it to look up other sources that represent your chosen URI. For example, http://os.rkbexplorer.com/id/osr7000000000037256 has the equivalent URIs given here.

However, I didn’t want to hard-code the equivalent URIs in my code. I’ll explain what I did using the Southampton example. First I issued a call to sameas.org to look up co-references for the Ordnance Survey Southampton URI. The URIs came back as an RDF file, which I parsed with the ARC library to pick out the equivalent resources from DBpedia. I then issued a SPARQL query using the DBpedia URIs to return the artist/programme information from the BBC SPARQL endpoint. So in a nutshell:

  1. take the Ordnance Survey URI
  2. issue a look-up for that URI to sameas.org
  3. get back the equivalent URIs in an RDF file
  4. parse the RDF file using ARC for DBpedia URIs
  5. issue a query to the BBC endpoint using the DBpedia URIs.
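The filtering in steps 3–4 boils down to keeping only the DBpedia members of the equivalence set that sameas.org returns. A sketch in Python (the example URI list is illustrative, standing in for a parsed sameas.org response):

```python
# Sketch of steps 3-4 above: given the equivalent URIs that sameas.org
# returns for an Ordnance Survey URI, keep only the DBpedia ones to use
# in the BBC SPARQL query. The example list is illustrative - in the real
# mesh-up it came from parsing the sameas.org RDF response with ARC.
def dbpedia_equivalents(uris):
    return [u for u in uris if u.startswith("http://dbpedia.org/resource/")]

sameas_results = [
    "http://os.rkbexplorer.com/id/osr7000000000037256",
    "http://dbpedia.org/resource/Southampton",
    "http://sws.geonames.org/2637487/",
]
print(dbpedia_equivalents(sameas_results))
# ['http://dbpedia.org/resource/Southampton']
```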

The Revyu mesh-up was done in a similar way.

I hope this all made sense. Comments and questions welcome – though please no comments on my HTML/web design being very 1995. It’s all about the RDF for me  :)

The mesh-up is here http://www.johngoodwin.me.uk/boundaries/meshup.html
