Posts Tagged ‘Genealogy’

My Family Tree Linked Data Resurrected with some RAGLD Help

November 1, 2012

I’ve been looking for a new home for my family tree linked data since the Talis Store holding the data was turned off. A new (temporary) home has been found thanks to the RAGLD project. RAGLD is all about creating simple tools and services to allow people to publish and consume linked data. One of these simple RAGLD services is a linked data publishing platform. I have moved my family tree linked data onto a RAGLD service for the time being. You can find the service here. The service still needs some polish, but it works well as a very simple platform for publishing linked data. For more information on how that works, be sure to check the RAGLD website, or follow RAGLD on Twitter.

Meanwhile here is a SPARQL endpoint and here are some example URIs: my grandfather and I.

Some quick linked data hacks

June 16, 2010

In previous posts I discussed the work I’d been doing on my family tree linked data. I decided it might be interesting to plot places of birth for my ancestors on a map to get a true idea of where they all came from. The result, a faceted browser that lets me filter based on family name or birth place, can be seen here. This mashup was very easy to achieve using linked data and a tool called Exhibit. To quote: “Exhibit lets you easily create web pages with advanced text search and filtering functionalities, with interactive maps, timelines, and other visualizations…”.

As I explained in a previous post, the places of birth for family members were recorded in my family tree linked data by linking to place resources in DBpedia. In order to perform the mashup I needed lat/long values for each place of birth. One option might have been to do some kind of geo-coding on the place names using an API. However, I didn’t relish the world of pain I’d get from retrieving data in some arbitrary XML format, or the issues with ambiguities in place names. The easiest way to get that information was to enrich my family tree data by consuming the linked data I’d connected to. This is how I did it…

First I ran a simple SPARQL query to find all the places referenced:

PREFIX event: <http://purl.org/NET/c4dm/event.owl#>

select distinct ?place
where { ?a event:place ?place . }

(match on all triples of the form ?a event:place ?place, and then return all distinct values of ?place).

The results are DBpedia place URIs. I then used curl (a command line tool for transferring data with URL syntax) to retrieve the RDF/XML behind each of the URIs:

curl -H "Accept: application/rdf+xml" <place-URI>

This basically says: give me back RDF/XML for the resource at that URI. It was then easy to insert this RDF/XML into my triplestore (RDF database). I could do this because my family tree data was in linked data format (RDF) and linked to existing resources also in RDF – so there was no problem with integrating data in different schemas/formats.

Now all I had to do was retrieve the information I needed to do the mashup. This was done using a SPARQL query:

select ?a ?name ?familyname ?birthdate ?birthplacename ?latlong
where {
  # ... triple patterns binding each person to their name, family name,
  # birth date, birth place name and the place's lat/long ...
  FILTER langMatches( lang(?birthplacename), "en" )
}
ORDER BY ?birthdate

Given that Exhibit works really well with JSON I opted to have the query results returned in that format (SPARQL results are typically serialized as XML or JSON). It was then a simple matter of massaging the resultant JSON into a form that Exhibit can process.
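That reshaping step looks roughly like this (a sketch rather than my actual script – the sample values, and any field names beyond Exhibit’s required label, are illustrative):

```python
import json

def sparql_json_to_exhibit(sparql_json):
    """Reshape standard SPARQL JSON results into Exhibit's {"items": [...]} form."""
    items = []
    for binding in sparql_json["results"]["bindings"]:
        # each binding maps a variable name to a cell like {"type": ..., "value": ...}
        item = {var: cell["value"] for var, cell in binding.items()}
        # Exhibit requires a "label" per item; reuse the person's name here
        item["label"] = item.get("name", "unknown")
        items.append(item)
    return {"items": items}

# illustrative SPARQL JSON results, as returned by an endpoint
sample = {
    "head": {"vars": ["name", "birthplacename", "latlong"]},
    "results": {"bindings": [
        {"name": {"type": "literal", "value": "William Parsonage"},
         "birthplacename": {"type": "literal", "value": "Birmingham"},
         "latlong": {"type": "literal", "value": "52.48 -1.89"}},
    ]},
}
print(json.dumps(sparql_json_to_exhibit(sample), indent=2))
```

Exhibit can then load the resulting JSON directly as its data file.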

I did another simple mashup using the BBC linked data here. This followed a similar process, except that the BBC had already enhanced their data by following links to DBpedia. This BBC mashup basically lets you find episodes of radio shows (brands) that play your favourite artists/genres. The BBC data contains links between artists and radio shows. There are ‘sameAs’ links from the BBC artist data to DBpedia. It is DBpedia that then provides the connection between artists and their genre(s).

Hopefully this shows the power of linked data in a simple way. There is a simple pattern to follow…

1) Make data, and make that data available in RDF. People can then link to you, and you can link to other people who have data in RDF. So I made family tree data in RDF, the BBC made music/programme data in RDF.

2) Link to linked data resources on the web (in this case we both linked to DBpedia).

3) Enhance your data by consuming the data behind those links – this is trivial because they are both in the linked data format RDF.

4) Make something cool/useful :)

In fact, building useful services will become even easier once the Linked Data API is in use, as it will bypass the need for SPARQL in many cases. As more and more people provide linked data we will have an easy way to provide services built on top of combined data sources, and the Linked Data API will make it web 2.0 friendly for those (understandably?) put off by SPARQL.

Genealogy and Linked Data (revisited)

November 4, 2009

I now have a new improved version of my family tree up as linked data here. To produce this family tree I converted the original family tree that my parents created, using a Perl script that converts GEDCOM to RDF. I then manually cleaned up the RDF to get the URIs in a form that I wanted.

This resulted in an RDF file giving information about parent/child, sibling and spouse relations for my family members. The vocabularies (or ontologies) used for this were FOAF, BIO and RELATIONSHIP.

I was interested in displaying more than just parent/child, sibling and spouse relationships and decided a simple extension could be to have grandparent/grandchild and ancestor/descendant information. To compute this information I used the Protege 4 OWL 2 editor. To compute grandparent information I used a feature of OWL 2 called “property chains”. The property chain for computing grandchild relationships from child ones was straightforward:

childOf o childOf -> grandChildOf

(or for those who prefer rules: childOf(?x,?y) , childOf(?y,?z) -> grandChildOf(?x,?z) )

This simply states that “the child of the child of someone is a grandchild of that someone”.

The ancestor information was even more straightforward to compute. Here we just make the property parentOf a subproperty of ancestorOf and then make ancestorOf a transitive property.
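For reference, both axioms can be written down in Turtle along these lines (the prefix and exact property URIs here are illustrative, not the ones in my data):

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix :     <http://example.org/family#> .

# childOf o childOf -> grandChildOf
:grandChildOf owl:propertyChainAxiom ( :childOf :childOf ) .

# parentOf is a subproperty of the transitive ancestorOf
:parentOf rdfs:subPropertyOf :ancestorOf .
:ancestorOf a owl:TransitiveProperty .
```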

Given the two axioms above we can then let the OWL reasoner in Protege 4 do all the hard work and compute the implicit relationships based on the explicitly stated ones. Anyone interested in using OWL to compute more family relations should read this paper by Robert Stevens.

So I now have some RDF containing parent/child, ancestor/descendant, sibling and spouse relationships. Also in this data are notions of family groups and information about birth [1] and death events. These events contain information about dates and places (given as text) of birth/death. Having this information as literals is not very interesting, as it means I then have to go and use Google (or similar) to find additional information about the dates/places. To get round this (and create some links in my linked data) I decided to connect the places of birth/death to the corresponding resources in DBpedia (an RDF version of Wikipedia) and do similarly for the dates [2]. An example of this can be seen here. This means I can now find additional information about a person’s place of birth/death by following the links in the data, should I choose to do so. To link birth/death events to dates/places I used the event ontology.
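As a concrete sketch of that linking pattern (the event URI and family prefix here are made up; the BIO, event ontology and DBpedia namespaces are the real ones):

```turtle
@prefix bio:     <http://purl.org/vocab/bio/0.1/> .
@prefix event:   <http://purl.org/NET/c4dm/event.owl#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix :        <http://example.org/family#> .

# a birth event whose place is a DBpedia resource rather than a plain string
:birthOfWilliam a bio:Birth ;
    event:place dbpedia:Birmingham .
```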

In order to host the data as linked data I used the Talis Platform and the Paget PHP library.

There is a SPARQL endpoint for the data here. We can use this to query for my uncles as follows:

PREFIX rel: <http://purl.org/vocab/relationship/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX family: <…>   # the namespace the family tree URIs are published under

select ?uncle ?name
where {
  family:I0243 rel:childOf ?parent .   # finds my parents
  ?parent rel:siblingOf ?uncle .       # finds my parents' siblings
  ?uncle foaf:gender "male" .          # finds the male siblings
  ?uncle foaf:name ?name .             # this returns their names
}

My next plan is to build some mash-ups using this data. Such a mash-up could use resources on the web of linked data to find famous people born in the same place/year as various family members, identify BBC programmes that are about said places etc. etc.

Now all I need to do is find a long lost relative who is also into genealogy and linked data so I can connect some nodes…what are the chances???




[1] – for obvious privacy reasons no birth information is given for people still living.

[2] – this was a fairly tedious manualish process – but some scripting helped.




Genealogy and the Semantic Web 2

April 18, 2009

I’ve been busy converting my parents’ hard work on their family tree into RDF. I blogged about initial attempts here. It’s far from finished, but at around 500,000 triples already it looks like it’s going to be a lot of RDF!

You can view the RDF (as it is) here, but seeing as RDF is for machines, a more human-friendly version can be browsed here. So far I’ve been concentrating on linking places of death and birth to various other datasets including GeoNames, DBpedia, Freebase and Ordnance Survey (though there are still a fair few places left to link).

To be done:

1) Finish connecting all the places.

2) Sort date formats out.

3) Turn into linked data with dereferenceable URIs and content negotiation.

A more detailed write up when it’s all finished…


Genealogy and the Semantic Web

January 21, 2009

My folks have a keen interest in genealogy, and have built up quite an impressive family tree over the past few years. I always thought that genealogy would be a potentially cool application for the semantic web (imagine several independently constructed family trees being connected via their common nodes). It seems I wasn’t the first person to think this, as this recent blog post from Dan Brickley suggests. Dan has written a quick and horrid Perl script (his words) to convert the common GEDCOM (GED) family tree format to RDF/XML. A dump of my family tree in RDF/XML can be found here.

The family trees contain information about births and deaths of people. All of these events are then connected to places. I think one weakness of the current conversion script is that it represents the place information as a string, for example:

<foaf:Person rdf:about="I1884.xml#I1884">
  <foaf:name>William Parsonage</foaf:name>
  <bio:date>30 AUG 1721</bio:date>
  <bio:place>Birmingham</bio:place>
</foaf:Person>

It would be far more interesting if instead the place was actually connected to a URI representing the place on the semantic web. In this case there are a number of URIs for Birmingham on the semantic web, for example from Ordnance Survey or from GeoNames. The RDF/XML could then be modified as follows:

<foaf:Person rdf:about="I1884.xml#I1884">
  <foaf:name>William Parsonage</foaf:name>
  <bio:date>30 AUG 1721</bio:date>
  <bio:place rdf:resource="…"/>
</foaf:Person>

If I get a chance at the weekend I’ll see how much work it will take to add this information to the family tree RDF/XML. It will also be interesting to see if the combination of these two datasets provides extra information and insights to budding genealogists.

