Earlier this month Google announced the release of the open source graph database/triplestore Cayley. This weekend I thought I would have a quick look at it, and try some simple queries using the Ordnance Survey Linked Data.
Cayley is written in Go, so first I had to download and install that. I then downloaded Cayley from here. As an initial experiment I decided to use the Boundary Line Linked Data, and you can grabbed the data as n-triples here. I only wanted a subset of this data – I didn’t need all of the triplestores storing the complex boundary geometries for my initial test so I discarded the files of the form *-geom.nt and the files of the form county.nt, dbu.nt etc. (these are the ones with the boundaries in). Finally I put the remainder of the data into one file so it was ready to load into Cayley.
It is very easy to load data into Cayley – see the getting started section part on the Cayley pages here. I decided I wanted to try the web interface so loading the data (in a file called all.nt) was a simple case of typing:
./cayley http –dbpath=./boundaryline/all.nt
Once you’ve done this point your web browser to http://localhost:64210/ and you should see something like:
One of the things that will first strike people used to using RDF/triplestores is that Cayley does not have a SPARQL interface, and instead uses a query language based on Gremlin. I am new to Gremlin, but seems it has already been used to explore linked data – see blog from Dan Brickley from a few years ago.
The main purpose of this blog post is to give a few simple examples of queries you can perform on the Ordnance Survey data in Cayley. If you have Cayley running then you can find the query language documented here.
At the simplest level the query language seems to be an easy way to traverse the graph by starting at a node/vertex and following incoming or outgoing links. So to find All the regions that touch Southampton it is a simple case of starting at the Southampton node, following a touches outbound link and returning the results:
If you want to return the names and not the IDs:
You can used also filter – so to just see the counties bordering Southampton:
The Ordnance Survey linked data also has spatial predicates ‘contains’, ‘within’ as well as ‘touches’. Analogous queries can be done with those. E.g. find me everything Southampton contains: