John's Weblog

Using Machine Learning to write the Queen’s Christmas Message

December 21, 2017 John G

In their excellent book “The Indisputable Existence of Santa Claus”, Hannah Fry (aka @FryRsquared) and Thomas Oléron Evans (aka @Mathistopheles) talked about using Markov chains to generate the Queen’s Christmas message. You can read a bit about that here. After reading that chapter I asked Hannah and Thomas if they had considered repeating this using recurrent neural networks. A couple of years ago Andrej Karpathy wrote a blog post that he summarised as follows:

We’ll train RNNs to generate text character by character and ponder the question “how is that even possible?”

In his blog he posed the question:

It looks like we can learn to spell English words. But how about if there is more structure and style in the data?

and went on to train an RNN to write text in the style of Shakespeare. I recommend you read his blog to see the examples and get an idea of what is going on.

You might also want to watch this video from the excellent Computerphile on a similar subject.

In a previous blog I decided to have a go at using RNNs to generate lyrics for my favourite band, New Model Army. Thomas was kind enough to send me the text of the Queen’s previous speeches that he and Hannah had used in their work, and I thought I’d see how good RNNs were at generating a Royal Christmas message. I didn’t do anything particularly fancy here…just followed the code and examples in Karpathy’s original blog on a fairly standard home desktop machine.

Here are some of the results… (I take no responsibility for any text generated by the neural network :)).

One parameter that can be adjusted when generating the output is ‘temperature’. This takes a value between 0 and 1. Lower temperatures cause the model to make more likely, but also more boring and conservative, predictions. Higher temperatures cause the model to take more chances and increase the diversity of results, but at the cost of more mistakes. Running with a temperature of 0.5 we get:

i want to send a special message of the human spirit are much to be for everyone this is a happy and part of the christmas story so in the same streed. we are all truly new zealand in an example of the commonwealth has an equal place in the lives of men and women and what they have to show that they are all life in their own peoples of the commonwealth which we have come to the sacrifice of all the fortunes of hope and there are many heart of our grandchildren the presents the prince of wales and will have a preacher group of the commonwealth. on a strong and spreading the responsibility to the commonwealth who are much on the courage of the modern world and the commonwealth games and there are many others at home and the rest of the world have been inspired by the courage and enjoyment and the present peace of a commonwealth which is how to promote the games and from all walks of life and to become a growing peace on earth which we should me to see the strength to try to do so in the world today. it was a great conflict remain inspired by the courage and prince philip and i were able to see buckingham palace which the most special year for the commonwealth is to be movement and in a different furmestand and they are all source of spirit all over the world have shared a moment in the year and the christmas message is to be consed rather a quicter and the contribution of the family of nations and i have meen and wonderful in both these ordarisasions or a common sense and encouragement. this year i am speaking to you from the same family make the world in the world. the christmas message is so many of them and i have difference to the best of ourselves. and this year there are something to look at the world they are may be a life of all the world is the commonwealth is a sense of belonging to the world i was well as they are all sort of moment the more depression

I’m not sure this makes a lot of sense, and there are a few oddities/errors in the text.

I tried lowering the temperature to 0.3:

i hope that the christmas message is to be there and it is not only a very happy christmas to you all. i am speaking to you from the commonwealth and i have been able to see that the commonwealth and i have been deeply touched this year i am speaking to you from the same family friends and communities as the future of the commonwealth and as i have always given to many of you will have heard of the more conscious of the commonwealth and around us we should be able to see that the commonwealth and i have been deeply touched the strength to try to do the same family gathering in the commonwealth and i have all been about the future as well as a great comfort to us all the problems of the commonwealth and i have seen in their own way the problems of the commonwealth and i have been deeply touched the world the progress of the commonwealth and around us we are common throughout the commonwealth who are struck by the state of the commonwealth and i have depended to see in the world today that we should remember those who have seen in their own way a celebration of the child who was born at christmas time for families and friends will never be very different problems but it is not only a time for reflection and confidence to the commonwealth and i have all been about the world that we should not be our lives and to remind us of the future. i am speaking to you from the commonwealth and i have been deeply touched the world that we can all try to make a splendid birthday and the commonwealth and i have been floods and sadness and the best of the world have been able to discuss the best of ourselves. i believe that this christmas day i want to send a special message of hope in the face of hardship is nothing new that of the commonwealth who have seen in their lives in the commonwealth and in the commonwealth and i have been able to discuss the best of ourselves. we are all live together as a great daily and its war the commonwealth

As suggested, the result here is a bit more predictable/boring.
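To make the effect of temperature a bit more concrete, here is a minimal sketch of how a temperature can reshape the next-character probabilities before sampling. This is only an illustration and not the code used to generate the text above: the function names and the three-character distribution are made up, and the probabilities are assumed to be strictly positive.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a probability distribution by a temperature and renormalise."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Dividing the log-probabilities by a temperature below 1 sharpens the
    # distribution: likely characters become even more likely.
    scaled = [math.log(p) / temperature for p in probs]
    top = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_char(chars, probs, temperature, rng=random):
    """Draw one character from the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(chars, weights=adjusted, k=1)[0]


# Hypothetical next-character probabilities, for illustration only.
chars = ["e", "a", "z"]
probs = [0.6, 0.3, 0.1]

for t in (1.0, 0.5, 0.3):
    print(t, [round(p, 3) for p in apply_temperature(probs, t)])
```

At a temperature of 1 the distribution is unchanged; as the temperature drops, more of the probability mass moves to the most likely character, which fits the more repetitive output at 0.3.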

It is also possible to prime the model with some starting text. This starts the RNN off with some hardcoded characters to warm it up with some context before it starts generating text.

Using a temperature of 0.5 and prime text of ‘Each Christmas’ we get:

each christmas is a time for reflection and grandchildren the more depends of a dedication of the commonwealth and around us of the future of the commonwealth games in the rich and proud of life in a great religions and its members. but i am also speaking by instince in the right through the members of my family and i have been great and personal responsibility to our lives and they say to me that there are many happy or so many countries of the future with the prince of peace and happiness and self-respect for the commonwealth and is a happy and prosperous future of the world we will be achieved with the opportunity to help them are a contribution to the powerful ways of spirit and learning to see the problems of the commonwealth and i have seen in their inestivation the peoples of the commonwealth and i have been for the better but that is possible to the people and the christians it’s all those who are great practical form we have all been for them to be a chance to get the gift of a carren of different lives the service of their people throughout the world. we are all live in the rolin to our children and grandchildren the present generation to see this country and arrived the same for one another which is most popularity and the rest of the world and all the great commonwealth. pronomed news the responsibilities for the duty of events which we have all been the holidic of science. it is they all who are so there and shared heritage and sometimes in saint paul’s and much considerate and to communication the human spirit. and they can be a contralle commonwealth and a firm landmark in the course of the many servicemen and women who are broken for many that can be a precious given us to be a witness this continuing that we can all be the contribution of the commonwealth and as we all have all the features of the love of hope and goodwill. this year has constantly will be overcome. 
i believe that this year i am speaking to you from the hope of a determination and continues to all

Using a temperature of 0.5 and prime text of ‘This past year’ we get:

this past year has been one to see that what i have been for the commonwealth and around us we are thinking of the problems show in the face of the fact that he wanted the strongly that we should show many of them service is the birth of the commonwealth. it is a contribution of a lives of family and friends for parents and religious difficulties at the commonwealth and as we all share the future as the future of the future. it was a moment and that the christmas message of the things that have been god shown in the earliest care and for the commonwealth can give the united kingdom and across the commonwealth. it is a helping one of the march of people who are so easy to go but the rest of the world have shaped for their determination and courage of what is right the life of life is word if we can do the state of the same time last month i was welcomed as you and that the opportunity to honour the most important of the thread which have provided a strong and family. even the commonwealth is a common bond that the old games and in the courage which the generations of the commonwealth and i have those to succeed without problems which have their course all that the world has been complete strangers which could have our response. i believe that the triditings of reconciliation but there is nothing in and happy or acts of the commonwealth and around us by science the right of all the members of the world have been difficult and the benefits of dreads and happiness and service to the commonwealth of the future. i wish you all a very happy christmas to you all.

So there you have it. Not sure we’ll be replacing the Queen with an AI anytime soon. Merry Christmas!

Categories: Uncategorized Tags: christmas, machine learning, queen, Recurrent Neural Networks, rnn

Using Recurrent Neural Networks to Hallucinate New Model Army Lyrics

September 24, 2016 John G 1 comment

I decided to follow the example of Andrej Karpathy to see if I could use recurrent neural networks to hallucinate New Model Army lyrics. I decided to train a 3-layer RNN with 512 hidden nodes on each layer. Full technical details are here. If you want to compare the output with the original lyrics you can do so by reading them here. Here is the output, not bad considering the relatively small size of the input (if you’re not a fan you’ll have to take my word for it):

the thick black days below where the desert with their backs the mountains in the blood red sun the sky i can hear us in the darkness they got their victorious and the black sky and the secrets come back to the sky the sky i see you and i can hear us all breathing i can hear us all breathing i can hear us all breathing i wanted to see him face and the lovers and the backstreets and the country breaks the shadows of the foghorn on the last breath there is still sprays on the ridge and the hands are on the breaking skies the boys are heading good for the father and the lights go out the sea and i was so let them on the earth and the days of the end we all feel the lights are blazing all the screen when i was watching the fires and the birds of brave still we are the falling sky i was a believe in the darkness trying to the statue calling of the father and a cars the power in the sky i have seen the streets head into the shadows of the burning sand and the way the falling of the storm in the wind of the streets of the winding wheel to the back the pressure moves the will we started away into the wall with the city where we are gonna fight with a blood of passion and you will be found of your face is still across the sky the trees i am drinking i am asking what is we wanted before the stars there is a bown star of the winding shots and the heart and the sun stars and the stands protects the lights and the days are lands and i am happy to be forever but they have got to be innocents the world is bow to the wind blows where the sky we are still not worried now i will hold on to the world and i am sorry and we are gonna fight we will be hope that is comes and we were walking out in the road the rain and the hands are far the same thing one one by one of the cold there is a fire so much to the sky i have seen them so much to tell us what is is the way that is i see a little corner of the sky and the steel walls they think that is you can hear us all breathing i am looking 
for you i think i was make us surprised i am not worried now i am not worried now i want no second who wants to see the corner of the far away from everything i have ever wanted when i am screaming i am screaming i am looking i am never going back there i am never going back to the seething sun the way that is change the road i do not know what is the coming skies and drums they live by the back i started at the bar and the grey i will be some place that is i can hear us all breathing i can hear us all breathing i can hear us all breathing i can hear us all breathing i can hear us all breathing i swear that is i have never been and the far horizon it could be stormclouds and it was we are still here the hours in the flag and the hands are frozen the stars below the doors we were out of sun on the corner and the things that is i do not really want it i can hear us all breathing i can hear us the passion is not where i am warming i am sick of the conversations we are gonna die the way they should not believe it still you have been so well you want to be here the sound of the war we have done we are drifting and here in the world in the sun i was not a more than they can look up and i will be some own while the clocks and the great car is all come to see a man fould on the world we were born in the blood dream i can hear us all breathing i am swilling in the end we stand on the top of the water the shadows of the sun and the cheap coats of the cooling towers and the wind blows through the side of the call train we are gonna fight we watch the world while they wait for the crowds and the chemical walls that is we are not war and our hearts they fall the long gods take their children and the lost the steps from the path of the shadows of the sun on the road and all the shadows come crashing in and staring and freedom and the grey and the grass when we go out there is a passion past the blood sun and the falling of the gill of the fire and the back in the wind blows 
through the backstreets of the desert that is we are still behind us and i am part of us the things that is you were not of the discotheque and the faces call in the sky in the blazing sun and the sun from the sky i have still the strung place that is you want to look down to the sea on the chest of the millions of the cold cold the dance glow their dead and the sun we are gonna fight we were found on the dead we are we are not a great laugh and laughing down the sky i swear that is we are gonna fight we are still the pains that is i have seen to think that is we have done with their faces are the same the passion in the sea and the right of the miles of the faces that is we have to be in the bar and the fead and the waiting lights the storm when we were strange to see a man fall or to see him fly the city the spell blew the cold charge there is words are like with the world and the rooms are still spark in the empty streets there are gone before we will be hard and gone when we said that is we can hear us all breathing i am looking for you i am alone and the more and the earth start and the end it takes all of them will come the same our little shoulder sun and still the black and the lights go out the way they are gone feel the scars and the money i am sorry and the fire and the world is never good faces they are gone before and the noise with the hollows and the rain and the hard of the cardy blows when the hands are the same in something i am one of the blood where the sky i have something to come i am finding i am die and cheat of the days of the sun in the courtry of the water close my hunger the killing day and the grey and the streets are out and the breathing of the end the world is gone but the race is can go out here and i can feel the stars put out of the sky i am drifting out in our little time and the streets are lighting the desert their many counting down the streets are in our reason conversations to the faces come back there is a car and there is 
a purity that is we can feel my hands and i will meet you in just to spend the bridge and the black and the sea and the rain and the back and still we are born in the power in the frozen claim and i will be standing there i will come the stars there is a siren passion street with more than i can hear myself from the great braves shot and the spoilt generation and i am back and while you will still the same thing that is i see them in a part of us but you are still here for the shadows of the water the wind blows through the trees and the shadows of the water the heart of the flags are the devil in the country the streets and the television promises staring out of the crowd it was just the world while they are so lost i am sorry there i have been so much to talk to the beautiful car and the truth is a frightened days to the push of the sky in the space of the ash is a curse the boys are dancing the stags are black dark and the open driving sun in the sky i saw the way that is i do not really want to have to call the desert in the country from the lights beneath the clouds of the construction of the foot scare and the fire and the angry gods we would be here and we will see them in the party is so much to the way i was driving fast and the world and the bad our breathing sky the statues beneath the clouds of silence and the corner of the glamour breaking the same as the scands the blood and the grey south and the scream of the discotheque of the flashing days that is i cannot see what is i have seen more tonight i am sick of the power i am not and nothing was worthy and i know i will be a sen for a city the world has the same and i will tell you what is i have been and the silence she says and i will see them we were born in the chemical trucks close the darkness and the great roads and the city the sun and the streets are billion stars and the seasons change and there is a children who think they should not do that is what is i am not warming the crackles of pain 
and we blood and still here all the drops at the world in the sky in the battle of the morning the shadows come the sea and the falling of the water wheels the truth is river on the world is die in the shadow of the streets of her hands are lands and the promises stars beneath the black the crackle and the mountains and the lights below the streets are laid the same and the hard storm are all the seasons the way that is you love me in the world i want the bottle of the passion in the empty power is the power and the sea and the hands are falling stars behind the road and the seasons change and there is a burnt of my heart behind the truth is the same on the streets are living and the back the fire but i can feel it all of us to go confasy the camera while they will be home but you will embrace it and i am still that is you are not a power i was not the promises i saw the way that is i am never going back there i will be mountains and there is nothing to do you can watch your eyes and i will never be stormclouds and the time but there is no warning all the things that is i can hear myself breathing i see the smoke and the lights are sard and the changes and the rain and the seasons change only the seasons change only the wind blows through the silence stars and the way the boys go when it is all vanity i feel like the time is standing here for the ones that is we are better than them with the sky in the moon and the sea and the man from the days are gone before the new world it is not a west for the consuming flags and the gravesing streets and the call and the shoulders and the long land closer to the sky i am lost the fashion and the back and the sea as the prayer and the stains they are still the bridges are cracking on the sea and the bridges are not a black the world we will meet the changes we are still still here and we are gonna die i want

Categories: Uncategorized Tags: new model army, Recurrent Neural Networks, rnn

On Beyond OWL: challenges for ontologies on the Web by James Hendler

October 11, 2015 John G

Categories: linked data, Semantic Web Tags: linked data, Ontologies, Ontology, OWL, Semantic Web

Benford’s Law and the Administrative Geography of Great Britain

July 13, 2014 John G

Just listened to the latest episode of the Infinite Monkey Cage, and was reminded of Benford’s Law. This states:

Benford’s Law, also called the First-Digit Law, refers to the frequency distribution of digits in many (but not all) real-life sources of data. In this distribution, the number 1 occurs as the leading digit about 30% of the time, while larger numbers occur in that position less frequently: 9 as the first digit less than 5% of the time. Benford’s Law also concerns the expected distribution for digits beyond the first, which approach a uniform distribution.

I was curious whether that might emerge in geography (or Ordnance Survey data) somehow. It turns out that if we look at the areas (in square metres) of the polygons in the Boundary Line product (i.e. the areas of all the counties, wards, constituencies, districts, parishes etc. in GB) then we get a pretty good fit. In the table below the first column is the leading digit of the polygon area, the second is the percentage of areas starting with that leading digit and the third column is the value Benford’s Law predicts:

Digit   Observed (%)   Benford (%)
1       30.6           30.1
2       15.9           17.6
3       11.3           12.5
4       9.8            9.7
5       8.0            7.9
6       7.3            6.7
7       6.3            5.8
8       5.6            5.1
9       4.9            4.6

Not bad…
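For reference, the predicted column comes from the standard Benford formula, P(d) = log10(1 + 1/d). The sketch below recomputes it and prints it next to the observed percentages from the table. The leading_digit helper is a hypothetical illustration of how the first digit of an area could be extracted; it is not the code I used on the Boundary Line data.

```python
import math


def benford_expected(digit):
    """Percentage of values Benford's Law predicts will start with `digit`."""
    return 100 * math.log10(1 + 1 / digit)


def leading_digit(value):
    """Return the first non-zero digit of a positive number."""
    if value <= 0:
        raise ValueError("value must be positive")
    # Shift the decimal point until the value lies in [1, 10).
    while value >= 10:
        value /= 10
    while value < 1:
        value *= 10
    return int(value)


# Observed percentages copied from the table above.
observed = {1: 30.6, 2: 15.9, 3: 11.3, 4: 9.8, 5: 8.0,
            6: 7.3, 7: 6.3, 8: 5.6, 9: 4.9}

for d in range(1, 10):
    print(f"{d}: observed {observed[d]:4.1f}  expected {benford_expected(d):4.1f}")
```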

Categories: Uncategorized Tags: benford's law, maths, Ordnance Survey, statistics

Quick Play with Cayley Graph DB and Ordnance Survey Linked Data

June 29, 2014 John G 2 comments

Earlier this month Google announced the release of the open source graph database/triplestore Cayley. This weekend I thought I would have a quick look at it, and try some simple queries using the Ordnance Survey Linked Data.

Cayley is written in Go, so first I had to download and install that. I then downloaded Cayley from here. As an initial experiment I decided to use the Boundary Line Linked Data, and you can grab the data as N-Triples here. I only wanted a subset of this data: I didn’t need all of the triples storing the complex boundary geometries for my initial test, so I discarded the files of the form *-geom.nt and the files of the form county.nt, dbu.nt etc. (these are the ones with the boundaries in). Finally I put the remainder of the data into one file so it was ready to load into Cayley.
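If you would rather script the filtering and merging step, something along these lines should do it. This is a sketch under assumptions: BOUNDARY_FILES only lists the two file names mentioned above (the real download has more files of this kind), and the function names are mine.

```python
from pathlib import Path

# Assumption: only the two boundary files named in the text are listed here;
# the actual download contains others that would need adding.
BOUNDARY_FILES = {"county.nt", "dbu.nt"}


def wanted(path):
    """Keep N-Triples files that carry no boundary geometry."""
    name = path.name
    return (name.endswith(".nt")
            and not name.endswith("-geom.nt")
            and name not in BOUNDARY_FILES)


def combine(source_dir, target):
    """Concatenate the wanted .nt files in source_dir into one target file."""
    source_dir = Path(source_dir)
    target = Path(target)
    count = 0
    with target.open("w", encoding="utf-8") as out:
        for path in sorted(source_dir.glob("*.nt")):
            # Skip the output file itself if it lives in the same directory.
            if path.resolve() == target.resolve() or not wanted(path):
                continue
            with path.open(encoding="utf-8") as src:
                for line in src:
                    # N-Triples is line based, so keep one triple per line.
                    out.write(line if line.endswith("\n") else line + "\n")
            count += 1
    return count
```

Calling combine("boundaryline", "boundaryline/all.nt") would then produce the single file used in the next step, assuming the downloaded files sit in a directory called boundaryline.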

It is very easy to load data into Cayley (see the getting started section on the Cayley pages here). I decided I wanted to try the web interface, so loading the data (in a file called all.nt) was a simple case of typing:

./cayley http --dbpath=./boundaryline/all.nt

Once you’ve done this point your web browser to http://localhost:64210/ and you should see something like:

One of the things that will first strike people used to using RDF/triplestores is that Cayley does not have a SPARQL interface, and instead uses a query language based on Gremlin. I am new to Gremlin, but it seems it has already been used to explore linked data; see the blog post from Dan Brickley from a few years ago.

The main purpose of this blog post is to give a few simple examples of queries you can perform on the Ordnance Survey data in Cayley. If you have Cayley running then you can find the query language documented here.

At the simplest level the query language seems to be an easy way to traverse the graph by starting at a node/vertex and following incoming or outgoing links. So to find all the regions that touch Southampton it is a simple case of starting at the Southampton node, following a touches outbound link and returning the results:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").All()

Giving:

If you want to return the names and not the IDs:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").Out("http://www.w3.org/2000/01/rdf-schema#label").All()

You can also filter, so to just see the counties bordering Southampton:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").Has("http://www.w3.org/1999/02/22-rdf-syntax-ns#type","http://data.ordnancesurvey.co.uk/ontology/admingeo/County").Out("http://www.w3.org/2000/01/rdf-schema#label").All()

The Ordnance Survey linked data also has spatial predicates ‘contains’, ‘within’ as well as ‘touches’. Analogous queries can be done with those. E.g. find me everything Southampton contains:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/contains").Out("http://www.w3.org/2000/01/rdf-schema#label").All()
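Since the queries above differ only in the spatial predicate, they are easy to generate. The sketch below builds the query text for any of ‘touches’, ‘contains’ or ‘within’; labels_query is a hypothetical helper of my own, and it only produces the string to paste into the Cayley web interface; it does not talk to Cayley itself.

```python
import json

OS_ID = "http://data.ordnancesurvey.co.uk/id/"
SPATIAL = "http://data.ordnancesurvey.co.uk/ontology/spatialrelations/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def labels_query(area_id, relation):
    """Build a query for the labels of everything `relation`-linked to an area."""
    # json.dumps yields a straight double-quoted string literal, which avoids
    # the curly-quote problem when copying queries from a web page.
    start = json.dumps(OS_ID + area_id)
    predicate = json.dumps(SPATIAL + relation)
    label = json.dumps(RDFS_LABEL)
    return f"g.V({start}).Out({predicate}).Out({label}).All()"


for relation in ("touches", "contains", "within"):
    print(labels_query("7000000000037256", relation))
```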

So after this very quick initial experiment it seems that Cayley is very good at providing an easy way of doing very quick/simple queries. One query I wanted to do was find everything in, say, Hampshire – the full transitive closure. This is very easy to do in SPARQL, but in Cayley (at first glance) you’d have to write some extra code (not exactly rocket science, but a bit of a faff compared to SPARQL). I rarely touch Javascript these days so for me personally this will never replace a triplestore with a SPARQL endpoint, but for JS developers this tool will be a great way to get started with and explore linked data/RDF. I might well brush up on my Javascript and provide more complicated examples in a later blog post…
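To give an idea of the kind of extra code I mean, here is a rough sketch of a transitive closure over ‘contains’ links, written in Python rather than in Cayley’s query language. It is a plain breadth-first traversal over an in-memory dictionary; the place names are made-up toy data, not real Ordnance Survey identifiers. In SPARQL 1.1 the same thing is a one-line property path (contains+).

```python
from collections import deque


def contained_in(contains, root):
    """Return everything reachable from `root` by following contains links."""
    seen = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in contains.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical toy data: each key directly contains the listed regions.
contains = {
    "Hampshire": ["Winchester District", "Test Valley District"],
    "Winchester District": ["Ward A", "Ward B"],
    "Test Valley District": ["Ward C"],
}

print(sorted(contained_in(contains, "Hampshire")))
```

In a real version the dictionary lookup would be replaced by one ‘contains’ query per node, which is where the faff comes in.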

Categories: Semantic Web Tags: Cayley, Cayley Graph, Google, graph data, linked data, Ordnance Survey, Semantic Web

Visualising the Location Graph – example with Gephi and Ordnance Survey linked data

March 28, 2014 John G 2 comments

This is arguably a simpler follow up to my previous blog post, and here I want to look at visualising Ordnance Survey linked data in Gephi. Now Gephi isn’t really a GIS, but it can be used to visualise the adjacency graph where regions are represented as nodes in a graph, and links represent adjacency relationships.

The approach here will be very similar to the approach in my previous blog. The main difference is that you will need to use the Ordnance Survey SPARQL endpoint and not the DBpedia one. So this time in the Gephi semantic web importer enter the following endpoint URL:

http://data.ordnancesurvey.co.uk/datasets/os-linked-data/apis/sparql

The Ordnance Survey endpoint returns Turtle by default, and Gephi does not seem to like this. I wanted to force the output as XML. I figured this could be done using a ‘REST parameter name’ (output) with value equal to xml. This did not seem to work, so instead I had to do a bit of a hack. In the ‘query tag…’ box you will need to change the value from ‘query’ to ‘output=xml&query’. You should see something like this in the Semantic Web Importer now:

Now click on the query tab. If we want to, for example, view the adjacency graph for constituencies we can enter the following query:

prefix gephi:<http://gephi.org/>
construct {
?s gephi:label ?label .
?s gephi:lat ?lat .
?s gephi:long ?long .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .}
where
{
?s a <http://data.ordnancesurvey.co.uk/ontology/admingeo/WestminsterConstituency> .
?o a <http://data.ordnancesurvey.co.uk/ontology/admingeo/WestminsterConstituency> .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .
?s <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
}

and click ‘run’. To visualise the output you will need to follow the exact same steps mentioned here (remember to recast the lat and long variables to decimal).
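As an aside, the ‘query tag’ hack from earlier makes sense if you assume the importer simply glues the endpoint, the query tag and the encoded query together into one URL. I have not checked the plugin source, so treat that as an assumption; the sketch below just shows what the resulting request URL would look like under it.

```python
from urllib.parse import quote

ENDPOINT = "http://data.ordnancesurvey.co.uk/datasets/os-linked-data/apis/sparql"


def request_url(endpoint, query_tag, query):
    """Build a GET URL the way the importer is assumed to build it."""
    # Assumption: the request is "<endpoint>?<query tag>=<encoded query>".
    return f"{endpoint}?{query_tag}={quote(query, safe='')}"


# With the default tag the endpoint is left to pick its default format...
print(request_url(ENDPOINT, "query", "ask {}"))
# ...while the modified tag smuggles in an extra output=xml parameter.
print(request_url(ENDPOINT, "output=xml&query", "ask {}"))
```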

If we want to view adjacency of London Boroughs then we can do this with a similar query:

prefix gephi:<http://gephi.org/>
construct {
?s gephi:label ?label .
?s gephi:lat ?lat .
?s gephi:long ?long .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .}
where
{
?s a <http://data.ordnancesurvey.co.uk/ontology/admingeo/LondonBorough> .
?o a <http://data.ordnancesurvey.co.uk/ontology/admingeo/LondonBorough> .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .
?s <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
}

When visualising you might want to change the scale parameter to 10000.0. You should see something like this:

So far so good. Now imagine we want to bring in some other data (recall my previous blog post here). We can use SPARQL federation to bring in data from other endpoints. Suppose we would like to make the size of the node represent the ‘IMD rank’ of each London Borough…we can do this by bringing in data from the Open Data Communities site:

prefix gephi:<http://gephi.org/>
construct {
?s gephi:label ?label .
?s gephi:lat ?lat .
?s gephi:long ?long .
?s gephi:imd-rank ?imdrank .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .}
where
{
?s a <http://data.ordnancesurvey.co.uk/ontology/admingeo/LondonBorough> .
?o a <http://data.ordnancesurvey.co.uk/ontology/admingeo/LondonBorough> .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .
?s <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
SERVICE <http://opendatacommunities.org/sparql> {
?x <http://purl.org/linked-data/sdmx/2009/dimension#refArea> ?s .
?x <http://opendatacommunities.org/def/IMD#IMD-score> ?imdrank . }
}

You will need to recast the imdrank as an integer for what follows (do this using the same approach used to recast the lat/long variables). You can now use Gephi to resize the nodes according to IMD rank. We do this using the ranking tab:

You should now see your London Boroughs re-sized according to their IMD rank:

Turning the lights off and adding some labels we get:

Categories: linked data, Semantic Web Tags: gephi, linked data, Ordnance Survey

All roads lead to? Experiments with Gephi, Linked Data and Wikipedia

March 26, 2014 John G 3 comments

Gephi is “an interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs”. Tony Hirst did a great blog post a while back showing how you could use Gephi together with DBpedia (a linked data version of Wikipedia) to map an influence network in the world of philosophy. Gephi offers a semantic web plugin which allows you to work with the web of linked data. I recommend you read Tony’s blog to get started with using that plugin with Gephi. I was interested to experiment with this plugin, and to look at what sort of geospatial visualisations could be possible.

If you want to follow all the steps in this post you will need to:

Initially I was interested to see if there were any interesting networks we might visualise between places. Seeing how Wikipedia relates one place to another was a simple case of going to the DBpedia SPARQL endpoint and trying the following query:

select distinct ?p
where
{
?s a <http://schema.org/Place> .
?o a <http://schema.org/Place> .
?s ?p ?o .
}

That is: where ‘s’ and ‘o’ are places, find me the properties ‘p’ that relate them. I noticed two properties, ‘http://dbpedia.org/ontology/routeStart’ and ‘http://dbpedia.org/ontology/routeEnd’, so I thought I would try to visualise how places round the world were linked by transport connections. To find places connected by a transport link you want to find pairs ‘start’ and ‘end’ that are the route start and route end, respectively, of some transport link. You can do this with the following query:

select ?start ?end
where
{
?start a <http://schema.org/Place> .
?end a <http://schema.org/Place> .
?link <http://dbpedia.org/ontology/routeStart> ?start .
?link <http://dbpedia.org/ontology/routeEnd> ?end .
}
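If you would rather send a query like this from a script than paste it into the endpoint's web form, the sketch below shows one way to build the HTTP request in Python. It relies only on the standard SPARQL protocol convention of passing the query text in a `query` parameter. The `build_request` helper and the `limit 10` clause are my own additions for illustration, and the request is built but not sent.

```python
from urllib.parse import urlencode, parse_qs, urlsplit
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"  # DBpedia SPARQL endpoint used in this post

# Same query as above; "limit 10" is added here only to keep a test run small.
QUERY = """
select ?start ?end
where
{
?start a <http://schema.org/Place> .
?end a <http://schema.org/Place> .
?link <http://dbpedia.org/ontology/routeStart> ?start .
?link <http://dbpedia.org/ontology/routeEnd> ?end .
}
limit 10
"""


def build_request(endpoint, query):
    """Build (but do not send) a GET request carrying the SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)

# To actually run it (needs network access):
#   import json
#   from urllib.request import urlopen
#   rows = json.load(urlopen(request))["results"]["bindings"]
```

Whether the endpoint answers quickly, or at all, depends on its load and any limits it applies.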

This gives a lot of data so I thought I would restrict the links to be only road links:

select ?start ?end
where
{
?start a <http://schema.org/Place> .
?end a <http://schema.org/Place> .
?link <http://dbpedia.org/ontology/routeStart> ?start .
?link <http://dbpedia.org/ontology/routeEnd> ?end .
?link a <http://dbpedia.org/ontology/Road> .
}

We are now ready to visualise this transport network in Gephi. Follow the steps in Tony’s blog to bring up the Semantic Web Importer. In the ‘driver’ tab make sure ‘Remote – SOAP endpoint’ is selected, and the EndPoint URL is http://dbpedia.org/sparql. In an analogous way to Tony’s blog we need to construct our graph so we can visualise it. To simply view the connections between places it would be enough to just add this query to the ‘Query’ tab:

construct {?start <http://foo.com/connectedTo> ?end}
where
{
?start a <http://schema.org/Place> .
?end a <http://schema.org/Place> .
?link <http://dbpedia.org/ontology/routeStart> ?start .
?link <http://dbpedia.org/ontology/routeEnd> ?end .
?link a <http://dbpedia.org/ontology/Road> .
}

However, as we want to visualise this in a geospatial context we need the lat and long of the start and end points so our construct query becomes a bit more complicated:

prefix gephi:<http://gephi.org/>
construct {
?start gephi:label ?labelstart .
?end gephi:label ?labelend .
?start gephi:lat ?minlat .
?start gephi:long ?minlong .
?end gephi:lat ?minlat2 .
?end gephi:long ?minlong2 .
?start <http://foo.com/connectedTo> ?end}
where
{
?start a <http://schema.org/Place> .
?end a <http://schema.org/Place> .
?link <http://dbpedia.org/ontology/routeStart> ?start .
?link <http://dbpedia.org/ontology/routeEnd> ?end .
?link a <http://dbpedia.org/ontology/Road> .
{select ?start (MIN(?lat) AS ?minlat) (MIN(?long) AS ?minlong) where {?start <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat . ?start <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .} }
{select ?end (MIN(?lat2) AS ?minlat2) (MIN(?long2) AS ?minlong2) where {?end <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat2 . ?end <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long2 .} }
?start <http://www.w3.org/2000/01/rdf-schema#label> ?labelstart .
?end <http://www.w3.org/2000/01/rdf-schema#label> ?labelend .
FILTER (lang(?labelstart) = 'en')
FILTER (lang(?labelend) = 'en')
}

Note that the query for the lat and long is a bit more complicated than it might be. This is because DBpedia data is quite messy, and many entities will have more than one lat/long pair. I used a subquery in SPARQL to pull out the minimum lat and the minimum long across all the pairs retrieved for each place. Additionally I also retrieved the English labels for each of the start/end points.
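To make the effect of those MIN subqueries concrete, here is a small Python sketch of the same aggregation on some made-up rows. The place names and coordinates are invented for illustration, and `min_lat_long` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

# Hypothetical rows: one entity may come back with several lat/long pairs.
rows = [
    ("Place_A", 51.5, -0.1),
    ("Place_A", 51.4, 0.2),
    ("Place_B", 40.0, -3.7),
]


def min_lat_long(rows):
    """Mimic the subquery: per entity, take MIN(lat) and MIN(long) independently."""
    lats, lons = defaultdict(list), defaultdict(list)
    for entity, lat, lon in rows:
        lats[entity].append(lat)
        lons[entity].append(lon)
    return {entity: (min(lats[entity]), min(lons[entity])) for entity in lats}


coords = min_lat_long(rows)
print(coords)
```

For Place_A this gives (51.4, -0.1), which is not one of the original pairs: the two minima are taken separately. That is good enough to get one point per place for a rough visualisation, but it is worth remembering if the points look slightly off.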

Now copy/paste this construct query into the ‘Query’ tab on the Semantic Web Importer:

Now hit the run button and watch the data load.

To visualise the data we need to do a bit more work. In Gephi click on the ‘Data Laboratory’ and you should now see your data table. Unfortunately all of the lats and longs have been imported as strings and we need to recast them as decimals. To do this click on the ‘More actions’ pull down menu and look for ‘Recast column’ and click it. In the ‘Recast manipulator’ window go to ‘column’ and select ‘lat(Node Table)’ from the pull down menu. Under ‘Convert to’ select ‘Double’ and click recast. Do the same for ‘long’.

When you are done click ‘ok’ and return to the ‘overview’ tab in Gephi. To see this data geospatially go to the layout panel and select ‘Geo Layout’. Change the latitude and longitude to your new recast variable names, and untick ‘center’ (my graph kept vanishing with it selected). Experiment with the scale value:

You should now see something like this in your display panel:
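For a feel of what a geo layout is doing with those recast columns, here is a minimal Python sketch that turns a lat/long pair into x/y positions. It assumes a Mercator-style projection and a simple multiplicative `scale`; the plugin's actual formula and the meaning of its scale setting may differ, and the two sample coordinates are only rough values for the north of Scotland and the south of Spain.

```python
import math


def mercator(lat_deg, lon_deg, scale=1000.0):
    """Project a lat/long pair (numbers or numeric strings) to x/y."""
    # float() plays the role of the 'recast to Double' step: strings cannot be projected.
    lat = math.radians(float(lat_deg))
    lon = math.radians(float(lon_deg))
    x = scale * lon
    y = scale * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


x0, y0 = mercator("0.0", "0.0")      # equator, prime meridian
x_n, y_n = mercator("57.5", "-4.2")  # roughly the north of Scotland
x_s, y_s = mercator("36.1", "-5.4")  # roughly the south of Spain
print((x_n, y_n), (x_s, y_s))
```

Increasing `scale` spreads the nodes further apart without changing their relative positions, which is broadly why it is worth experimenting with that value.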

Given that this is supposed to be a road network you will find some oddities. This seems to be down to ‘European routes’ such as European route E15, which links Scotland down to Spain.

Categories: linked data Tags: dbpedia, gephi, linked data, wikipedia

First Signs (For Me) of Linked Data Being Properly Linked…?!

March 25, 2014 John G Leave a comment

Tony Hirst blogs about two of my recent blogs…

OUseful.Info, the blog...

As anyone who’s followed this blog for some time will know, my relationship with Linked Data has been an off and on again one over the years. At the current time, it’s largely off – all my OpenRefine installs seem to have given up the ghost as far as reconciliation and linking services go, and I have no idea where the problem lies (whether with the plugins, the installs, with Java, with the endpoints, with the reconciliations or linkages I’m trying to establish).

My dabblings with pulling data in from Wikipedia/DBpedia to Gephi (eg as described in Visualising Related Entries in Wikipedia Using Gephi and the various associated follow-on posts) continue to be hit and miss due to the vagaries of DBpedia and the huge gaps in infobox structured data across Wikipedia itself.

With OpenRefine not doing its thing for me, I haven’t been able to use that app as…


Categories: Uncategorized

Tell Me About Hampshire – Linking Government Data using SPARQL federation 2

March 23, 2014 John G 3 comments

Yesterday I blogged about how to do some SPARQL federated queries across various government websites, and this post continues that with a different example. Here I give an example query which basically says ‘tell me stuff about Hampshire’. I do this by linking up data from Ordnance Survey, the Office for National Statistics, the Department for Communities and Local Government and Hampshire County Council. This query is really just for illustrative purposes, but I want to ask ‘for all districts in Hampshire find me the index of multiple deprivation rank, the change order and operative date for that district, and the website for the local authority of that district, along with the addresses of parcels of land where it is planned to build new dwellings’. To achieve this I need to take data from several sources and use SPARQL federation. The query below answers my question: first I query Ordnance Survey linked data to find districts in Hampshire, and I then pass these districts to three other linked data services to retrieve the relevant information. To try this example head over to the Ordnance Survey SPARQL endpoint and copy/paste the following:

select ?districtname ?imdrank ?changeorder ?opdate ?councilwebsite ?siteaddress
where
{?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>
   <http://data.ordnancesurvey.co.uk/id/7000000000017765> .
?district a <http://data.ordnancesurvey.co.uk/ontology/admingeo/District> .
?district <http://www.w3.org/2000/01/rdf-schema#label> ?districtname .
SERVICE <http://opendatacommunities.org/sparql> {
?s <http://purl.org/linked-data/sdmx/2009/dimension#refArea> ?district .
?s <http://opendatacommunities.org/def/IMD#IMD-rank> ?imdrank .
?authority <http://opendatacommunities.org/def/local-government/governs> ?district .
?authority <http://xmlns.com/foaf/0.1/page> ?councilwebsite .
}
?district <http://www.w3.org/2002/07/owl#sameAs> ?onsdist .
SERVICE <http://statistics.data.gov.uk/sparql> {
?onsdist <http://statistics.data.gov.uk/def/boundary-change/originatingChangeOrder>
          ?changeorder .
?onsdist <http://statistics.data.gov.uk/def/boundary-change/operativedate>
          ?opdate .
}
SERVICE <http://linkeddata.hants.gov.uk/sparql> {
   ?landsupsite <http://data.ordnancesurvey.co.uk/ontology/admingeo/district> ?district .
   ?landsupsite a <http://linkeddata.hants.gov.uk/def/land-supply/LandSupplySite> .
   ?landsupsite
<http://www.ordnancesurvey.co.uk/ontology/BuildingsAndPlaces/v1.1/BuildingsAndPlaces.owl#hasAddress>
   ?siteaddress .
   }
}
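If you run a query like this from a script and ask for results in the standard SPARQL JSON format, each row comes back as a set of bindings, and variables with no value for a row are simply left out. The Python sketch below flattens such a response into plain rows. The sample response is made up (only two of the query's variables, with invented values), and `to_rows` is a hypothetical helper.

```python
import json

# Made-up response in the standard SPARQL JSON results layout.
SAMPLE = json.loads("""
{
  "head": {"vars": ["districtname", "imdrank"]},
  "results": {"bindings": [
    {"districtname": {"type": "literal", "value": "District A"},
     "imdrank": {"type": "literal", "value": "123"}},
    {"districtname": {"type": "literal", "value": "District B"}}
  ]}
}
""")


def to_rows(response):
    """Flatten SPARQL JSON bindings into dicts, using None for unbound variables."""
    variables = response["head"]["vars"]
    return [
        {v: (binding[v]["value"] if v in binding else None) for v in variables}
        for binding in response["results"]["bindings"]
    ]


rows = to_rows(SAMPLE)
for row in rows:
    print(row)
```

Note that values arrive as strings, so a rank like "123" still needs converting before you can sort or compare it numerically.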

Happy SPARQLing…

Categories: linked data, Semantic Web Tags: linked data, open data

Federating SPARQL Queries Across Government Linked Data

March 22, 2014 John G 2 comments

SPARQL 1.1 introduces the idea of federated SPARQL queries – this enables you to execute part of your SPARQL query against a remote SPARQL endpoint. I thought I’d provide some examples of using this feature in government linked open data.

The Environment Agency has published a number of its open data offerings as linked data which you can explore here. One of these datasets is the Bathing Water Quality Data, and you can explore this via their SPARQL endpoint. I won’t go into this data in too much detail as it is not my area of expertise. The Environment Agency has created 5-star open data by linking their data to both Ordnance Survey and Office for National Statistics linked data. Look at linked data for the Eastoke bathing water site and you’ll see it linked to Havant and Hampshire in the Ordnance Survey data. A relatively straightforward SPARQL query will get you a list of bathing waters, their name and the district they are in:

select ?x ?name ?district
where {
?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://statistics.data.gov.uk/def/administrative-geography/district> ?district .}

Now suppose we just want a list of bathing water areas in South East England – how would we do that? This is where SPARQL federation comes in. The information about which European Regions districts are in is held in the Ordnance Survey linked data. If you hop over to the Ordnance Survey SPARQL endpoint explorer you can run the following query to find all districts in South East England along with their names (please see a previous blog post for information about simple spatial queries):

select ?district ?districtname
where
{?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>+
<http://data.ordnancesurvey.co.uk/id/7000000000041421> .
?district <http://www.w3.org/2000/01/rdf-schema#label> ?districtname .}

Using the SERVICE keyword we can bring these two queries together to find all bathing waters in South East England, and the districts they are in:

select ?x ?name ?districtname
where {
?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://statistics.data.gov.uk/def/administrative-geography/district> ?district .
SERVICE <http://data.ordnancesurvey.co.uk/datasets/boundary-line/apis/sparql>
{ ?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>+
<http://data.ordnancesurvey.co.uk/id/7000000000041421> .
?district <http://www.w3.org/2000/01/rdf-schema#label> ?districtname .}
}
order by ?districtname
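Conceptually, the SERVICE block behaves like a join: the bindings found locally are combined with the bindings returned by the remote endpoint wherever the shared variable (here ?district) has the same value. The Python sketch below imitates that on made-up data. The names and identifiers are invented, `join` is a hypothetical helper, and a real SPARQL engine may evaluate the query quite differently under the hood.

```python
def join(left, right):
    """Join two lists of variable bindings on the variables they share."""
    results = []
    for l in left:
        for r in right:
            shared = set(l) & set(r)
            if all(l[v] == r[v] for v in shared):
                results.append({**l, **r})
    return results


# Made-up bindings standing in for the answers from the two endpoints.
bathing_waters = [
    {"name": "Water 1", "district": "district:A"},
    {"name": "Water 2", "district": "district:B"},
]
south_east_districts = [
    {"district": "district:A", "districtname": "District A"},
]

# Equivalent of the join followed by "order by ?districtname".
rows = sorted(join(bathing_waters, south_east_districts),
              key=lambda b: b["districtname"])
print(rows)
```

Only "Water 1" survives, because its district is the only one the second source knows about. In the same way, the federated query keeps only bathing waters whose district appears in the Ordnance Survey results for South East England.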

Now suppose we want to know the sediment types of the bathing waters in Havant. We can find this with the following query:

select ?x ?name ?sediment
where {
?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://statistics.data.gov.uk/def/administrative-geography/district> <http://data.ordnancesurvey.co.uk/id/7000000000017297> .
?x <http://environment.data.gov.uk/def/bathing-water/sedimentTypesPresent> ?sediment .
}

We can again use SPARQL federation to do something more interesting. The following query returns both the sediment types in bathing waters in Havant and the sediment types of bathing waters in districts that touch Havant:

select ?x ?name ?sediment
where {
{
?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://statistics.data.gov.uk/def/administrative-geography/district> <http://data.ordnancesurvey.co.uk/id/7000000000017297> .
?x <http://environment.data.gov.uk/def/bathing-water/sedimentTypesPresent> ?sediment .
}
UNION
{
SERVICE <http://data.ordnancesurvey.co.uk/datasets/boundary-line/apis/sparql>
{ ?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches>
<http://data.ordnancesurvey.co.uk/id/7000000000017297> .
}
?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://statistics.data.gov.uk/def/administrative-geography/district> ?district .
?x <http://environment.data.gov.uk/def/bathing-water/sedimentTypesPresent> ?sediment .
}
}

Another great government open data resource is the Open Data Communities site. They have a SPARQL endpoint here. This federated SPARQL query (analogous to those above) can be used, for example, to find the Index of Multiple Deprivation Environment rank for Havant and surrounding districts. This works as follows:

select ?s ?imdrank
where
{
{
?s <http://purl.org/linked-data/sdmx/2009/dimension#refArea> <http://data.ordnancesurvey.co.uk/id/7000000000017297> .
?s <http://opendatacommunities.org/def/IMD#IMD-environment-rank> ?imdrank .
}
UNION
{
SERVICE <http://data.ordnancesurvey.co.uk/datasets/boundary-line/apis/sparql>
{ ?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches>
<http://data.ordnancesurvey.co.uk/id/7000000000017297> .
}
?s <http://purl.org/linked-data/sdmx/2009/dimension#refArea> ?district .
?s <http://opendatacommunities.org/def/IMD#IMD-environment-rank> ?imdrank .
}
}

I will now leave it as an exercise to the reader to figure out how these all combine so you can ask for ‘all bathing waters in Havant and surrounding areas, and the IMD environment ranks of the areas containing those bathing waters’ – it is possible!
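If you want a hint, the Python sketch below assembles one possible shape for such a query out of the pieces used above: a UNION that binds ?district either to Havant itself or to a district touching it, followed by a second SERVICE block for the IMD environment rank. This is an untested guess at a solution, not a verified answer. The `service` helper is hypothetical, the use of BIND is my own choice, and I am assuming the query would be run against the Environment Agency endpoint.

```python
HAVANT = "<http://data.ordnancesurvey.co.uk/id/7000000000017297>"
OS_ENDPOINT = "<http://data.ordnancesurvey.co.uk/datasets/boundary-line/apis/sparql>"
ODC_ENDPOINT = "<http://opendatacommunities.org/sparql>"


def service(endpoint, patterns):
    """Wrap a list of triple patterns in a SERVICE block for the given endpoint."""
    body = "\n".join("  " + p for p in patterns)
    return "SERVICE " + endpoint + " {\n" + body + "\n}"


# Districts that touch Havant, asked of the Ordnance Survey endpoint.
touching = service(OS_ENDPOINT, [
    "?district <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> "
    + HAVANT + " .",
])

# IMD environment rank for each district, asked of Open Data Communities.
ranks = service(ODC_ENDPOINT, [
    "?s <http://purl.org/linked-data/sdmx/2009/dimension#refArea> ?district .",
    "?s <http://opendatacommunities.org/def/IMD#IMD-environment-rank> ?imdrank .",
])

query = "\n".join([
    "select ?x ?name ?district ?imdrank",
    "where {",
    "{ BIND(" + HAVANT + " AS ?district) }",  # Havant itself (assumed approach)
    "UNION",
    "{ " + touching + " }",                   # or any district touching Havant
    "?x a <http://environment.data.gov.uk/def/bathing-water/BathingWater> .",
    "?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .",
    "?x <http://statistics.data.gov.uk/def/administrative-geography/district> ?district .",
    ranks,
    "}",
])
print(query)
```

Printing the string gives you something to paste into the endpoint and adapt. Whether every endpoint involved accepts it as written is something you would need to check.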

Please note that federated SPARQL can be slow…happy SPARQLing.

Categories: linked data, Semantic Web
