Monday, 1 November 2010

15 minutes of brain

All it takes is 15 minutes.
  • Go to data.gov.uk
  • Look at the most recently uploaded dataset
  • Happens to be local police recorded crime data: a breakdown force by force and offence by offence, from 1 March 2002 until 1 March 2010 or thereabouts.
  • Download the .csv
  • Open the .csv
  • Filter the data by Offence Group (in this case Drug Offences)
  • No need to filter the data by Offence Sub-Group in this case, but you can
  • Copy and paste the offence data out into another Excel tab
  • Filter the data by year (in this case 12 months to 1 March, 2003)
  • Copy and paste that year's data out into another Excel tab (still only showing drug offences)
  • Delete the "12 months to" column
  • Delete Region Name column
  • Delete Offence Group column
  • Delete Offence Sub-Group column
  • Highlight the two remaining columns and all data
  • Create a graph
  • Make the graph look pretty
  • Upload it to ScribD
  • Embed it in your blog
15 minutes and I know Devon and Cornwall have a disproportionately high drug offence rate considering their population, in comparison to their surrounding neighbours of Dorset and Avon & Somerset (yes I know Avon doesn't exist as a county any more, they're slow to catch up down there).
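
If you'd rather script the spreadsheet steps above than click through them, here is a minimal sketch in Python. It only mirrors the filter, tidy and chart steps; it is not what I actually did, which was all in Excel. The column names "Force Name" and "Offence Count", the date format and the sample rows are assumptions made up for illustration, so check them against the real header row of the .csv.

```python
import csv
import io
from collections import defaultdict

# Column names are assumptions based on the steps above; check the real header row.
PERIOD_COL = "12 months to"
GROUP_COL = "Offence Group"
FORCE_COL = "Force Name"      # assumed name for the police force column
COUNT_COL = "Offence Count"   # assumed name for the count column


def offences_by_force(rows, offence_group, period):
    """Sum offence counts per force for one offence group and one 12-month period."""
    totals = defaultdict(int)
    for row in rows:
        if row[GROUP_COL] == offence_group and row[PERIOD_COL] == period:
            totals[row[FORCE_COL]] += int(row[COUNT_COL])
    return dict(totals)


def text_chart(totals, width=40):
    """Render the totals as a crude horizontal bar chart, largest first."""
    if not totals:
        return ""
    biggest = max(totals.values())
    lines = []
    for force, count in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * round(width * count / biggest) if biggest else ""
        lines.append(f"{force:<20} {bar} {count}")
    return "\n".join(lines)


# Made-up rows purely to show the shape of the data; NOT real crime figures.
SAMPLE = """\
12 months to,Region Name,Force Name,Offence Group,Offence Sub-Group,Offence Count
01/03/2003,Region A,Force X,Drug Offences,Sub 1,10
01/03/2003,Region A,Force X,Drug Offences,Sub 2,5
01/03/2003,Region A,Force Y,Drug Offences,Sub 1,30
01/03/2003,Region A,Force Y,Burglary,Sub 1,99
01/03/2004,Region A,Force X,Drug Offences,Sub 1,7
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
totals = offences_by_force(rows, "Drug Offences", "01/03/2003")
print(text_chart(totals))
```

To run it on the real download, swap the `io.StringIO(SAMPLE)` for an opened file, e.g. `open("your-download.csv", newline="")`, and use whatever period value the file actually contains.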

Crime Stats From Datadotgovdotuk




You might need to download the spreadsheet or view it over at ScribD - I'm on a 10" netpad and it's not so great at displaying things properly.

This is why data.gov.uk is one of the best websites on the internet. This is why open data is a wonderful thing. This is what can be done when you open up your data. When I've got more time, I'll cross-reference this info with population info for each of the areas the forces cover. Then I might look at deprivation indices inside those areas. Or possibly, you know, someone else might, someone who can make the data look prettier because, after all, this was a 15-minute job. This is not me showing off. This is merely a 'this is what is possible in 15 minutes'.
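
The population cross-reference boils down to offences per head. A rough sketch of that sum, assuming you have a population figure for each force area from some other dataset; the force names and numbers below are placeholders, not real figures:

```python
def rate_per_1000(offences, population):
    """Offences per 1,000 residents for one force area."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000 * offences / population


# Placeholder numbers only, NOT real crime or population figures.
offences = {"Force X": 150, "Force Y": 300}
population = {"Force X": 50_000, "Force Y": 200_000}

rates = {force: rate_per_1000(offences[force], population[force]) for force in offences}
for force, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{force}: {rate:.1f} offences per 1,000 people")
```

With these placeholder numbers the force with fewer offences comes out with the higher rate, which is why the raw counts and the per-head figures need looking at separately.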

My father always told me curiosity killed the cat. No it doesn't. It opens up whole new worlds of information and it kills no one to stick it in a .csv and give it to people like me to play with.

3 comments:

  1. Can I ask how you can assess "Devon and Cornwall have a disproportionately high drug offence rate considering their population" if the population normalisation bit hasn't been done yet? If the crime data set you're using already shows adjustment for population, I withdraw, but as far as I can see what you did was to plot a bar chart of offences by police force. Which though clearly of some mild statistical interest isn't the transformative leap to citizen empowerment and understanding that many clearly would like to imagine that it is.

  2. Chris> I'll show you how on Sat. Honest, anyone can do it, you just got to be a bit familiar with filtering and graph making, both of which are cool tools to know.

    Anon> Cos I grew up down that way and therefore know quite a lot about the population spread of those counties South and West of Bristol. However, if you would like to argue with me that Devon and Cornwall don't appear to have an unexplainable spike in their stats, feel free. I like a good argument.

    I was trying to show how simple it was, not explain the total worth of open data. If you identify yourself, I am more than happy to send you a statistical analysis of Devon & Cornwall's population crossed with drug offences per head.

    I suspect it's also something to do with its coastline and its rural nature meaning things can be grown/brewed/created in peace and quiet.

    Just to make it _absolutely_ clear, I am not trying to say this is world-changing stuff. I am saying this is how easy it is: release data, and people with more time than me and who are better than me will come and do cool things with it.

    But, you know, ta for missing the point ;O) And, really, I'd respect your opinion a lot more if you weren't anon.
