Sunday, 15 September 2013

Local Population Tool

Some surveys undertaken by my clients require knowing the resident population within certain distances of a survey site.

There are a number of ways of collating this data, but I decided to combine the 2011 Census data for England and Wales with the postcode address location data published by the OS (Ordnance Survey), and to make the result easy to use by offering it through my new website.

Here is the local population tool

It allows the user to input either a residential postcode or an OSX,OSY coordinate, and provides a Google chart showing the radial distance on the X axis and the residential population on the Y axis.

The way it works is by using a matrix with a population figure for every 100x100m grid square of the area covered.  A circular shape is overlaid at the required location on this grid, as a kind of mask, and the totals for the grid squares in each radius are summed to arrive at the population totals.  A slight adjustment reduces the downsampling error.
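
The masking step can be sketched roughly as follows. This is a minimal illustration rather than the code behind the tool: the function name `population_by_radius`, the uniform test grid, and the choice to measure distance between cell centres are all assumptions, and the slight adjustment for downsampling error mentioned above is not shown.

```python
import numpy as np

CELL_SIZE_M = 100  # grid resolution described in the post


def population_by_radius(grid, row, col, radii_m, cell_size_m=CELL_SIZE_M):
    """Sum the cells whose centres lie within each radius of cell (row, col).

    Hypothetical helper: a boolean circular mask is built per radius and
    used to select the grid squares that contribute to the total.
    """
    rows, cols = np.indices(grid.shape)
    # Distance in metres from the chosen cell centre to every other cell centre.
    dist_m = np.hypot(rows - row, cols - col) * cell_size_m
    return [float(grid[dist_m <= r].sum()) for r in radii_m]


# Made-up data: one resident in every 100x100m square.
grid = np.ones((21, 21))
totals = population_by_radius(grid, 10, 10, [0, 100, 200])
print(totals)
```

Each total is cumulative, so plotting `totals` against the radii gives the kind of curve the chart shows.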

This kind of work is easy using Python and NumPy, and providing access through a website is no problem using Django.

Because of the static (2011) source and the resolution of the input data, it's not going to be totally accurate, and my selection of grid resolution introduces further approximation.  The quality of the data, however, should be comparable to other similar data sources out there, and more convenient for basic purposes.

In its raw form, the population data was aligned to individual postcodes, which themselves were aligned to specific OSX/OSY centre points, rather than area polygons.  Had time allowed, I would have created polygons, and distributed the populations within those, but instead I applied a smoothing algorithm to the data using SciPy.  My intuition was that this would slightly improve the data quality.
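
As a rough sketch of that preparation step, the example below bins postcode centre points into 100x100m squares and then smooths the result. The post does not say which smoothing algorithm was used, so the Gaussian filter and its `sigma` value are assumptions, as are the helper name `grid_from_postcodes`, the grid origin, and the sample postcodes. It also assumes every point falls inside the grid.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

CELL_SIZE_M = 100  # grid resolution described in the post


def grid_from_postcodes(eastings, northings, populations, origin, shape,
                        cell_size_m=CELL_SIZE_M):
    """Accumulate each postcode's population into the cell holding its centre point.

    Hypothetical helper: rows increase with northing, columns with easting,
    and all points are assumed to lie inside the grid.
    """
    cols = ((np.asarray(eastings) - origin[0]) // cell_size_m).astype(int)
    rows = ((np.asarray(northings) - origin[1]) // cell_size_m).astype(int)
    grid = np.zeros(shape, dtype=float)
    # np.add.at accumulates correctly when several postcodes share a cell.
    np.add.at(grid, (rows, cols), populations)
    return grid


# Made-up postcodes as (OSX, OSY, population); the first two share a cell.
eastings = [1050, 1050, 1260]
northings = [1050, 1080, 1010]
populations = [30, 20, 10]

grid = grid_from_postcodes(eastings, northings, populations,
                           origin=(0, 0), shape=(21, 21))

# Assumed smoothing: spread each point mass over neighbouring squares.
smoothed = gaussian_filter(grid, sigma=1.0)
```

With the filter's default boundary handling the overall population total is preserved; only its distribution across neighbouring squares changes.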

This is what the data looks like when plotted as a heat-map with the Python matplotlib library.

I could have used the matplotlib library for creating the charts in the tool too, but decided that Google Charts had the advantage when it came to interactivity and simplicity of implementation.

If there ever seems to be a requirement for it, I will add the other data from the census set: the male/female breakdown and number of households.

It would probably also be informative to combine this data with the data used for the BBC's Road Crash Deaths visualisation, plotted last year, which looks similar.  I may do so at some point if time allows.
