Friday, 27 September 2013

Origin Destination Data Expansion Tool

If an origin destination survey only collects a sample of the trips made across a cordon area, this data can be combined with total flow counts at each site to produce a complete dataset. This can quite accurately represent the true situation on the ground, and can reduce the costs of carrying out such a survey.

Using ANPR technology, detection rates in excess of 95% can often be achieved (albeit at quite a high cost), and this leaves little for the tool to do, but if you have collected Bluetooth, WiFi or RFID signatures, or adverse weather affects an ANPR survey, the sample rate can be significantly lower. In these cases it is especially useful to be able to easily carry out 'bi-proportional matrix balancing', sometimes known as the 'Furness method'.

I have developed a free online tool to carry out this process.

It is also available in Buchanan Computing's MicroMatch, but this does cost money, and may not always be applicable to your data since it is just intended for registration data.

The process can similarly be applied to data collected through road-side interviews or questionnaires where only a small proportion of travellers provide information about their trips.

The software employs a process similar to that described here, and my implementation uses Python and NumPy.
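As a rough illustration of the Furness method, and not the code behind the tool, a minimal NumPy sketch might look like the following. The function name `furness`, the example matrix and the target totals are all made up for this example; the only idea taken from the description above is that the seed matrix is alternately scaled by rows and by columns until its totals approach the counted flows.

```python
import numpy as np


def furness(seed, row_targets, col_targets, iterations=100):
    """Bi-proportionally scale a seed OD matrix towards target totals.

    Hypothetical helper for illustration: rows are origins, columns are
    destinations, and the targets are the counted flows in and out.
    """
    matrix = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)

    for _ in range(iterations):
        # Scale each origin row so that it sums to its target.
        row_sums = matrix.sum(axis=1)
        row_factors = np.divide(row_targets, row_sums,
                                out=np.zeros_like(row_targets),
                                where=row_sums > 0)
        matrix *= row_factors[:, np.newaxis]

        # Scale each destination column so that it sums to its target.
        col_sums = matrix.sum(axis=0)
        col_factors = np.divide(col_targets, col_sums,
                                out=np.zeros_like(col_targets),
                                where=col_sums > 0)
        matrix *= col_factors[np.newaxis, :]

    return matrix


# Invented sample of matched trips between three sites (no u-turns observed).
seed = np.array([[0, 40, 10],
                 [30, 0, 20],
                 [15, 25, 0]])
row_targets = np.array([100, 90, 60])   # counted flow in at each site
col_targets = np.array([80, 110, 60])   # counted flow out at each site

expanded = furness(seed, row_targets, col_targets)
print(expanded.round(1))
```

Cells that are zero in the seed stay zero, because they are only ever multiplied by a factor. That is why a movement never observed in the sample cannot appear in the expanded data.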

Another use for the tool is to 'factor up' existing OD data from a previous survey, to meet new total flows which have been collected more recently. However, there are arguably better approaches to this type of problem, such as using a gravity model.

For full information about how to use it, please read on, as there is limited information on the webpage.

Two CSV (comma separated values) files are required - see the following examples:

The first is for the total flows into and out of the cordon at each site for each time interval.  The software supports multiple classes, each in its own column with a labelled header. 5-minute, 15-minute or hourly intervals should work fine, but there must be an entry for every time interval, even if all counts are 0. These values are the targets that the seed data is to be factored up to, and will often be ATC data or manually classified vehicle link counts by direction.

The second is for the seed data to be expanded. This could be ANPR, Bluetooth or RFID match data for the cordon area which constitutes a sample of the vehicles using the cordon.

The provided seed data from this file may optionally be fixed so it cannot be expanded. This allows the software to serve a different purpose in 'filling in' OD match pairs where that data is unavailable. This can be the case when large junctions are enumerated from video footage by following vehicles manually through a junction. Occasionally one or more turning movements cannot be seen sufficiently clearly, and this method may provide an acceptable alternative to an expensive resurvey. When using the tool like this, the values for the missing turning movements are seeded with low values, which then hopefully find equilibrium. You may want to run the data through twice, once to fill a missing OD match pair, and then again to adjust for any inaccuracy in the recorded turning counts.
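One way to picture the 'fixed seed' option is sketched below. This is an assumption about how such an option could work, not a description of the tool's internals: fixed cells are set aside, their totals are subtracted from the targets, and only the remaining cells are balanced. The function name `furness_with_fixed` and all the numbers are invented for the example.

```python
import numpy as np


def furness_with_fixed(seed, fixed_mask, row_targets, col_targets,
                       iterations=50):
    """Balance only the cells that are not marked as fixed.

    Hypothetical helper: fixed_mask is True where a seed value must be
    kept exactly as supplied.
    """
    seed = np.asarray(seed, dtype=float)
    fixed_mask = np.asarray(fixed_mask, dtype=bool)

    fixed = np.where(fixed_mask, seed, 0.0)   # values that must not change
    free = np.where(fixed_mask, 0.0, seed)    # values that may be scaled

    # What is left of each target once the fixed cells are accounted for.
    row_rem = np.asarray(row_targets, dtype=float) - fixed.sum(axis=1)
    col_rem = np.asarray(col_targets, dtype=float) - fixed.sum(axis=0)
    if (row_rem < 0).any() or (col_rem < 0).any():
        raise ValueError("fixed cells already exceed a target total")

    for _ in range(iterations):
        row_sums = free.sum(axis=1)
        free *= np.divide(row_rem, row_sums, out=np.zeros_like(row_sums),
                          where=row_sums > 0)[:, np.newaxis]
        col_sums = free.sum(axis=0)
        free *= np.divide(col_rem, col_sums, out=np.zeros_like(col_sums),
                          where=col_sums > 0)[np.newaxis, :]

    return fixed + free


# Invented three-arm junction: the movement from arm 0 to arm 2 could not
# be seen on the footage, so it is seeded with a low value of 1.
turning_counts = np.array([[0, 50, 1],
                           [40, 0, 30],
                           [20, 60, 0]])
fixed_mask = np.ones((3, 3), dtype=bool)
fixed_mask[0, 2] = False                  # only this cell may change

arm_in_totals = np.array([85, 70, 80])    # link counts entering the junction
arm_out_totals = np.array([60, 110, 65])  # link counts leaving the junction

filled = furness_with_fixed(turning_counts, fixed_mask,
                            arm_in_totals, arm_out_totals)
print(filled)
```

In this example the link counts leave 35 vehicles unaccounted for on the missing movement, so the low seed value is scaled up to 35 while every observed movement is left untouched.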

In some cases it may also help to untick the option to expand return trips. This can be useful if you have 95% of the data, and the tool is adding unwanted extra pairs which seem out of place (u-turns at a junction for example). This may happen if there were some problems with the raw match data.

The number of iterations you require depends on how well your dataset is converging, and what characteristics you require from the data. Experimenting with this value is probably a good idea.

Once you click 'Expand Data' the data should be processed within a minute, and you will be presented with several files containing different views of the new dataset.

Processed Data - this is the expanded trip data in the format you supplied
Time Series Data - this is convenient for graphing the distribution of trips over time
Target Match - this shows how closely the seed values converged upon the target values
Processed Data Summary - this aggregates the Processed Data over the whole survey period

I would welcome any comments or suggestions regarding this tool. If you have data that you cannot convert to the required format, please let me know, and if time allows, I may add support for it.

If your requirements are more complicated than this, I would also be happy to discuss them with you.

Sunday, 15 September 2013

Local Population Tool

Some surveys undertaken by my clients require the resident population within certain distances of a survey site.

There are a number of ways of collating this data, but I decided that by combining the 2011 Census data for England and Wales with the postcode address location data published by the OS, I could make this very easy by offering it through my new website.

Here is the local population tool

It allows the user to input either a residential postcode or an OSX,OSY coordinate, and provides a Google chart showing the radial distance on the X axis and the residential population on the Y axis.

The way it works is by using a matrix with a population figure for every 100x100m grid square of the area covered.  A circular shape is overlaid at the required location on this grid, as a kind of mask, and the totals for the grid squares in each radius are summed to arrive at the population totals.  A slight adjustment reduces the downsampling error.
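The masking idea can be sketched in a few lines of NumPy. This is only an illustration under simple assumptions: the function name `population_within_radii` and the uniform test grid are invented, distances are measured between cell centres, and the slight adjustment for downsampling error mentioned above is not reproduced.

```python
import numpy as np

CELL_SIZE_M = 100  # each grid square is 100m x 100m, as described above


def population_within_radii(grid, centre_row, centre_col, radii_m):
    """Sum the population of all cells whose centre lies within each radius.

    Hypothetical helper: grid holds one population figure per cell, and
    the centre is given as a (row, column) cell index.
    """
    grid = np.asarray(grid, dtype=float)
    rows, cols = np.indices(grid.shape)

    # Distance in metres from every cell centre to the chosen cell centre.
    dist_m = np.hypot(rows - centre_row, cols - centre_col) * CELL_SIZE_M

    # A boolean circular mask per radius picks out the cells to be summed.
    return [float(grid[dist_m <= radius].sum()) for radius in radii_m]


# Invented test grid: 10 residents in every cell of a 2.1km x 2.1km area.
grid = np.full((21, 21), 10.0)
totals = population_within_radii(grid, 10, 10, radii_m=[0, 100, 200])
print(totals)  # [10.0, 50.0, 130.0]
```

The three radii cover 1, 5 and 13 cells respectively, which shows how coarse the circle is at small distances and why some correction is worthwhile there.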

This kind of work is easy using Python and NumPy, and providing access through a website is no problem using Django.

Because of the static (2011) source and the resolution of the input data, it's not going to be totally accurate, and my selection of grid resolution introduces further approximation.  The quality of the data however, should be comparable to other similar data sources out there, and more convenient for basic purposes.

In its raw form, the population data was aligned to individual postcodes, which themselves were aligned to specific OSX/OSY centre points, rather than area polygons.  Had time allowed, I would have created polygons and distributed the populations within those, but instead I applied a smoothing algorithm to the data using SciPy.  My intuitive sense was that this would slightly improve the data quality.
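The post does not say which smoothing algorithm was used, so the sketch below simply assumes a Gaussian blur via `scipy.ndimage.gaussian_filter` as one plausible choice. The function name `rasterise_and_smooth`, the `sigma_cells` value and the sample postcode points are all invented, and every point is assumed to fall inside the grid.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def rasterise_and_smooth(eastings, northings, populations, shape,
                         origin_e=0, origin_n=0, cell_size_m=100,
                         sigma_cells=1.0):
    """Drop postcode point populations onto a grid, then blur the grid.

    Hypothetical helper: sigma_cells is the blur width in grid cells.
    """
    grid = np.zeros(shape, dtype=float)

    # Convert OSX/OSY style coordinates into column/row cell indices.
    cols = ((np.asarray(eastings) - origin_e) // cell_size_m).astype(int)
    rows = ((np.asarray(northings) - origin_n) // cell_size_m).astype(int)

    # np.add.at accumulates correctly when several postcodes share a cell.
    np.add.at(grid, (rows, cols), np.asarray(populations, dtype=float))

    # Spread each point's population over neighbouring cells.
    return grid, gaussian_filter(grid, sigma=sigma_cells)


# Three invented postcode centre points; the first two share a grid cell.
raw, smoothed = rasterise_and_smooth(
    eastings=[500, 520, 850],
    northings=[500, 530, 250],
    populations=[40, 60, 25],
    shape=(10, 10),
)
print(raw[5, 5], round(smoothed[5, 5], 1), round(smoothed.sum(), 1))
```

With the filter's default boundary handling the overall population total is preserved, so the smoothing only redistributes residents between nearby cells rather than creating or losing any.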

This is what the data looks like when plotted as a heat-map with the Python matplotlib library.

I could have used the matplotlib library for creating the charts in the tool too, but decided that Google Charts had the advantage when it came to interactivity, and simplicity of implementation.

If there ever seems to be a requirement for it, I will add the other data from the census set: the male/female breakdown and number of households.

It would probably also be informative to combine this data with the data used for the BBC's Road Crash Deaths visualisation that was plotted last year, which looks similar.  I may do at some point if time allows.

Monday, 5 August 2013

Portable Traffic Surveillance from £160 per Unit

Recently client demand and personal interest led me to design/develop a cheap, easily installable and portable unit for recording video, which can be mounted directly on lighting columns and other suitable street furniture.

I had looked around for off-the-shelf devices that could serve this purpose, but they were all deficient in one area or another, be it battery life, recording format or storage capacity.

My solution is based on the Android smartphone platform, but the choice of device would depend upon budget and the precise application requirements.

Here are some general system specifications:
  • fully weatherproof
  • portable unit weighs 900g
  • continuous recording for up to 24 hours - or timed recording
  • periodic unit status and image updates over the mobile phone network
  • web-based interface for viewing and control of unit
  • uses the h264 video codec
  • USB data interface
The application determines which smartphone is most suitable for each survey type.  Suitable phones start at around £100.

Star N9770 (£100) and HTC One V (£170)

The cheap phone I initially used is based on the dual-core ARM Cortex-A9.  The camera module is not of high quality, and though it is fine during daylight hours, in low light the frame rate is reduced to 8fps. The HTC phone costs a little more, but offers significantly better low-light video quality and faster shutter speeds.  Outside the hours of daylight, the HTC phone provides acceptable quality footage, whereas the Star N9770 does not.

Software Development

My preferred programming language is Python, and where possible I would always choose this over Java, but for the smartphone software I decided that using Java (Android's standard development language) would give me easy access to the core libraries required for this type of application.

It was relatively easy to implement this functionality, so the verbosity of Java didn't cause me too much frustration, and there were only a couple of challenges that required some deeper consideration.
  1. Since the application needs to run for an extended period, I needed to allow the recording process to continue whilst the screen was switched off.
  2. Because I wanted regular updates to be sent to my web server for viewing/control, I needed to establish how to create snapshot images and upload them over the mobile phone network without impeding the video recording too severely.
The web server software which allows the periodic status and snapshot image uploads to be easily viewed from a web-based interface was implemented using Django and Python, and was also a fairly trivial exercise.

Hardware Development

In order to protect the device from the elements, I decided to use a polycarbonate IP66 rated (weatherproof) enclosure, and mount the phone and other parts inside in such a way that they can be fairly easily replaced.  This is especially important when the enclosures themselves constitute a significant part of the unit cost, and when smartphone devices and their cameras are improving all the time.

To ensure there was sufficient power for a typical survey, I added a portable USB battery pack that provided an extra 12,000mAh without adding too much to the weight or volume of the unit.

My approach was to drill a small hole in the back of the enclosure, where the camera sensor would sit, and cover it with a small piece of glass, which was glued in place with silicone sealant to achieve a weatherproof seal.  The transparent lid allows a better look at the internal layout, but the production units are fully opaque, as this looks better and won't get so hot on sunny days!

A few options for mounting the enclosure were considered, but I settled on a mechanism that allows the enclosure to be clamped on a lighting column using two light telescopic poles.  This allows the unit to be mounted at up to 4.5m which means views are largely free from vehicular obstructions.

This spring clamp allows quick installation (around 30 seconds) on lighting columns up to about 90mm in diameter.

I subsequently also added a basic shield in the form of a small section of plastic pipe to the enclosure to protect the sensor area from rain and sun glare.

Video Processing

Once the video footage has been recorded, it can be easily extracted from the phone within the enclosure, either over USB or wirelessly, and processed to form video that can be used either within the advanced enumeration software which I have also developed, or in some cases within machine vision software that I am working on.


These units are to be used for various survey purposes ranging from vehicle turning counts and queue length surveys to cycle and pedestrian surveys.  The ease of installation and low cost means they can also be deployed to reduce the impact of any other hardware or equipment failure, or even to monitor other equipment for vandalism.