Smarter Cambridge Transport

Let’s have a Big Data Dig

Since this article was published, the Greater Cambridge Partnership has released some of the ANPR data on the Cambridgeshire Insight portal. It is in a partially processed rather than raw format, which makes further analysis challenging.

The Greater Cambridge Partnership (GCP) has published some preliminary findings from a detailed traffic survey it conducted in June. Nearly 100 Automatic Number Plate Recognition (ANPR) cameras logged traffic on all Cambridge approach roads and other locations within the M11-A14-A11 triangle.

The conclusions published by GCP so far relate to the types of vehicles being driven: 55.5% of vehicles are diesel; 60% would face the London Toxicity Charge (£10/day – on top of the £11.50 congestion charge); 12% are vans and lorries. This is worth knowing, but could be determined reasonably accurately by someone noting down registration numbers one morning, looking them up on the DVLA website, and cross-referencing registration months with the dates on which each emissions standard came into force (see the table below).

The data holds much more interesting secrets than this. Although it won’t tell us where people are ultimately driving from and to (origins and destinations), which is what we really need to know for modelling the attractiveness of public transport options, it could answer other useful questions, like:

  • How accurate is the data currently used to model transport projects?
  • Where are the congestion hot spots at different times of day?
  • Which routes are underused? (Could some traffic be diverted to less congested routes with changes to signage?)
  • Where are traffic signals causing unnecessary congestion?

GCP indicate that they will publish the raw data in 2018. Time is of the essence when expensive transport projects are in the latter stages of planning. So why is it taking so long to release?

The data is essentially a table of 5 million rows, each comprising a time stamp, a location ID (tied to an index of GPS coordinates of the 100 camera locations), a vehicle ID (something more anonymous than the number plate), possibly the direction of travel, and whatever vehicle data has been obtained from the DVLA.
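To make that concrete, here is a minimal Python sketch of how one such row might be represented and read from a CSV file. The column names (timestamp, camera_id, vehicle_id, direction, fuel) and the sample rows are invented for illustration and are not taken from any GCP release, so treat every name here as an assumption.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime
from typing import Iterator, Optional, TextIO


@dataclass(frozen=True)
class Sighting:
    """One camera observation. All field names are hypothetical."""
    timestamp: datetime
    camera_id: str             # key into a separate index of camera GPS coordinates
    vehicle_id: str            # anonymised token, not the number plate
    direction: Optional[str]   # may be missing from the data
    fuel: Optional[str]        # example of a vehicle attribute looked up from the plate


def read_sightings(handle: TextIO) -> Iterator[Sighting]:
    """Yield sightings one at a time, so the whole file never has to sit in memory."""
    for row in csv.DictReader(handle):
        yield Sighting(
            timestamp=datetime.fromisoformat(row["timestamp"]),
            camera_id=row["camera_id"],
            vehicle_id=row["vehicle_id"],
            direction=row.get("direction") or None,  # empty cell becomes None
            fuel=row.get("fuel") or None,
        )


# Made-up sample rows, purely to exercise the reader.
SAMPLE = """timestamp,camera_id,vehicle_id,direction,fuel
2017-06-12T08:01:15,C017,a3f9,in,diesel
2017-06-12T08:09:40,C042,a3f9,in,diesel
2017-06-12T08:10:02,C017,77b2,,petrol
"""

sightings = list(read_sightings(io.StringIO(SAMPLE)))
```

Reading row by row like this is a design choice rather than a necessity: 5 million rows of this shape would fit in the memory of an ordinary laptop, but streaming keeps the same code usable on much larger files.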

Five million rows may sound daunting, and it is more than Microsoft Excel can hold in a single worksheet (the limit is 1,048,576 rows), but data scientists have tools to handle much larger data sets with ease. Unleash some volunteers on this, and we could learn something useful – and maybe surprising – sooner than GCP alone will manage.
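As a rough illustration of how little machinery this needs, the Python sketch below uses only the standard library to answer two of the questions above: how busy each camera is at each hour, and how long vehicles take to travel between a pair of cameras. The row layout (timestamp, camera ID, vehicle ID), the camera names and the sample figures are all made up; a real analysis would also need to exclude vehicles that stopped somewhere between the two cameras.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import median

# Made-up rows: (timestamp, camera_id, vehicle_id). Not real survey data.
ROWS = [
    ("2017-06-12T08:01:00", "C017", "a3f9"),
    ("2017-06-12T08:09:00", "C042", "a3f9"),
    ("2017-06-12T08:03:00", "C017", "77b2"),
    ("2017-06-12T08:15:00", "C042", "77b2"),
    ("2017-06-12T17:30:00", "C017", "c410"),
    ("2017-06-12T17:36:00", "C042", "c410"),
]


def hourly_counts(rows):
    """Count sightings per (camera, hour of day)."""
    counts = Counter()
    for ts, camera, _vehicle in rows:
        counts[(camera, datetime.fromisoformat(ts).hour)] += 1
    return counts


def median_journey_minutes(rows, start, end):
    """Median minutes from camera `start` to camera `end`, keyed by departure hour.

    For each vehicle, the most recent sighting at `start` is paired with
    that vehicle's next sighting at `end`.
    """
    by_vehicle = defaultdict(list)
    for ts, camera, vehicle in rows:
        by_vehicle[vehicle].append((datetime.fromisoformat(ts), camera))

    durations = defaultdict(list)
    for events in by_vehicle.values():
        events.sort()  # chronological order for this vehicle
        departed = None
        for when, camera in events:
            if camera == start:
                departed = when
            elif camera == end and departed is not None:
                minutes = (when - departed).total_seconds() / 60
                durations[departed.hour].append(minutes)
                departed = None
    return {hour: median(values) for hour, values in durations.items()}


counts = hourly_counts(ROWS)
journeys = median_journey_minutes(ROWS, "C017", "C042")
```

Comparing the median journey time for the same camera pair at different hours is one simple way to spot where and when congestion builds; in the sample rows the morning trips take longer than the evening one.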

Emissions standard    Applied to most new registrations from
Euro 1                31 December 1992
Euro 2                1 January 1997
Euro 3                1 January 2001
Euro 4                1 January 2006
Euro 5                1 January 2011
Euro 6                1 September 2015
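
The cross-referencing described earlier amounts to a simple date lookup. The Python sketch below uses the dates from the table above; the function name and the "pre-Euro" label are invented for illustration. The result is only an estimate, since the dates apply to most new registrations rather than all, and some vehicles met a standard before it became mandatory.

```python
from bisect import bisect_right
from datetime import date

# Dates copied from the table above: when each standard
# applied to most new registrations.
EURO_STANDARDS = [
    (date(1992, 12, 31), "Euro 1"),
    (date(1997, 1, 1), "Euro 2"),
    (date(2001, 1, 1), "Euro 3"),
    (date(2006, 1, 1), "Euro 4"),
    (date(2011, 1, 1), "Euro 5"),
    (date(2015, 9, 1), "Euro 6"),
]
_START_DATES = [start for start, _name in EURO_STANDARDS]


def likely_euro_standard(first_registered: date) -> str:
    """Return the latest standard in force on the registration date."""
    # bisect_right gives the number of start dates on or before the given date.
    idx = bisect_right(_START_DATES, first_registered)
    return EURO_STANDARDS[idx - 1][1] if idx else "pre-Euro"


examples = {
    "1990-05-01": likely_euro_standard(date(1990, 5, 1)),
    "2006-01-01": likely_euro_standard(date(2006, 1, 1)),
    "2010-06-01": likely_euro_standard(date(2010, 6, 1)),
    "2016-03-01": likely_euro_standard(date(2016, 3, 1)),
}
```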

This article was first published in the Cambridge Independent on 8 November 2017.

Edward Leigh

Edward Leigh is the leader of Smarter Cambridge Transport, chair of the South Petersfield Residents Association, independent member of the Cambridgeshire Police and Crime Panel, business owner, consultant, and occasional blogger about making the world and Cambridge a better place to live.
