The beginnings of Open Data

The value of open data in the commercial sector

The Open Data movement has been around for decades, advocating that scientific and public datasets be made freely available so that the latest technologies can be applied to gain insight, further enabling social and economic development.

As early as the 1950s, large numbers of scientists were calling for scientific datasets to be made accessible to all. They saw the potential not only for verifying scientific claims but also for making new discoveries by bringing different data sources together.

The earth sciences led the way, since in order to make scientific claims on a global scale, they needed access to the likes of meteorological and seismological data from all over the world. Because of this need, the World Data Center system was established in 1958.

By the time the Human Genome Project kicked off in 1990, open data was a core component. During the 13 years that the project ran, the emerging internet made the process of publishing and accessing open data much faster, easier and cheaper. The (almost) complete human DNA sequence is now available to anyone, promising benefits to virologists, oncologists, pharmacologists, forensic scientists and more.

In 2004, all the OECD countries signed up to make their publicly funded scientific research data freely accessible.

Opening up government data

Ahead of the curve, Prime Minister Gordon Brown introduced the Open Government Licence in the UK in 2010, along with a website that serves as a repository of public sector datasets. This made huge amounts of data that had previously been accessible only within government available to anyone with a web browser. The idea was that access to this data would help businesses, improve public services and empower citizens to make data-driven decisions.

“It’s such an untapped resource,” said World Wide Web inventor Sir Tim Berners-Lee, who oversaw the project. “Government data is something we have already spent the money on… and when it is sitting there on a disk in somebody’s office it is wasted.”

The UK currently ranks in second place on the World Wide Web Foundation’s Open Data Barometer, which lists governments based on the openness and accessibility of their data.

Initially there was just a handful of services built on the datasets – one for reporting road hazards using ONBS location data, for example, and another for finding planning applications on local authority websites. These days though, countless apps and services, many of them provided by the commercial sector, make use of over 45,000 government datasets.

Transport for all

The Citymapper app, for example, uses Transport for London (TfL) data to help users navigate their way around the capital by showing them the fastest routes, the cost of using public transport, the number of calories burned by walking or cycling, and so on.

Having trained their algorithm on London transport data, the developer was then able to apply it to datasets from other municipalities. The app is now live in 39 cities around the world and, testifying to the power of open data for kickstarting economic activity, has much bigger plans. This is from Forbes’ coverage of The Telegraph Smart Cities Conference in September 2018:

“Omid Ashtari explained how Citymapper is going from an app company that looks at data to running transport in its war against the single occupancy car. Citymapper has become incredibly adept at cleaning data, amalgamating it from multiple sources and combining it with data generated from its own users’ movements and has monetised this with ‘Smartride’ – a cross between a bus and a taxi. You don’t book the vehicle but a seat within it, and it doesn’t come to your door but meets you at a street corner close to where you are but without diverting too much from the journey the other occupants are making.”

One of the clearest examples of how open data can help generate economic growth is the Global Positioning System (GPS), developed by the US Department of Defense in the 1970s and originally intended only for military use. As it was opened up to civilian use over the following decades, almost every industry came to rely on GPS data in one way or another, to the point where it is estimated that discontinuing the service would destroy $96bn of value.

Saving money, saving lives

In the developing world, open data can have an even more transformative, and sometimes life-saving, role. The response to the 2014 Ebola outbreak in West Africa relied heavily on open datasets for mapping the threat and coordinating resources. In Ghana, a company called Esoko is using open government data, in combination with other sources, to help small-scale farmers level the playing field in negotiations with buyers and get a better price for their produce.

With the explosion of Internet of Things (IoT) sensors, the amount of data available to governments, both local and national, will increase exponentially. While this creates the opportunity for data-driven solutions to a range of urban and infrastructure planning challenges, it also raises privacy questions. The Open Data Institute is piloting two ‘data trust’ projects as a potential way to increase access to data while safeguarding the privacy of the individuals who create it.

Should commercial data be open?

The advantages to the public sector of opening up data are clear. The value created by commercial or civil society organisations using and augmenting the datasets is returned to government in the form of higher economic growth, higher tax revenues and lowered administrative burdens.

So could the commercial sector also benefit from a similar virtuous circle, leading from data to insight to value and back to data? Some innovative organisations are already applying open data licences to selected datasets to reap the benefits of engaging the hive mind.

Nike created the Materials Sustainability Index (MSI), a database that allows the company to compare the sustainability of production materials from a huge range of suppliers, and made it available through a publicly accessible API. It has since been picked up by the Sustainable Apparel Coalition and generates value for the entire industry, helping Nike keep up with increasing demand for environmentally sound products.

Since 2009, The Guardian newspaper has published raw data, including all Guardian content, via a public API. This allows app developers to serve content in return for carrying the newspaper’s advertising, which provides an additional source of revenue.

Media and information company Thomson Reuters’ solution to an internal problem – connecting datasets from around the organisation – has now been published under an open licence. This allows customers to benefit from a permanent, machine-readable identifier that provides a unique reference for a wide variety of entity types including organisations, instruments, funds, issuers and people. It also helps embed them in the Thomson Reuters ecosystem of products.

If you’re wondering how much value an eye for openness can really generate, consider the story of Amazon. The astronomical growth of the company from minor bookseller to world-leading cloud computing company was fuelled in no small part by a mandate issued to staff by Jeff Bezos around 2002. According to a former Amazon engineer, it went something like this:

• All teams will henceforth expose their data and functionality through service interfaces

• Teams must communicate with each other through these interfaces

• There will be no other form of inter-process communication allowed: no direct linking, no direct reads of another team’s data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network

• It doesn’t matter what technology they use

• All service interfaces, without exception, must be designed from the ground up to be externalisable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.

And it ended with this line:

“Anyone who doesn’t do this will be fired. Thank you; have a nice day!”

While most businesses were still transferring data between teams using spreadsheets and email, Bezos was building a decoupled and dynamic network of services, gaining the skills and knowledge required to build the Amazon Web Services platform. That business brought in $6.11bn in revenue for Amazon in Q2 2018.
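To make the mandate's core rule more concrete, here is a minimal Python sketch of two teams that communicate only through a service interface. It is an illustration under stated assumptions, not a description of how Amazon built its systems: the names InventoryService, OrderingTeam, handle and can_fulfil are hypothetical, and a plain function call carrying JSON stands in for a call over the network.

```python
import json


class InventoryService:
    """Hypothetical team-owned service. Its data store is private to the team."""

    def __init__(self):
        # Private data store: other teams must never read this directly.
        self._stock = {"book-123": 4}

    def handle(self, request: str) -> str:
        """The only entry point: a JSON request in, a JSON response out."""
        payload = json.loads(request)
        if payload.get("action") == "get_stock":
            sku = payload.get("sku")
            return json.dumps({"sku": sku, "quantity": self._stock.get(sku, 0)})
        return json.dumps({"error": "unknown action"})


class OrderingTeam:
    """A second hypothetical team that depends only on the interface."""

    def __init__(self, inventory_call):
        # inventory_call stands in for a network call to the inventory service.
        self._inventory_call = inventory_call

    def can_fulfil(self, sku: str, wanted: int) -> bool:
        request = json.dumps({"action": "get_stock", "sku": sku})
        response = json.loads(self._inventory_call(request))
        return response.get("quantity", 0) >= wanted


inventory = InventoryService()
ordering = OrderingTeam(inventory.handle)
print(ordering.can_fulfil("book-123", 2))  # prints True
```

Because the ordering team only ever sees JSON requests and responses, the inventory team could change how it stores its data, or offer the same interface to outside developers, without the ordering code changing.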

Look at the data your teams are generating and using. The chances are that there’s additional value lying hidden in that data, just waiting to be unlocked.