The Open Data Index


The Open Knowledge Foundation has launched the Open Data Index, which shows what data countries provide to their citizens.

An increasing number of governments have committed to open up data, but how much key information is actually being released? Is the available data legally and technically usable so that citizens, civil society and businesses can realise the full benefits of the information? Which countries are the most advanced and which are lagging in relation to open data? The Open Data Index has been developed to help answer such questions by collecting and presenting information on the state of open data around the world – to ignite discussions between citizens and governments.

The index relies on contributions from community editors. It assesses the availability of datasets such as transportation timetables, election results, and legislation, and condenses the results into a single-number score. The score is based on the criteria shown in Table 1 below. The higher the score, the more data a government makes available to the public. Of the 70 participating countries, the UK leads the way, followed by the United States and Denmark. Germany ranks 37th with 410 points, barely above Mexico and Greece.

Table 1: Assessment criteria

Question | Details | Weighting
Does the data exist? | Does the data exist at all? The data can be in any form (paper or digital, offline or online, etc.). If it does not exist, none of the other questions are answered. | 5
Is data in digital form? | This question addresses whether the data is in digital form (stored on computers or digital storage) or only in, for example, paper form. | 5
Publicly available? | This question addresses whether the data is "public". This does not require it to be freely available, but does require that someone outside of the government can access it in some form. The data is public if, for example, it is available for purchase, it exists as PDFs on a website that you can access, or you can get it in paper form. If a freedom of information request or similar is needed to access the data, it is not considered public. | 5
Is the data available for free? | This question addresses whether the data is available for free or if there is a charge. If there is a charge, it is stated in the comments section. | 15
Is the data available online? | This question addresses whether the data is available online from an official source. If the answer is "yes", the link is put in the URL field below. | 5
Is the data machine readable? | Data is machine readable if it is in a format that can be easily processed by a computer. Data can be digital but not machine readable. For example, consider a PDF document containing tables of data. These are definitely digital but are not machine readable, because a computer would struggle to access the tabular information (even though they are very human readable!). The equivalent tables in a format such as a spreadsheet would be machine readable. Note: the appropriate machine-readable format may vary by type of data, so machine-readable formats for geographic data may be different than for tabular data. In general, HTML and PDF are not machine readable. | 15
Available in bulk? | Data is available in bulk if the whole dataset can be downloaded or accessed easily. Conversely, it is considered non-bulk if citizens are limited to getting parts of the dataset (for example, if restricted to querying a web form and retrieving a few results at a time from a very large database). | 10
Openly licensed? | This question addresses whether the dataset is open as per the Open Definition. It needs to state the terms of use or licence that allow anyone to freely use, reuse or redistribute the data (subject at most to attribution or share-alike requirements). It is vital that a licence is available (if there is no licence, the data is not openly licensed). Open licences which meet the requirements of the Open Definition are listed at | 30
Is the data provided on a timely and up-to-date basis? | This question addresses whether the data is up to date and timely, or long delayed. For example, is election data made available immediately or soon after the election, or is it only available many years later? Any comments around uncertainty are put in the comments field. | 10
URL of data online? | The link to the specific dataset, if possible. Otherwise, the link to the home page for the data or, failing that, to the main page of the site on which the data is located. Only links to official sites are eligible, not third-party sites. When submitters need to provide third-party links, these are put in the comments section. | -
Date the data became available? | This question describes when the data first became openly available (online, in digital form, openly licensed, etc.). Sometimes this is approximate, for example "2012" or "Jan 2012". If there is a precise date, it is entered in yyyy-mm-dd format. If the data is not open, this question instead describes the date the data first became available at all. (Note: some open data will have been available in other forms previously, so the date specified here is the date it became openly available.) | -
Format of data? | This question describes the form that the data is available in. For example, for tabular data it might be Excel, CSV, HTML or even PDF. For geodata it might be shapefiles, GeoJSON or something else. If the data is available in multiple formats, the format descriptors are listed separated by commas. Any further information is put in the comments section. | -
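
To make the weightings in Table 1 more concrete, here is a minimal Python sketch of how a score for a single dataset might be derived from them. It rests on assumptions the Index itself does not spell out here: that a dataset's score is simply the sum of the weightings of all questions answered with "yes", and that a dataset which does not exist scores zero because the remaining questions are not answered. The key names (exists, digital, and so on) and the function dataset_score are hypothetical labels chosen for this illustration, not part of the Open Data Index.

```python
# Weightings copied from Table 1. The key names are hypothetical
# shorthand for the nine weighted questions.
WEIGHTS = {
    "exists": 5,
    "digital": 5,
    "public": 5,
    "free": 15,
    "online": 5,
    "machine_readable": 15,
    "bulk": 10,
    "open_licence": 30,
    "up_to_date": 10,
}


def dataset_score(answers):
    """Return the summed weightings of all questions answered with True.

    Assumption: the score is a plain additive sum. If the data does not
    exist, the other questions are not answered, so the score is 0.
    """
    if not answers.get("exists", False):
        return 0
    return sum(
        weight
        for question, weight in WEIGHTS.items()
        if answers.get(question, False)
    )


# Made-up example: a dataset that is free and online, but only
# published as PDF and without an open licence.
example = {
    "exists": True,
    "digital": True,
    "public": True,
    "free": True,
    "online": True,
    "machine_readable": False,
    "bulk": False,
    "open_licence": False,
    "up_to_date": True,
}

print(dataset_score(example))  # 5 + 5 + 5 + 15 + 5 + 10 = 45
```

The weightings add up to 100, so under this assumption a fully open dataset would reach 100 points. The example also shows why the open licence matters so much: at 30 points, it is the single largest item in the table.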


