
Panton Fellowships

Jonathan Gray - July 7, 2011 in Uncategorized

For a while I’ve been thinking about how the Open Knowledge Foundation can do more to support open data in science. In particular how we can do more to encourage research funders and publicly funded research bodies to adopt open data policies and mandates.

With this in mind, I floated the idea of setting up Panton Fellowships for Open Data in Science to Cameron Neylon (STFC), Peter Murray-Rust (University of Cambridge), Tim Hubbard (Sanger Institute) and Karien Bezuidenhout (Shuttleworth Foundation) at OKCon 2011 in Berlin last week. Since then I’ve spoken to Melissa Hagemann at OSI, Peter has mentioned this to colleagues at JISC, and Tim has mentioned this to colleagues at the Wellcome Trust.

Here’s a brief sketch of what shape they might take:


The Panton Principles for Open Data in Science strongly encourage scientists and others to place scientific research data into the public domain using an appropriate license or legal tool.

There is currently a window of opportunity to encourage more research funders, publishers, institutions, and societies to adopt these principles in relation to scientific research data that they fund or publish. Policies and norms in this area are still being determined – and it is much easier to encourage best practices from the outset than to change them once they are fixed.

Leading advocates in this area often have ideas and contacts, but lack time to do as much as they’d like. Many graduates or early career stage researchers are interested in this area and may have more time, but may lack contacts, guidance and a sense of strategic priority.


The Fellowships would focus on:

  • communicating and encouraging more stakeholders to adopt the Panton Principles
  • understanding and overcoming obstacles to opening up scientific data in different fields
  • identifying opportunities for opening up scientific data in new fields and engaging new stakeholders

They would be targeted at graduate level and early career stage scientific researchers, co-funded by a loose coalition of funding bodies, and located at leading partner institutions.

There would be 2-4 Panton Fellowships per year. Panton Fellowships would last between 9 and 12 months. Panton Fellows would receive a stipend equivalent to the level of financial support received by PhD or post-doctoral students.

How can we support open data in Canada?

Jonathan Gray - July 7, 2011 in Uncategorized

Yesterday Kat, Rufus and I caught up with Jonathan Brun, David Eaves, Tracey Lauriault and James McKinney about how the Open Knowledge Foundation might be able to help:

  • To promote open knowledge in Canada,
  • To strengthen/expand the open knowledge community in Canada – in particular the increasingly active open data community
  • To support the development of specific OKF open data projects and initiatives in Canada

We discussed the possibility of establishing a Canadian chapter of the OKF to help with some of these things.

We concluded:

  • There is lots of amazing stuff happening in Canada already, and lots of this occurs without need for a more formal organisation – which could duplicate existing work or even have a negative effect on existing efforts. Hence if it ain’t broke, don’t fix it!
  • There is some work to be done in helping to connect different initiatives. In the medium term it could be useful to have someone dedicated to bringing different people, organisations and projects onto the same page. Someone suggested having a basic site to help link to existing stuff (Kat said she’d help with this). We also agreed to put together a role description for someone who could help to connect the dots between existing projects/people/groups.
  • David Eaves and others will help to promote and encourage the development of the project in Canada.
  • We will not set up an okfn-ca list for now – as there are already lists which are used to discuss open data stuff. For example, the civicaccess-discuss list, which OKF people should join, contribute to, and point people towards!
  • We asked about who we should invite from Canadian projects/organisations to Open Government Data Camp 2011 in Warsaw. We can catch up further then!

While we mainly discussed open data (open government data in particular) we’d also like to help build an open knowledge commons that includes cultural works (e.g. public domain works), scientific research, and so on.

We agreed to keep an eye on developments and catch up again in a few weeks.

Open data in itself is not enough

Jonathan Gray - July 6, 2011 in Uncategorized

Just posted a comment on an interesting article by Michael Gurstein, following on from his talk at OKCon 2011:

For what it’s worth I (and many others at the OKF) fully agree that the legal/technical openness of information is not in itself sufficient for value to be derived from this information.

There are all kinds of other factors and ingredients involved here. For example, access and ability to use ICTs, basic data literacy (which often even the most technically literate computer users may not possess), prerequisite contextual knowledge to interpret official documents and datasets (e.g. when the UK government released fine grained spending information, several journalists published articles saying it was ‘secrecy via transparency’ as the data was so hard to make sense of), etc.

The process of deriving value from information is not straightforward – and I don’t think there are any easy answers. I also don’t think that responsibility for catalysing/supporting the process of deriving value from information lies solely with the ‘open data movement’. It probably lies with society (e.g. media, NGOs, you and me) and with the state (e.g. via the education system, state funded data literacy initiatives, etc).

But I do think that making sure we all have realistic expectations about what open data does and doesn’t do, and who is in a position to benefit from it, and what we have to do to enable more people to benefit from it, is probably a Good Thing. Hence my inviting Michael to OKCon 2011 to kick off discussions – which he seems to have succeeded in doing – both offline at the conference, and online here!

For several years I’ve wanted to write an article called something like “Open Data is not a Panacea”. I discussed this more recently with Rufus Pollock. Perhaps now we have a good excuse to do that!

Worklog 2011-06-27 to 2011-07-02

Jonathan Gray - July 4, 2011 in Uncategorized

Some things I did this week:

Open Knowledge Foundation + EU ‘Open Cities’ project?

Jonathan Gray - June 27, 2011 in Uncategorized

Earlier I caught up for a quick call with Esteve Almirall of the EU funded Open Cities project.

We discussed:

  • Collaborating regarding work on data catalogues. In particular organising a face to face meeting with teams from CKAN + Fraunhofer + Universitat Pompeu Fabra some time later this year. Will try to catch up at OKCon.
  • Collaborating on (another) pan-European Open Data Challenge in Feb-June 2012 (I suggested this is more than just a competition! E.g. with stipends or other longer term ways to support the applications and encourage cross-fertilisation between cities/member states…)
  • Collaborating on Open Government Data Camp 2011 in October in Warsaw
  • Open data licensing policies – including a working group on licensing/policy at Open Cities project
  • Having the OKF as an affiliated partner in the Open Cities project
  • Getting Open Cities a slot at OKCon

Flyer for Open Government Data Camp 2011

Jonathan Gray - June 27, 2011 in Uncategorized

Just created a flyer for Open Government Data Camp 2011 to distribute at OKCon 2011.

You can find it on the OKF’s Flickr account (JPG) or on Scribd (PDF).

Postcards for the Public Domain Review

Jonathan Gray - June 24, 2011 in Uncategorized

Some postcards for the Public Domain Review put together by Adam, Kat and me:

Worklog 2011-06-20 to 2011-06-24

Jonathan Gray - June 24, 2011 in Uncategorized

Some things I did this week:

FAQ for Spending Stories

Jonathan Gray - June 22, 2011 in Uncategorized

Below is the full project FAQ that we submitted to the Knight News Challenge earlier this year!

1. What is Spending Stories?

Spending Stories is where the news meets the facts. We want to become the place where news stories are given the verification they need in a data-driven world, while extending our spending data store with the narrative contextualisation necessary to go beyond list views and colored bubbles.

We will do this by (i) building a powerful tool to connect news stories to spending data and vice versa, and (ii) building a vibrant community of spending experts around the world – with an initial focus on Germany, the UK and a few other countries.

2. What do you hope to achieve?

Ultimately our aim is to dramatically improve spending data literacy amongst journalists and the general public. We want to do this by creating a compelling set of tools for understanding, contextualising and interrogating government spending data – and a vibrant community of users around these tools.

We want to speed up the fact-finding and fact-checking process to allow journalists to release news articles more quickly, and to raise journalistic standards by discouraging the poor or misleading use of data, such as bogus comparisons. We aim to make Spending Stories into an invaluable tool that people keep returning to, and one of the most useful, interesting and detailed sources of information about public spending on the web.

3. Why spending? What about other areas like health, crime or environment?

It’s an old saying that to get to the bottom of a story, one should “follow the money”. The same is true of government: spending is where policies and priorities are broken down into figures. Spending has a direct influence on all political areas: while other data on health or social issues helps us understand what challenges society faces, spending allows us to see how government reacts to all of these.

Spending data is also a perfect basis for large scale, community driven matching between datasets and news stories. There is lots of spending data available online, but often this is buried away on government websites, or in non-machine readable forms. It is also of widespread interest, as taxpayers are often interested in how their money is disbursed.

Building open-source tools and stronger communities to connect stories with spending data is also a strong basis for doing this with other types of datasets – which is something we are definitely interested in doing in the medium term.

4. Who is the project aimed at? Is it for everyone or just for data geeks?

Spending Stories is not just for data geeks! The project is aimed at anyone and everyone who has looked at a news story about public spending and thought ‘But how much is that really?’ or ‘I wonder how that compares to X?’ or ‘I wonder how that actually breaks down?’.

This includes everyone from citizens trying to understand how they will be affected by funding cuts, to think tanks or advocacy groups trying to put together evidence-based policy proposals, to bloggers writing about who gets what from public funding sources. It will provide journalists with a powerful and intuitive resource to help them do more accurate, insightful and interesting reportage that is driven by and grounded in spending data sources.

5. Why aren’t news stories about public spending enough?

Every day there are fresh headlines about the public purse. We are bombarded with stories about cutbacks, bailouts, deficits, and subsidies. Spending figures permeate reportage about everything from hospital bed numbers to green energy programmes, foreign aid to local transport schemes. Without context it is difficult for ordinary readers to understand what the big numbers mean – and to know whether or not to trust those numbers.

For example we learn that X million was spent on Y, or that cutbacks or spending increases of Z million are being planned. Without knowing how this compares to the past or to other areas, it is difficult to evaluate whether the numbers under consideration are a lot or a little.

Spending Stories will enable users to see the numbers behind the headlines, to explore the underlying datasets using a variety of intuitive tools, to see related stories and to contribute their own insights or views.

6. Why aren’t government spending documents and datasets enough?

Public bodies are releasing an unprecedented amount of raw data on public spending. Recent commitments to financial transparency in countries such as the US and the UK enable the public to explore where their tax money is spent in unprecedented detail. However, at the point of publication, datasets may be fragmentary, disconnected and lacking in context.

We have done extensive work to extract, clean up, connect and expose spending data as part of our work on and As a result of these projects the public now have well connected, machine-readable datasets, which all are free to explore and reuse.

Nevertheless, users of these sites and other transparency websites may get lost in the data without familiar narratives to make the budget lines meaningful. Spending Stories will help to bring context, familiarity and meaning to public spending sources.

7. How will it work?

Spending Stories will have four main parts:

  • Story Aggregator – which will gather news stories, blog posts and other spending related content using a mixture of automated tools and user input.
  • Matching Tool – which will enable users to match current and historical news stories to datasets in a variety of different ways.
  • Behind the News – an expert blog giving analysis and context to news articles about public spending via brief posts and micro-short videos (much like Hans Rosling’s short videos as part of the Gapminder project).
  • Spending Stories Browser Plugin – which will ambiently suggest spending datasets relevant to pages people are browsing.

Users will be provided with extensive means to perform custom searches, giving them the means to find the spending records that are relevant or interesting to them. Visualization tools will be provided to compose custom presentations of the geographic, temporal and topical distribution of funding in various forms – from simple bar charts to complex, multi-level displays of aggregated spending.

Annotation and classification functions will enable both personal and community-based recording of additional knowledge, deriving narratives from data in an incremental way. Annotations might include information about related policies and actors. We will help to coordinate the work of journalists, volunteers and others going through data – especially with freshly scraped or aggregated data, or new requests or releases.
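To make the Matching Tool idea a little more concrete, here is a minimal sketch of one way a story could be matched to datasets by keyword overlap. This is only an illustration under stated assumptions: the function names, the dataset names and descriptions, and the choice of Jaccard overlap as a score are all hypothetical, and nothing here is taken from the actual Spending Stories code.

```python
import re


def keywords(text):
    """Lowercase word set, ignoring short words.

    A crude stand-in for real keyword extraction (assumption, not the project's method).
    """
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def match_story(story_text, datasets):
    """Rank datasets by Jaccard overlap between story and description keywords."""
    story_kw = keywords(story_text)
    scored = []
    for name, description in datasets.items():
        data_kw = keywords(description)
        union = story_kw | data_kw
        score = len(story_kw & data_kw) / len(union) if union else 0.0
        if score > 0:
            scored.append((name, round(score, 2)))
    # Best match first
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical dataset catalogue and headline, invented for illustration only
datasets = {
    "hospital-budgets": "Annual hospital spending and budget figures by region",
    "street-lighting": "Council contracts for street lighting maintenance",
}
story = "Local hospital faces budget cuts as regional spending falls"

print(match_story(story, datasets))
```

A real matcher would presumably need stemming, entity recognition and user confirmation of suggested matches; the point of the sketch is only that a story and a dataset can be linked through the terms they share.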

8. Where will the stories come from?

Stories will be aggregated from a variety of sources – including media organisations, blogs, press releases and other online documents. The Open Knowledge Foundation is already in contact with media organisations such as the BBC, the Financial Times, the Guardian, the International Herald Tribune, La Stampa, Le Monde, the New York Times, the Telegraph, and the Zeit Online. We also aim to build on numerous other projects which provide aggregation and bulk analysis of blog and news material – such as SYNC3, which matches blog posts to related news stories. This will be combined with links harvested automatically on the basis of key words and phrases, and stories and RSS feeds suggested by users. Additionally our spending research team will proactively scour the web for interesting anecdotes, stories, comments and sources.

9. Where will the data come from?

The Open Knowledge Foundation has excellent access to local, regional, national and international spending datasets from around the world. It helps to run over 30 official and unofficial government data catalogues – including,, – which contain numerous key spending datasets. It has aggregated tens of thousands of datasets from around the world in projects like and projects.

The Foundation has worked hard to expose spending data in the UK, where it was in contact with key people in the government about opening up all spending data over £25k, and the COINs dataset, which is one of the most detailed sources of spending information. We will harness existing projects like ScraperWiki to derive structured data from unstructured sources, and build on existing work to map spending datasets like the Open Budget Index. We will also allow users to directly upload datasets.

10. How will the project relate to and engage with existing organisations and initiatives?

In addition to working alongside media organisations and spending data providers (see 8. and 9.) we will work closely with the Foundation’s growing network of stakeholders interested in official data, which includes journalists, civic developers, FOI advocates, academics, public sector accountants, independent research organisations (like the UK’s Institute for Fiscal Studies), journalist networks (such as the European Journalism Centre) and many others from around the world. We are very keen to work with others who are interested in this area – and generally the Foundation is a very collaborative organisation with a vibrant community around it.

In particular Spending Stories will build on other Open Knowledge Foundation projects such as, which stores the data for and

Tools on the site would be able to be embedded in hyperlocal news websites, blogs and in discussions taking place on social networking services such as Facebook or Twitter.

11. What will the geographical focus be?

Initially Spending Stories will focus on Germany, the United Kingdom and possibly one more country, with a view to gradually expanding. The software will be open source and freely available, and we will work with a range of international partners to start versions of the project in different geographical areas. With OpenSpending we are working with organisations, developers, journalists and ordinary citizens interested in spending data in over 20 countries including Argentina, Albania, Austria, Azerbaijan, Canada, Croatia, France, Georgia, Germany, Greece, Hungary, Israel, Italy, Kosovo, Latvia, Mexico, Netherlands, Norway, Romania, Slovakia, Spain, the United Kingdom, the United States and Uruguay. We also have experience working with data on international development funding flows and EU spending.

12. How will Spending Stories be useful for journalists?

We hope that Spending Stories will help journalists:

  • find material that may trigger them to write new stories
  • research or fact-check stories they are writing
  • provide supporting material for stories they are writing
  • see what others have written or said about areas they are interested in
  • get a better long term perspective of spending trends
  • have a bigger, more comparative picture of spending in different regions
  • do more investigative research that would otherwise be a lot more resource intensive or time consuming
  • systematically track transactions between certain entities
  • receive alerts related to spending areas they are interested in – e.g. related to new stories, comments or datasets

13. How will Spending Stories help me to understand spending in my area?

Users of the site will be able to be notified of news stories and spending datasets pertaining to their geographic region or relevant to their interests. For example, after entering their postcode, a user might be notified of news stories about spending cuts to their local hospital, and would be able to directly browse and explore the data behind these stories from their local authority. Someone browsing the site could navigate to their geographical region and see top stories related to, e.g. a proposed new cultural heritage facility or changes in taxes or educational fees.

Furthermore, users would have the opportunity to comment on budget lines, engage with other users who are interested in stories and datasets in their region, and ask questions about what the numbers mean to a variety of local and national experts. They would be able to fact-check local news stories, verify that the numbers reported are portrayed accurately, and examine local spending figures with meaningful context (such as top-down spending comparisons with other regions or previous financial years, seeing how similar line items compare in different geographical regions, and so on).

14. What kinds of questions will Spending Stories help me answer?

Spending Stories will help users answer questions like:

  • How big is that reported case of spending waste compared to the overall budget of my council?
  • What other contracts did the company that got caught with bribery get over the past months?
  • How much has investment in UK schools grown over the past 5 years?
  • Which transactions or spending datasets were most widely reported on in this year?
  • Did my county get less investment from the latest stimulus than others?
  • What kinds of spending topics were most popular in 2010?
  • How much do streetlights cost in Rome vs Paris?
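Several of these questions come down to simple arithmetic once the right figures are to hand. As a rough sketch (the function names and all figures below are invented for illustration, not drawn from any real dataset), putting a reported amount in the context of a total budget, or working out growth between two years, might look like this:

```python
def share_of_budget(amount, total_budget):
    """Return `amount` as a percentage of `total_budget`."""
    if total_budget <= 0:
        raise ValueError("total_budget must be positive")
    return 100.0 * amount / total_budget


def percentage_growth(earlier, later):
    """Return the percentage change from `earlier` to `later`."""
    if earlier <= 0:
        raise ValueError("earlier must be positive")
    return 100.0 * (later - earlier) / earlier


# Hypothetical figures: 2 million of reported waste in a 400 million council budget
print(share_of_budget(2_000_000, 400_000_000))  # 0.5 (percent)

# Hypothetical figures: schools investment rising from 50 to 60 billion over 5 years
print(percentage_growth(50, 60))  # 20.0 (percent)
```

The hard part in practice is finding comparable, trustworthy figures to feed in, which is what the data and the community around it are for.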

15. Will Spending Stories explain to people how government finance works?

Government finance is an extremely complex matter and even a powerful tool such as Spending Stories cannot provide a simplistic explanation of all the processes involved in the collection and disbursement of public funds. However, more readily available and user-friendly data, combined with the narrative provided by related news stories and the expert blog, will allow users to deepen their understanding of how public spending decisions are taken, and to use this knowledge constructively.

16. How will you encourage others to build on Spending Stories?

In addition to directly providing web applications and services on the basis of public spending datasets and user-contributed data, we would like to encourage others to reuse and innovate with the material we expose. In particular we recognise that news organisations may wish to produce their own visualisations or web applications based on the material from Spending Stories or the data in OpenSpending. We would like to maximise reuse of our material – e.g. in mobile applications, or integration with other existing services.

Hence all our datasets will be openly licensed and all our code will be open source. We will also actively support those wishing to reuse our code and data with advice. We have a number of public mailing lists to ensure that advice, experience and expertise is shared as widely as possible.

Shall we set up

Jonathan Gray - June 21, 2011 in Uncategorized

The heroic Richard Pope just popped me an email about a new mini-project he’s been working on to “keep meetings short and force you to take notes”, inspired by Google’s internal conferencing habits and the Open Knowledge Foundation’s collaborative meeting minuting.

From the blurb:

How it works:

  • Asks you for a title, purpose and length of the meeting upfront
  • Starts a countdown timer
  • Displays a split screen half countdown / half etherpad for taking notes
  • When the timer hits zero the screen flashes (in the future it might also force you to stop typing by threatening to delete your notes)
  • Notes and meeting details are saved for future use
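The steps above could be modelled roughly as follows. This is my own sketch of the idea, not Richard’s code: the class name, fields and methods are assumptions, and it leaves out the split screen and the etherpad entirely.

```python
from dataclasses import dataclass


@dataclass
class Meeting:
    """A timed meeting with notes (hypothetical model, not the project's actual code)."""
    title: str
    purpose: str
    length_minutes: int
    notes: str = ""

    def remaining_seconds(self, elapsed_seconds):
        """Seconds left on the countdown, never below zero."""
        return max(0, self.length_minutes * 60 - elapsed_seconds)

    def countdown(self, elapsed_seconds):
        """Countdown formatted as MM:SS for display."""
        minutes, seconds = divmod(self.remaining_seconds(elapsed_seconds), 60)
        return f"{minutes:02d}:{seconds:02d}"

    def is_over(self, elapsed_seconds):
        """True once the timer has hit zero (the point at which the screen would flash)."""
        return self.remaining_seconds(elapsed_seconds) == 0


meeting = Meeting("Weekly catch-up", "Agree next steps", length_minutes=15)
meeting.notes += "Action: draft role description\n"

print(meeting.countdown(elapsed_seconds=90))  # 13:30
print(meeting.is_over(elapsed_seconds=900))   # True
```

Passing the elapsed time in as a parameter, rather than reading the clock inside the class, keeps the sketch easy to test.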

Should we set this up for the OKF’s meetings? E.g. at Here’s the code. And here are some screen shots: