Below is the full project FAQ that we submitted to the Knight News Challenge earlier this year!
1. What is Spending Stories?
Spending Stories is where the news meets the facts. We want to become the place where news stories are given the verification they need in a data-driven world, while extending our spending data store with the narrative contextualisation necessary to go beyond list views and colored bubbles.
We will do this by (i) building a powerful tool to connect news stories to spending data and vice versa, and (ii) building a vibrant community of spending experts around the world – with an initial focus on Germany, the UK and a few other countries.
2. What do you hope to achieve?
Ultimately our aim is to dramatically improve spending data literacy amongst journalists and the general public. We want to do this by creating a compelling set of tools for understanding, contextualising and interrogating government spending data – and a vibrant community of users around these tools.
We want to speed up the fact-finding and fact-checking process to allow journalists to release news articles more quickly, and to raise journalistic standards by discouraging the poor or misleading use of data, such as bogus comparisons. We aim to make Spending Stories into an invaluable tool that people keep returning to, and one of the most useful, interesting and detailed sources of information about public spending on the web.
3. Why spending? What about other areas like health, crime or environment?
It’s an old saying that to get to the bottom of a story, one should “follow the money”. The same is true of government: spending is where policies and priorities are broken down into figures. Spending has a direct influence on all political areas: while other data on health or social issues help us understand what challenges society faces, spending allows us to see how government reacts to all of these.
Spending data is also a perfect basis for large scale, community driven matching between datasets and news stories. There is a lot of spending data available online, but often this is buried away on government websites, or published in formats that are not machine-readable. It is also of widespread interest, as taxpayers are often interested in how their money is disbursed.
Building open-source tools and stronger communities to connect stories with spending data is also a strong basis for doing this with other types of datasets – which is something we are definitely interested in doing in the medium term.
4. Who is the project aimed at? Is it for everyone or just for data geeks?
Spending Stories is not just for data geeks! The project is aimed at anyone and everyone who has looked at a news story about public spending and thought ‘But how much is that really?’ or ‘I wonder how that compares to X?’ or ‘I wonder how that actually breaks down?’.
This includes everyone from citizens trying to understand how they will be affected by funding cuts, to think tanks or advocacy groups trying to put together evidence-based policy proposals, to bloggers writing about who gets what from public funding sources. It will provide journalists with a powerful and intuitive resource to help them do more accurate, insightful and interesting reportage that is driven by and grounded in spending data sources.
5. Why aren’t news stories about public spending enough?
Every day there are fresh headlines about the public purse. We are bombarded with stories about cutbacks, bailouts, deficits, and subsidies. Spending figures permeate reportage about everything from hospital bed numbers to green energy programmes, foreign aid to local transport schemes. Without context it is difficult for ordinary readers to understand what the big numbers mean – and to know whether or not to trust those numbers.
For example, we learn that X million was spent on Y, or that cutbacks or spending increases of Z million are being planned. Without knowing how this compares to the past or to other areas, it is difficult to evaluate whether the numbers under consideration are a lot or a little.
Spending Stories will enable users to see the numbers behind the headlines, to explore the underlying datasets using a variety of intuitive tools, to see related stories and to contribute their own insights or views.
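As a rough illustration of the kind of context described above, the sketch below expresses a headline figure as a share of a total budget and as a change on the previous year. The function name `contextualise` and all figures are hypothetical and chosen only for illustration; this is not a description of how Spending Stories itself is implemented.

```python
def contextualise(amount, total_budget, previous_amount):
    """Put a headline figure in context.

    Returns the figure's share of the total budget and its relative
    change against the previous period, both as fractions.
    """
    if total_budget <= 0 or previous_amount <= 0:
        raise ValueError("total_budget and previous_amount must be positive")
    share_of_budget = amount / total_budget
    change_on_previous = (amount - previous_amount) / previous_amount
    return share_of_budget, change_on_previous


# Hypothetical figures, for illustration only.
share, change = contextualise(
    amount=5_000_000, total_budget=200_000_000, previous_amount=4_000_000
)
print(f"{share:.1%} of the budget, {change:+.1%} on the previous year")
```

With these made-up numbers, a 5 million item is 2.5% of a 200 million budget and 25% higher than the 4 million spent the year before, which says far more than the raw figure alone.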
6. Why aren’t government spending documents and datasets enough?
Public bodies are releasing an unprecedented amount of raw data on public spending. Recent commitments to financial transparency in countries such as the US and the UK enable the public to explore where their tax money is spent in unprecedented detail. However, at the point of publication, datasets may be fragmentary, disconnected and lacking in context.
We have done extensive work to extract, clean up, connect and expose spending data as part of our work on WhereDoesMyMoneyGo.org and OpenSpending.org. As a result of these projects the public now have well connected, machine-readable datasets, which everyone is free to explore and reuse.
Nevertheless, users of these sites and other transparency websites may get lost in the data without familiar narratives to make the budget lines meaningful. Spending Stories will help to bring context, familiarity and meaning to public spending sources.
7. How will it work?
Spending Stories will have four main parts:
- Story Aggregator – which will gather news stories, blog posts and other spending related content using a mixture of automated tools and user input.
- Matching Tool – which will enable users to match current and historical news stories to datasets in a variety of different ways.
- Behind the News – an expert blog giving analysis and context to news articles about public spending via brief posts and micro-short videos (much like Hans Rosling’s short videos as part of the Gapminder project).
- Spending Stories Browser Plugin – which will ambiently suggest spending datasets relevant to pages people are browsing.
Users will be able to perform custom searches to find the spending records that are relevant or interesting to them. Visualization tools will be provided to compose custom presentations of the geographic, temporal and topical distribution of funding in various forms – from simple bar charts to complex, multi-level displays of aggregated spending.
Annotation and classification functions will enable both personal and community-based recording of additional knowledge, deriving narratives from data in an incremental way. Annotations might include information about related policies and actors. We will help to coordinate the work of journalists, volunteers and others going through data – especially with freshly scraped or aggregated data, or new requests or releases.
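One simple way a story could be matched to candidate datasets is by keyword overlap. The sketch below ranks dataset descriptions by Jaccard similarity with a story headline. It is an assumption-laden illustration, not the project's actual Matching Tool: the function names, the stopword list and the sample data are all hypothetical.

```python
import re

# A tiny, hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "for", "and", "is", "are", "by"}


def keywords(text):
    """Lowercase the text, split it into words and drop common stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def rank_datasets(story, datasets):
    """Rank dataset descriptions by Jaccard overlap with the story's keywords.

    `datasets` maps a dataset name to a short description. Returns a list of
    (name, score) pairs, best match first, omitting datasets with no overlap.
    """
    story_words = keywords(story)
    scored = []
    for name, description in datasets.items():
        dataset_words = keywords(description)
        union = story_words | dataset_words
        score = len(story_words & dataset_words) / len(union) if union else 0.0
        if score > 0:
            scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical story headline and dataset descriptions.
datasets = {
    "council-health-2010": "Local council spending on hospital and health services",
    "transport-grants": "Grants for local transport schemes",
}
matches = rank_datasets("Council plans cuts to hospital spending", datasets)
print(matches)
```

Automated matching of this kind would only produce suggestions; the user input and annotation described above would still be needed to confirm or correct them.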
8. Where will the stories come from?
Stories will be aggregated from a variety of sources – including media organisations, blogs, press releases and other online documents. The Open Knowledge Foundation is already in contact with media organisations such as the BBC, the Financial Times, the Guardian, the International Herald Tribune, La Stampa, Le Monde, the New York Times, the Telegraph, and Zeit Online. We also aim to build on numerous other projects which provide aggregation and bulk analysis of blog and news material – such as SYNC3, which matches blog posts to related news stories. This will be combined with links harvested automatically on the basis of key words and phrases, and stories and RSS feeds suggested by users. Additionally, our spending research team will proactively scour the web for interesting anecdotes, stories, comments and sources.
9. Where will the data come from?
The Open Knowledge Foundation has excellent access to local, regional, national and international spending datasets from around the world. It helps to run over 30 official and unofficial government data catalogues – including data.gov.uk, data.norge.no, data.overheid.nl – which contain numerous key spending datasets. It has aggregated tens of thousands of datasets from around the world in projects like OpenDataSearch.org and PublicData.eu.
The Foundation has worked hard to expose spending data in the UK, where it was in contact with key people in the government about opening up all spending data over £25k, and the COINS dataset, which is one of the most detailed sources of spending information. We will harness existing projects like ScraperWiki to derive structured data from unstructured sources, and build on existing work to map spending datasets like the Open Budget Index. We will also allow users to directly upload datasets.
10. How will the project relate to and engage with existing organisations and initiatives?
In addition to working alongside media organisations and spending data providers (see 8. and 9.) we will work closely with the Foundation’s growing network of stakeholders interested in official data, which includes journalists, civic developers, FOI advocates, academics, public sector accountants, independent research organisations (like the UK’s Institute for Fiscal Studies), journalist networks (such as the European Journalism Centre) and many others from around the world. We are very keen to work with others who are interested in this area – and generally the Foundation is a very collaborative organisation with a vibrant community around it.
In particular Spending Stories will build on other Open Knowledge Foundation projects such as OpenSpending.org, which stores the data for WhereDoesMyMoneyGo.org and OffenerHaushalt.de.
It will be possible to embed tools from the site in hyperlocal news websites, blogs and discussions taking place on social networking services such as Facebook or Twitter.
11. What will the geographical focus be?
Initially Spending Stories will focus on Germany, the United Kingdom and possibly one more country, with a view to gradually expanding. The software will be open source and freely available, and we will work with a range of international partners to start versions of the project in different geographical areas. With OpenSpending we are working with organisations, developers, journalists and ordinary citizens interested in spending data in over 20 countries including Argentina, Albania, Austria, Azerbaijan, Canada, Croatia, France, Georgia, Germany, Greece, Hungary, Israel, Italy, Kosovo, Latvia, Mexico, Netherlands, Norway, Romania, Slovakia, Spain, the United Kingdom, the United States and Uruguay. We also have experience working with data on international development funding flows and EU spending.
12. How will Spending Stories be useful for journalists?
We hope that Spending Stories will help journalists:
- find material that may trigger them to write new stories
- research or fact-check stories they are writing
- find supporting material for stories they are writing
- see what others have written or said about areas they are interested in
- get a better long term perspective of spending trends
- have a bigger, more comparative picture of spending in different regions
- do more investigative research that would otherwise be a lot more resource intensive or time consuming
- systematically track transactions between certain entities
- receive alerts related to spending areas they are interested in – e.g. related to new stories, comments or datasets
13. How will Spending Stories help me to understand spending in my area?
Users of the site will be able to receive notifications of news stories and spending datasets pertaining to their geographic region or relevant to their interests. For example, after entering their postcode, a user might be notified of news stories about spending cuts to their local hospital, and would be able to directly browse and explore the data behind these stories from their local authority. Someone browsing the site could navigate to their geographical region and see top stories related to, e.g., a proposed new cultural heritage facility or changes in taxes or educational fees.
Furthermore, users would have the opportunity to comment on budget lines, engage with other users who are interested in stories and datasets in their region, and ask questions about what the numbers mean to a variety of local and national experts. They would be able to fact-check local news stories, verify that the numbers reported are portrayed accurately, and examine local spending figures with meaningful context (such as top-down spending comparisons with other regions or previous financial years, seeing how similar line items compare in different geographical regions, and so on).
14. What kinds of questions will Spending Stories help me answer?
Spending Stories will help users answer questions like:
- How big is that reported case of spending waste compared to the overall budget of my council?
- What other contracts has the company that was caught bribing officials won over the past few months?
- How much has investment in UK schools grown over the past 5 years?
- Which transactions or spending datasets were most widely reported on this year?
- Did my county get less investment from the latest stimulus than others?
- What kinds of spending topics were most popular in 2010?
- How much do streetlights cost in Rome vs Paris?
15. Will Spending Stories explain to people how government finance works?
Government finance is an extremely complex matter and even a powerful tool such as Spending Stories cannot provide a simple explanation of all the processes involved in the collection and disbursement of public funds. However, more readily available and user-friendly data, combined with the narrative provided by related news stories and the expert blog, will allow users to deepen their understanding of how public spending decisions are taken, and to use this knowledge constructively.
16. How will you encourage others to build on Spending Stories?
In addition to directly providing web applications and services on the basis of public spending datasets and user-contributed data, we would like to encourage others to reuse and innovate with the material we expose. In particular we recognise that news organisations may wish to produce their own visualisations or web applications based on the material from Spending Stories or the data in OpenSpending. We would like to maximise reuse of our material – e.g. in mobile applications, or integration with other existing services.
Hence all our datasets will be openly licensed and all our code will be open source. We will also actively support those wishing to reuse our code and data with advice. We have a number of public mailing lists to ensure that advice, experience and expertise are shared as widely as possible.