What is data journalism? I could answer, simply, that it is journalism done with data. But that doesn’t help much.
Both ‘data’ and ‘journalism’ are troublesome terms. Some people think of ‘data’ as any collection of numbers, most likely gathered in a spreadsheet. Twenty years ago, that was pretty much the only sort of data that journalists dealt with. But we live in a digital world now, a world in which almost anything can be (and almost everything is) described with numbers.
Your career history, 300,000 confidential documents, and who knows who in your circle of friends can all be (and are) described with just two numbers: zeroes and ones. Photos, video and audio are all described with the same two numbers: zeroes and ones. Murders, disease, political votes, corruption and lies: zeroes and ones.
What makes data journalism different to the rest of journalism? Perhaps it is the new possibilities that open up when you combine the traditional ‘nose for news’ and the ability to tell a compelling story with the sheer scale and range of digital information now available.
And those possibilities can come at any stage of the journalist’s process: using programming to automate the process of gathering and combining information from local government, police, and other civic sources, as Adrian Holovaty did with ChicagoCrime and then EveryBlock.
Or using software to find connections between hundreds of thousands of documents, as The Telegraph did with MPs' expenses.
Data journalism can help a journalist tell a complex story through engaging infographics. Hans Rosling’s spectacular talks on visualizing world poverty with Gapminder, for example, have attracted millions of views across the world. And David McCandless’s popular work at Information is Beautiful in distilling big numbers (such as putting public spending into context, or comparing the pollution generated and prevented by the Icelandic volcano) shows the importance of clear design.
Or it can help explain how a story relates to an individual, as the BBC and the Financial Times now routinely do with their budget interactives (where you can find out how the budget affects you, rather than ‘Joe Public’). And it can open up the news gathering process itself, as The Guardian do so successfully in sharing data, context, and questions with their Datablog.
Data can be the source of data journalism, or it can be the tool with which the story is told — or it can be both. Like any source, it should be treated with scepticism; and like any tool, we should be conscious of how it can shape and restrict the stories that are created with it.
- Paul Bradshaw, Birmingham City University, BBC