Data Journalism Handbook 1.0 BETA

Harnessing External Expertise Through Hackathons

Figure 22. Hackathons: how to boost collaboration between journalists and developers (photo by Heinze Havinga)

In March 2010, the Utrecht-based digital culture organization SETUP put on an event called ‘Hacking Journalism’. The event was organised to encourage greater collaboration between developers and journalists.

‘We organize hackathons to make cool applications, but we can’t recognise interesting stories in data. What we build has no social relevance’, said the programmers. ‘We recognize the importance of data journalism, but we don’t have all the technical skills to build the things we want’, said the journalists.

At the regional newspaper where I worked, there was no money or incentive to hire a programmer for the newsroom. Data journalism was still an unknown quantity for Dutch newspapers at that time.

The hackathon model was perfect: a relaxed environment for collaboration, with plenty of pizza and energy drinks. RegioHack was a hackathon organised by my employer, the regional newspaper De Stentor, our sister publication TC Tubantia and Saxion Hogescholen Enschede, which provided the location for the event.

The setup was simple: anyone could sign up for a 30-hour hackathon. We provided the food and drink. We aimed for 30 participants, whom we divided into 6 groups. These groups would focus on different topics, such as crime, health, transport, safety, ageing and power. For us, the three main objectives for this event were as follows:

Find stories

For us, data journalism is something new and unknown. The only way we can prove its usefulness is through well-crafted stories. We planned to produce at least three data stories.

Connect people

We, the journalists, don’t know how data journalism is done and we don’t pretend to. By putting journalists, students and programmers in one room for 30 hours, we wanted them to share knowledge and insights.

Host a social event

Newspapers don’t organise a lot of social events, let alone hackathons. We wanted to experience how such an event can yield results. In fact, the event could have been tense: 30 hours with strangers, lots of jargon, banging your head against basic questions, working outside your comfort zone. By making it a social event (remember the pizza and energy drinks?), we wanted to create an environment in which journalists and programmers could feel comfortable and collaborate effectively.

Before the event, TC Tubantia interviewed the widow of a policeman who had written a book on her husband’s working years. She also had a document with all registered murders in the eastern part of the Netherlands, maintained by her husband since 1945. Normally, we would simply publish this document on our website. This time, we made a dashboard using the Tableau software. We also blogged about how this came together on our RegioHack site.

During the hackathon, one project group chose the subject of school development and the ageing of our region. By visualizing future projections, we understood which cities would get into trouble after a few years of declining enrolments. With this insight, we wrote an article on how this would affect schools in our region.

We also started a very ambitious project, called De Tweehonderd van Twente (in English, The Two Hundred of Twente), to determine who had the most power in our region and build a database of the most influential people. Through a Google-ish calculation (who has the most ties with powerful organizations?), a list of influential people will be composed. This could lead to a series of articles, but it’s also a powerful tool for journalists. Who has connections with whom? We can put questions to this database and use it in our daily routine. This database also has cultural value: artists have already asked whether they could use it, once finished, to make interactive art installations.
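To make the idea of ranking people by their ties more concrete, here is a minimal sketch in Python. It is not the calculation used for De Tweehonderd van Twente, which is not described in detail here. It simply assumes that each organization gets a hand-assigned power weight and that a person's score is the sum of the weights of the organizations they are tied to. All names, ties and weights are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical ties between people and organizations (placeholders, not real data).
TIES = [
    ("Person A", "Provincial council"),
    ("Person A", "Regional hospital board"),
    ("Person A", "Football club board"),
    ("Person B", "Provincial council"),
    ("Person B", "Regional hospital board"),
    ("Person C", "Football club board"),
]

# Assumed power weight per organization; choosing these is an editorial decision.
ORG_WEIGHT = {
    "Provincial council": 3.0,
    "Regional hospital board": 2.0,
    "Football club board": 1.0,
}


def rank_people(ties, org_weight):
    """Score each person by the summed weight of their organizations.

    Returns (person, score) pairs, highest score first; ties in score
    are broken alphabetically so the result is deterministic.
    """
    scores = defaultdict(float)
    for person, org in ties:
        # Organizations without an assigned weight contribute nothing.
        scores[person] += org_weight.get(org, 0.0)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_people(TIES, ORG_WEIGHT)
for person, score in ranking:
    print(f"{person}: {score}")
```

A real version would probably also let an organization's weight depend on who sits on it, which is closer to what ‘Google-ish’ suggests, but even this simple count makes it possible to ask the database who is connected to whom.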

Figure 23. New communities around data journalism (photo by Heinze Havinga)

After RegioHack, we noticed that journalists considered data journalism a viable addition to traditional journalism. My colleagues continued to use and build on the techniques learned at the event to create more ambitious and technical projects, such as a database of the administrative costs of housing. With this data, I made an interactive map in Fusion Tables. We asked our readers to play around with the data and crowdsourced results (here, for example). After receiving a lot of questions about how we made the map in Fusion Tables, I also recorded a video tutorial.

What did we learn? We learned a lot, but we also ran into a lot of obstacles. We recognized these four:

Where to begin: question or data?

Almost all projects stalled when searching for information. Most of the time, they began with a journalistic question. But then? What data is available? Where can you find it? And when you find this data, can you answer your question with it? Journalists usually know where they can find information when doing research for an article. With data journalism, most journalists don’t know what information is available.

Little technical knowledge

Data journalism is quite a technical discipline. Sometimes you have to scrape, other times you have to do some programming to visualize your results. For excellent data journalism, you need two things: the journalistic insight of an experienced journalist and the technical know-how of a digital all-rounder. During RegioHack, this combination was rare.

Is it news?

Participants mostly used one dataset to discover news, instead of searching for connections between different sources. The reason for this: you need some statistical knowledge to verify news from data journalism.
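As an illustration of what connecting different sources can look like in practice, the following Python sketch joins two small datasets on a shared key, the municipality name. The figures, the town names and the `join_on_key` helper are all invented for this example and do not come from RegioHack.

```python
# Hypothetical figures per municipality (illustrative only, not real data).
enrolment_change = {"Town A": -12.0, "Town B": -3.5, "Town C": 1.0}  # % change in school enrolment
share_over_65 = {"Town A": 24.0, "Town B": 19.0, "Town D": 21.0}     # % of residents aged 65 or over


def join_on_key(left, right):
    """Combine two datasets, keeping only the keys present in both."""
    shared = sorted(left.keys() & right.keys())
    return [(key, left[key], right[key]) for key in shared]


combined = join_on_key(enrolment_change, share_over_65)

# Keys that appear in only one source deserve a manual check:
# a spelling variant or a merged municipality can silently drop rows.
unmatched = sorted(enrolment_change.keys() ^ share_over_65.keys())

for town, change, over_65 in combined:
    print(f"{town}: enrolment {change:+.1f}%, aged 65+ {over_65:.1f}%")
print("Only in one source:", unmatched)
```

Putting the two columns side by side is only the first step. Whether a pattern between them is actually news is where the statistical knowledge mentioned above comes in.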

What’s the routine?

What it all comes down to is that there’s no routine. The participants have some skills under their belt, but don’t know how and when to use them. One journalist compared it with baking a cake. ‘We have all the ingredients: flour, eggs, milk, etcetera. Now we throw it all in a bag, shake it and hope a cake comes out of it.’ Indeed, we have all the ingredients, but don’t know what the recipe is.

What now? Our first experiences with data journalism could help other journalists or programmers aspiring to the same field of work, and we are working on a report.

Also, we are considering how to continue RegioHack in a hackathon form. We found it fun, educational and productive and a great introduction to data journalism.

But for data journalism to work, we have to integrate it into the newsroom. Journalists have to think in data, in addition to quotes, press releases, council meetings and so on. By doing RegioHack, we proved to our audience that data journalism isn’t just hype. We can write better informed and more distinctive articles, while presenting our readers with different articles in print and online.

Jerry Vermanen,