“Comment is free, but facts are sacred.” So wrote CP Scott in his 1921 essay ‘A Hundred Years’, which marked the centenary of The Guardian and his 50th anniversary as its editor. Some hundred years later, facts are still sacred, and open data has enriched them. Journalism tells stories; open data gives those stories new perspectives, more credibility, and easier ways to explain complex topics to the audience.
Journalists are increasing the value of data for the public by sifting through datasets. Adding visualisations and infographics further enriches journalists’ stories, and shapes new angles to narrate and discuss topics. In a way, data journalists are becoming data providers to the audience and are on the front line of turning data into knowledge. Still, datasets need context to turn into insights and stories worth telling. This is where semantic technology comes into play to ‘teach’ computers to better understand language and interpret data. Semantic technology discovers relations between concepts and entities, disambiguates those entities, and turns unstructured data into structured, interlinked content. For example, the BBC has been using a Linked Data Platform to enrich and structure content with automated metadata-driven web pages.
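To make the idea of disambiguating entities and linking content a little more concrete, here is a minimal sketch in Python. It is a toy illustration only, not how the BBC or any real platform works: the `GAZETTEER` lookup table, the `ex:` identifiers, the `ex:mentions` relation and the function names are all hypothetical, and real semantic platforms rely on far richer language models and knowledge graphs.

```python
import re

# Hypothetical mini knowledge base (an assumption for this sketch):
# each surface form maps to candidate entities, and each candidate
# lists context words that hint at that particular reading.
GAZETTEER = {
    "guardian": [
        {"id": "ex:The_Guardian", "context": {"editor", "newspaper", "essay"}},
        {"id": "ex:Guardian_(legal)", "context": {"court", "child", "custody"}},
    ],
    "bbc": [
        {"id": "ex:BBC", "context": {"broadcaster", "licence", "programme"}},
    ],
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def link_entities(text):
    """Return entity IDs found in the text, in order of first mention.

    Ambiguous names are resolved by picking the candidate whose context
    words overlap most with the words of the text (ties go to the first
    candidate listed).
    """
    tokens = tokenize(text)
    words = set(tokens)
    linked = []
    for token in tokens:
        candidates = GAZETTEER.get(token)
        if not candidates:
            continue
        best = max(candidates, key=lambda c: len(c["context"] & words))
        if best["id"] not in linked:
            linked.append(best["id"])
    return linked


def to_triples(doc_id, text):
    """Turn free text into (subject, relation, object) statements."""
    return [(doc_id, "ex:mentions", entity) for entity in link_entities(text)]


if __name__ == "__main__":
    story = "The Guardian editor wrote an essay that the BBC quoted."
    print(to_triples("ex:doc1", story))
    # [('ex:doc1', 'ex:mentions', 'ex:The_Guardian'), ('ex:doc1', 'ex:mentions', 'ex:BBC')]
```

The same word ‘guardian’ in a sentence about a court and a child would resolve to the legal sense instead, which is the kind of distinction that lets a publishing platform interlink stories about the same entity automatically.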
Structuring and linking Open Data gives organizations and media publishers the opportunity to build ranking reports, draw trend lines, and offer relevant and insightful content. News on the Web (NOW), a free public service, showcases the opportunities a dynamic semantic publishing platform opens up for all news consumers – media companies, journalists and readers.
Technology can do the heavy lifting with data, but journalists are the ones who tell captivating stories, basing their critical analyses, breaking news and investigative reporting on open data, among other journalistic tools. The Washington Post published an analysis, based on a wide range of public records and interviews, that sought to identify for the first time every officer who has faced charges for fatal shootings in the US since 2005. The Washington Post staff recently won the 2016 Pulitzer Prize for National Reporting “for its revelatory initiative in creating and using a national database to illustrate how often and why the police shoot to kill and who the victims are most likely to be.”
The Guardian’s datablog turns sometimes-dry data into graphics, interactive maps or interesting comparisons. For example, amidst the debate on the future of the UK steel industry, the newspaper’s website ran the story “What the UK could buy for £1.5bn (instead of spending it on Tata Steel)” in early April. The Guardian looked at how much the UK government has spent in other areas and compared the costs. The comparisons showed, for instance, that £1.5 billion was enough for half a big new aircraft carrier or 50% of the annual BBC bill.
Using estimates and data from the World Health Organization (WHO) and UN agencies, Bloomberg added visuals to an April 2016 piece on the rise of global obesity. Bloomberg has a graphics section with visualizations on different topics, including a presidential delegate count that is updated as precincts report the results of the primaries.
Last year, The Wall Street Journal created heat maps to show the impact of vaccines and how their introduction resulted in a visible decrease of infections across all US states.
It is not only open government data that news publishers use.
As we can see, some of the most renowned and well-established news organizations use open data to add more content, dig for more stories and offer more angles for discussing topics. Other journalism endeavors became popular with the rise of technology and data openness. Three-time Pulitzer Prize winner ProPublica, whose slogan is ‘Journalism in the Public Interest’, has used open data sources in many of its investigations.
MapLight uses open data, and makes data available for free use, to reveal the influence of money on politics. The website contains datasets on campaign contributions to each member of the US Congress; how each member of Congress voted on every bill; and which interest groups and companies support and oppose key bills.
In the UK, OpenPrescribing allows GPs, managers and anyone else to explore prescribing data in the country. Every month, the NHS publishes anonymized data about the drugs prescribed by GPs. OpenPrescribing makes the raw data files more user-friendly.
It is often the case that governments open sets of data that are sketchy and/or unwieldy. Sometimes databases are updated once a year, at best. At other times, datasets are of questionable relevance and importance to the general public. There are also times when governments are not willing to open certain data, which itself also tells a story.
Jonathan Stoneman, former BBC journalist and now visiting fellow at the Reuters Institute for the Study of Journalism, said last year that the open data community should talk to journalists and journalists should understand what open data is and what it can do for them. “They need to publicize each other; it’s a two-way street and is not happening,” he said.
In his working paper ‘Does Open Data Need Journalism?’, Stoneman writes: “Journalists will not feel the need to make greater use of Open Data until they see it as a rich seam of material, while data policy-makers won’t feel the need to improve the stream of data until they come under pressure to do so.”
The publishing and use of open data, especially outside the US, the UK and the Scandinavian countries, is still something new, and journalists working with open data are not yet the mainstream.
In its pursuit of truth, transparency and government accountability, journalism can benefit from the opening of more and more data. Governments and authorities, on the other hand, can start relying on journalism to promote the use of open data and its social and economic value.