This post comes from Alex de Sherbinin, Chair, CODATA Task Group on Global Roads Data Development and Associate Director for Science Applications, CIESIN, Columbia University
On 16-18 February I had the opportunity to join a distinguished group of science journalists for the 2nd Kavli Forum, organized by the World Federation of Science Journalists (WFSJ). WFSJ director Damien Chalaud requested that CODATA attend the workshop, and in discussions with Damien, his colleague Veronique Morin, and Lee Holtz, science editor for the Wall Street Journal, we outlined the contours of a talk. More on that below. But first a bit of background.
The meeting was organized to engage science journalists from major outlets such as the BBC, Science, Nature, Scientific American, National Public Radio, and major US network news outlets, in the broad topic of data journalism. Talks by journalists, computer scientists and researchers focused on tools that journalists can use to reduce the huge volumes of information available to them and to analyze data on their own. Example tools include Metro Maps, a tool designed to reduce complex stories into interlinked story lines, and the Overview Project, a content analysis tool designed to help journalists sift through mountains of electronic documents looking for story leads (e.g., the NSA documents leaked by Edward Snowden). Others introduced terms such as “geojournalism” (using online mapping and data analysis tools to tell the story of environmental change) and “computational journalism” (using computer programming to uncover stories).
The range of stories that had been uncovered, or at least told better, through data journalism was impressive. Stanford professor and journalist Cheryl Philips described using publicly accessible records of infrastructure assessments done by the department of transportation in Washington state (USA) to map the most vulnerable bridges and to tell the story behind a bridge that collapsed, killing several people. John Bohannon of Science Magazine used iPython coding to send a fake journal article to close to 200 open access journals in a sting operation to uncover the lack of peer review of a clearly flawed article.
I was given the distinct honor of being the keynote speaker at the opening dinner. I used the Sustainable Development Goals (SDGs) of the United Nations’ post-2015 development agenda as a foundation upon which to build an argument on the importance of the Data Revolution for sustainable development. CIESIN has been involved in the effort to compute the price tag for monitoring the goals as a contribution to the Sustainable Development Solutions Network, so we have had a front-row seat in assessing the data needs.
The data revolution can be characterized as having two main elements: open data and big data. To build the case for open data, I described a few cases where environmental monitoring and data networks were either insufficient or in danger of falling apart owing to lack of funds and inattention, including two water examples: the river gauge network of the Global Runoff Data Center (GRDC) and the UNEP-GEMS station-level water quality monitoring network. I pointed out that even in the case of air quality, which increasingly can be monitored from space, there is a need for ground validation based on in situ monitoring networks. I also described the benefits of open government data, and how such data has been found to stimulate economic growth and generate greater tax revenues than old-school approaches of selling data.
I then turned to the promise and limitations of big data, aided by a useful primer by Emmanuel Letouzé of the DataPop Alliance. My central argument was that big data – defined by Letouzé as data emanating from our increasing use of digital devices, from crowdsourcing or from online transactions, together with increasing computational sophistication and a community of analysts – has tremendous promise, but can never hope to fully supplant well-funded and well-functioning traditional data gathering systems such as census bureaus, national statistical offices, and environmental monitoring networks.
The discussion afterwards explored these issues and also enabled me to provide some data pointers for the journalists as they seek to employ data in their professional duties.