Data Integration Initiative
During 2017, CODATA initiated and led a discussion with data science groups and international scientific unions and associations about the timeliness of a major initiative on interdisciplinary data integration. Meetings at the ICSU HQ in Paris in June 2017 and at the Royal Society of London in November 2017 produced a report and communiqué supporting a long-term initiative and outlining some of the essential issues to be addressed. The key priorities for this initiative are to address data integration in support of major global challenges and to develop relevant data capacities across all the disciplines of science.
Challenge and Purpose
The digital revolution of the past two decades offers profound opportunities for science to discover hitherto unsuspected patterns and relationships in nature and society, on scales from the molecular to the cosmic, and in all areas of human concern, from cultural artefacts and local health systems to global sustainability.
There is a major, largely unrealised potential to merge and integrate data from different disciplines of science in order to reveal deep patterns in the multifaceted complexity that underlies most of the application domains intrinsic to the major global challenges confronting humanity. The obstacle is that different disciplines have used varying and incompatible data standards, and the vocabularies needed to categorise their data are inadequately defined. As a result, diverse data can generally be integrated only within and between closely allied fields.
Characterising, understanding, and dealing with the complexity inherent in major global challenges will be integral to the mission of the new International Science Council, which will come into being in the first week of July 2018 and whose Governing Board will hold its first meeting in late September 2018.
We plan to identify, promote and implement a programme of work that will substantially increase the capacity of the international scientific community to achieve rigorous, interdisciplinary integration of data to support work on major global challenges as a matter of routine. This will be a long-term, decadal initiative that has the potential to fundamentally enhance the capacity of science in the 21st century.
The communiqué of November 2017 expressed the agreement of meeting participants to work together with the broader research community to:
- develop and apply solutions for interdisciplinary data integration;
- pursue this through data integration for major global challenges that can also act as exemplars of its interdisciplinary potential;
- support, in parallel, the development of capacities to realise the potential of modern data resources across all the disciplines of science; and
- recognise that, in many disciplines, foundational work is required to develop the specific vocabularies, ontologies and provenance tracking systems needed to enhance data discovery, use, interoperability and integration.
An ad hoc steering group was created to plan how these commitments should be carried forward, comprising:
CODATA: Geoffrey Boulton – President; Simon Hodson – Executive Director.
ICSU: Heide Hackmann – Executive Director.
Application Domain leaders: Laura Merson – Infectious Disease Outbreaks; Virginia Murray – Disaster Risk Reduction; Stephen Passmore – Resilient Cities.
Data Scientists: Simon Cox – CSIRO; Lesley Wyborn – ANU; Bob Hanisch – NIST; Phil Archer – Consultant.
Supporting the steering group in making contributions to the initiative are: Gisbert Glaser – ICSU; Katsia Paulavets – ICSU; Bill Michener – DataONE; Kevin Blanchard – PHE; Philipp Ulbrich – Resilience Brokers; John Broome – CODATA.