On Friday 20 November, from 15:00 to 17:00 CET, as part of an EOSC Co-Creation Project, CODATA and DDI held an online workshop, ‘Applying DDI-CDI (Data Documentation Initiative-Cross Domain Integration) to the EOSC’. Although it focussed on use cases for EOSC, the workshop is of wider interest: it examined the potential of the variable cascade, and of the way DDI-CDI handles provenance information, to assist in combining heterogeneous data from multiple, unfamiliar sources.
Presentation by Arofan Gregory: ‘DDI-CDI: An Introduction for Possible EOSC Applications’
Short introductory presentation by Simon Hodson: ‘DDI Cross Domain Integration: Applications for EOSC’
Recording of the online workshop on Vimeo: https://vimeo.com/482634508
DDI-CDI and EOSC
DDI and CODATA are collaborating on a new EOSC Co-Creation project, due to complete in March 2021, which will examine the possible application of the new DDI-Cross Domain Integration (DDI-CDI) standard in an EOSC context. The project will be conducted as an investigation of the technical requirements for data sharing, composability and integration that can usefully be met by implementing this model. A series of meetings with various EOSC groups is anticipated, including the EOSC Clusters, the FAIRsFAIR Metadata Catalogue Integration group, and the FAIR Working Group, whose Interoperability Framework is important for EOSC.
The first workshop introduced DDI-CDI and identified topics for further exploration with the stakeholders listed. We seek participation from appropriate technical people in EOSC-related activities (EOSC Clusters, metadata groups, etc.) who would be interested in engaging with us in this project, and who can help us to understand and explore how DDI-CDI might best be used in the different areas being addressed by EOSC.
DDI-CDI is a new type of metadata specification, which aims to make it easier to find, integrate, and share data across domain boundaries. It focuses on two areas. First, it addresses the structural integration of data, concentrating on the roles played by atomic “datums” within data structures, and on how those datums relate across different types of data sources. Second, it provides a framework for describing data provenance and processing at a granular level, so that the relationships between data sets and sources can be more completely understood as they evolve throughout the research process. It covers a wide range of data: traditional “rectangular” data, sensor data, register and event data, multi-dimensional data, time series, and NoSQL/“big” data. It describes structures in a way that allows equivalences between different data sets to be clearly understood and used as the basis for processing and integration. Further, it provides an excellent foundation for semantic mapping.
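To make the idea of atomic datums playing roles within structures more concrete, here is a minimal sketch in Python. It is an illustration only, not the DDI-CDI model itself: the `Datum` class, the role names `IDENTIFIER` and `MEASURE`, the helper functions, and the weather-station sample data are all hypothetical. It shows how the same datums can be recovered from a "wide" layout and a "long" layout once each column's role is described, which is the kind of equivalence between data sets that the paragraph above refers to.

```python
from dataclasses import dataclass

# Hypothetical role labels for illustration; not the DDI-CDI class names.
IDENTIFIER = "identifier"
MEASURE = "measure"


@dataclass(frozen=True)
class Datum:
    """One atomic value, tied to the unit and variable it describes."""
    unit: str
    variable: str
    value: object


def datums_from_wide(rows, roles):
    """Wide layout: one row per unit, one column per measured variable."""
    id_col = next(col for col, role in roles.items() if role == IDENTIFIER)
    return {
        Datum(row[id_col], col, row[col])
        for row in rows
        for col, role in roles.items()
        if role == MEASURE
    }


def datums_from_long(rows, unit_col, variable_col, value_col):
    """Long layout: one row per (unit, variable) pair."""
    return {Datum(r[unit_col], r[variable_col], r[value_col]) for r in rows}


# Made-up sample data: the same observations in two layouts.
wide_rows = [
    {"station": "A", "temp_c": 11.5, "humidity": 70},
    {"station": "B", "temp_c": 9.0, "humidity": 82},
]
wide_roles = {"station": IDENTIFIER, "temp_c": MEASURE, "humidity": MEASURE}

long_rows = [
    {"station": "A", "variable": "temp_c", "value": 11.5},
    {"station": "A", "variable": "humidity", "value": 70},
    {"station": "B", "variable": "temp_c", "value": 9.0},
    {"station": "B", "variable": "humidity", "value": 82},
]

wide_datums = datums_from_wide(wide_rows, wide_roles)
long_datums = datums_from_long(long_rows, "station", "variable", "value")

# Both layouts reduce to the same set of datums.
same = wide_datums == long_datums
print(same)
```

Once both sources are reduced to the same set of datums, integration can work from the role descriptions rather than from the physical layout of either file. A real implementation would take those descriptions from DDI-CDI metadata rather than from hand-written dictionaries.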
The application of DDI-CDI within EOSC is expected to take three basic forms:
- As a tool for supporting data discovery and accessibility, leveraging explicit relationships between data sets/sources
- As a tool for facilitating the ‘composability’ and integration of disparate data sets, based on common structural descriptions
- As a way of better supporting the use of data, by providing detailed context, based on processing and provenance information
The DDI-CDI specification is now undergoing a period of public review. This project will recommend how it could be used specifically within an EOSC context, and will also ensure that EOSC requirements can be supported by the model when it is released for production use in 2021.
Public review page: https://ddi-alliance.atlassian.net/wiki/x/IQBPMw
Complete download package: https://ddi-alliance.bitbucket.io/DDI-CDI/DDI-CDI_Public_Review_1.zip
Announcement at DDI Alliance website: https://ddialliance.org/announcement/public-review-ddi-cross-domain-integration-ddi-cdi