Plenary Session 3: FAIR Convergence

Nov. 30, 2020, 12:00-13:00 (UTC)

The FAIR principles have exercised a considerable influence on efforts to increase the utility of data and other digital objects for research by humans and for processing and analysis at scale using machines. Progress is being made with the relatively low-hanging fruit of the F and the A, but being Findable and Accessible alone does not make data and other digital objects FAIR. The challenges of the I and the R are an order of magnitude greater. This session explores a number of initiatives seeking to build consensus about how to realise the FAIR principles as a whole, and in particular how to address the interoperability and reusability of data.

The session will introduce and explore the following significant contributions:

  • more precise specifications for FAIR Digital Objects, the FAIR Implementation Profiles and their combination in a Three Point ‘FAIRification’ Framework;
  • a range of work on FAIR vocabularies which are fundamental to the interoperability of data;
  • the draft DDI-CDI (Cross-Domain Integration) specification which, by providing information about data structure and provenance, aims to assist the ‘composability’ and integration of disparate data sets;
  • international activities to improve coordination around the digital representation of units of measure.

This panel serves to introduce ‘Strand 5’, through which symposium participants will be able to engage further with these developments in dedicated sessions throughout the week.

Speakers and abstracts

Barend Mons

Barend Mons (born 1957, The Hague) is a molecular biologist by training and a leading FAIR data specialist. He spent the first decade of his scientific career on fundamental research on malaria parasites and later on translational research for malaria vaccines. In 2000 he switched to advanced data stewardship and (biological) systems analytics. He is currently a professor in Leiden and is best known for innovations in scholarly collaboration, especially nanopublications, knowledge-graph-based discovery and, most recently, the FAIR data initiative and GO FAIR. Since 2012 he has been Professor in biosemantics in the Department of Human Genetics at the Leiden University Medical Center (LUMC) in The Netherlands. In 2015 Barend was appointed chair of the High Level Expert Group on the European Open Science Cloud. Since 2017 Barend has headed the International Support and Coordination office of the GO FAIR initiative. He is also the elected president of CODATA, the standing committee on research data related issues of the International Science Council. Barend is a member of the Netherlands Academy of Technology and Innovation (ACTI). He is also the European representative on the Board on Research Data and Information (BRDI) of the National Academies of Sciences, Engineering, and Medicine in the USA. Barend is a frequent keynote speaker about FAIR and open science around the world, and participates in various scientific advisory boards of international research projects.

Abstract: Theme of Convergence

The theme of ‘Convergence’ is key to the Symposium itself and especially to the activities that will be introduced under the fifth strand of sessions. The GO FAIR International Support and Coordination office has been working with communities driving the bottom-up convergence that involves complex decisions among multiple science domains. In the past year great progress has been made on both sides: on the GO FAIR side, in the work around the FAIR Digital Object Framework, the FAIR Implementation Profile and the Metadata for Machines workshops; on the CODATA side, in the work of Task Groups such as DRUM and in the activities around DDI-CDI and FAIR vocabularies. With the growing weight of Big Data, the broad impact of the FAIR Principles and the ‘sealing’ of the collaboration of the Data Together organisations earlier in 2020, there is a sense that widespread and rapid convergence is real for some critical elements of the Internet of FAIR Data and Services. As a point on the horizon we see interdisciplinary interoperability of data and applications to address the major intellectual challenges of the sustainable development goals.

Luiz Bonino

Dr. Luiz Bonino is Associate Professor of the Services and CyberSecurity group at the University of Twente, Associate Professor of the BioSemantics group at the Leiden University Medical Centre and International Technology Coordinator of the GO FAIR International Support and Coordination Office. His background is in ontology-driven conceptual modelling, semantic interoperability, service-oriented computing, requirements engineering and context-aware computing. In the last six years Luiz has been involved in a number of activities to realise the FAIR principles, including the development of a number of technologies and tools to support making, publishing, indexing, searching and annotating FAIR (meta)data.

Abstract: FAIR Digital Objects Framework – Towards convergence of FAIR infrastructures

The FAIR principles defined a set of requirements for components in data and services infrastructures, such as the assignment of identifiers, explicit and clear relations between metadata and the objects they describe, and rich metadata, among others. Current technology infrastructures lack proper support for some of these requirements, at least in a standardised way. The FAIR Digital Objects Framework (FDOF) is being designed to tackle these issues, aiming to provide a set of blueprints to guide the technology evolution towards a better alignment with the FAIR principles. In this presentation the latest insights of the FDOF design will be discussed.

Erik Schultes

Erik Schultes is International Science Coordinator at the GO FAIR International Support and Coordination Office, where he has been working with a diverse community of stakeholders to develop FAIR data and services. In an effort to accelerate broad community convergence on FAIR implementation options, Erik has coordinated and worked closely with the GO FAIR community in the past two years to develop machine-actionable FAIR Implementation Profiles (FIPs) and scalable approaches to the creation of domain-relevant, machine-actionable metadata (Metadata for Machines workshops, or M4M). Erik is also a member of the Leiden Center for Data Science at Leiden University. Erik is an evolutionary biologist with a data-intensive research focus. In addition to private consulting, he has held previous academic appointments at the University of California, Los Angeles, The Whitehead Institute for Biomedical Research at the Massachusetts Institute of Technology, Duke University, and The Santa Fe Institute.

Barbara Magagna

Barbara Magagna is a landscape ecologist working for Umweltbundesamt (Vienna), where she serves as a semantic analyst and database designer. She was involved in the development process and coordination of semantic artefacts such as SERONTO and EnvThes. She has experience in the design of XML schemas in the air quality data reporting area, and in the design of reference models and data provenance tracking methods in projects related to Environmental Research Infrastructures (ENVRI). Recently she has been contributing to the Virus Outbreak Data Network (VODAN), where she coordinates the development of FAIR Implementation Profiles. As a co-chair of RDA WG I-ADOPT she aims at developing an interoperability framework for the semantic representation of observable properties, which is also the topic of her ongoing PhD work at the University of Twente (NL).

Abstract: The Three-point FAIRification Framework (presented by E. Schultes and B. Magagna)

Under the urgency of the Corona pandemic, and driven by national and international projects dedicated to the creation of FAIR COVID data, a collection of FAIR technologies and methodologies has recently been consolidated into a general FAIRification framework. The Three-point FAIRification Framework provides a simplified pathway guiding the deployment of FAIR data and services by assisting the data producer in making informed choices regarding FAIR implementation. Each of the three points – creation of metadata, building a FAIR Implementation Profile and deployment of FAIR Data Points – will be covered in dedicated sessions of the Convergence Symposium as drivers of convergence. In this presentation, Schultes and Magagna give an overview of the Three-point FAIRification Framework and its recent use among numerous FAIR communities.

Alejandra Gonzalez-Beltran

Dr Alejandra Gonzalez-Beltran leads the Software Engineering Group at the Science and Technology Facilities Council (STFC, https://stfc.ukri.org/), which is part of UK Research and Innovation (https://www.ukri.org/). Her work is around data models, methods and software tools to support research, its reproducibility and innovative scholarly communication. She works on the development of bespoke software systems to manage the experimental data produced by large-scale scientific facilities such as synchrotrons, lasers and muon and neutron sources, and also on supporting research software and FAIR (findable, accessible, interoperable and reusable) research data. She is a co-editor of the World Wide Web Consortium Data Catalog Vocabulary (DCAT), a Software Sustainability Institute Fellow and is involved in multiple working groups around research data and software. Before STFC, she was at the University of Oxford (Research Lecturer), University College London (Senior Research Associate), Queen’s University Belfast (PhD in Computer Science) and National University of Rosario, Argentina.

Simon Cox

Simon Cox leads the Environmental Information Infrastructure team in CSIRO. With a background in geology and geophysics, he has been working on standards for publication and transfer of earth and environmental science data since the emergence of the world wide web. He has engaged with most areas of environmental science, including water resources, marine data, meteorology, soil, ecology and biodiversity, focusing particularly on cross-disciplinary standards. His current work focuses on aligning science information with the semantic web technologies and linked open data principles, and the formalization, publication and maintenance of controlled vocabularies and similar reference data. The value of cross-disciplinary standards is to enable data from multiple origins and disciplines to be combined more effectively.

He is principal or co-author of a number of international standards through the Open Geospatial Consortium, ISO, and the World Wide Web Consortium. Simon has held leadership positions in a number of organizations, including the Dublin Core Metadata Initiative (Advisory Board), IUGS Commission for Geoscience Information (Executive Committee), Open Geospatial Consortium (Architecture Board, Planning Committee), Research Data Alliance (Technical Advisory Board) and American Geophysical Union (ESSI Executive Board), alongside numerous positions on technical working groups and committees. His career at CSIRO has been supplemented by stints teaching at Monash University, and as a senior fellow at the European Commission’s Joint Research Centre.

Abstract: Guidelines for FAIR Vocabularies (presented by A. Gonzalez-Beltran and S. Cox)

Shared terminology is key for data sharing, harmonisation of datasets within and across disciplines and cross-domain data integration. Many organizations and disciplines have a tradition of curating lists of terms. These lists come in many forms, from more traditional print-based word lists or glossaries to machine-readable hierarchical vocabularies or taxonomies, through to axiomatised ontologies.

Transitioning and adapting legacy vocabularies from traditional forms rooted in print technologies to more broadly accessible modes that are available ubiquitously and on demand is a critical, high-value activity. The emergence of semantic technologies facilitates the transition to more accessible vocabularies, and brings the potential for FAIR vocabularies to be used in the context of much larger interconnected communities. This presentation introduces some guidelines for FAIR vocabularies, which are under development by a team working under the auspices of the DDI-CODATA metadata initiative.

Arofan Gregory

Arofan Gregory is a technology consultant working in the area of standards for statistical and research data. His work spans the Statistical Data and Metadata Exchange (SDMX) standard, the Data Documentation Initiative (DDI) standards, and the Generic Statistical Information Model, among others. He is currently the convenor of the DDI group developing the DDI – Cross Domain Integration (DDI-CDI) specification, and is involved with the organization of CODATA’s Decadal Programme.

Abstract: DDI Cross Domain Integration

DDI Cross Domain Integration (DDI-CDI) is a new type of specification, aimed at better enabling the FAIR sharing of data across domains, and for secondary use of data generally. Designed to complement existing domain standards, it is a platform- and technology-neutral specification based on a formal UML model for describing a range of data types (wide/unit record data, “tall” data/events/sensor data, multi-dimensional data cubes, time series and indicators, and key-value/no-SQL data). Further, it provides a detailed description of how processing and data fit together, forming a coherent, machine-actionable description of data provenance. By focusing on the role played by individual data points throughout the data provenance chain, it supports a wide range of data integration and re-use scenarios.

Robert Hanisch

Dr. Robert J. Hanisch is the Director of the Office of Data and Informatics, Material Measurement Laboratory, at the National Institute of Standards and Technology in Gaithersburg, Maryland. He is responsible for improving data management and analysis practices and helping to assure compliance with national directives on open data access.

Prior to coming to NIST in 2014, Dr. Hanisch was a Senior Scientist at the Space Telescope Science Institute, Baltimore, Maryland, and was the Director of the US Virtual Astronomical Observatory.

For more than twenty-five years Dr. Hanisch led efforts in the astronomy community to improve the accessibility and interoperability of data archives and catalogs.

Abstract: Digital Representation of Units of Measure (DRUM)

The DRUM Task Group (TG) within CODATA is tasked with facilitating discussions with the community and highlighting pain points about issues surrounding interoperability of units of measure. While there are many domain-oriented representations/encodings trying to standardize the digital reporting of units in machines, these are disparate, non-interoperable systems that pose a significant barrier to data interoperability. The problem is worse where no such systems are in place, so a fundamental activity to make units FAIR is required. This talk will discuss this topic, highlight some of the pain points, and suggest activities that attendees can get involved with to address the issue.

Stuart J. Chalk

Dr. Stuart J. Chalk is a Professor in the Department of Chemistry at the University of North Florida. Although trained as an analytical chemist, Dr. Chalk’s research now focuses on the areas of Chemical Informatics and Data Science. In particular, Dr. Chalk has projects focused on machine accessibility of solubility data, online enhancement to the IUPAC Gold Book, automated extraction and annotation of chemical property data from PDF files, and scientific data models. Dr. Chalk’s newest NSF-funded grant focuses on semantic integration of heterogeneous datasets from toxicology, medicine, materials, biodiversity, and chemistry.
