The OSCARS CDIF-4-XAS Project has published its first deliverable, "Overview of X-Ray Absorption Spectroscopy standards, vocabularies (and ontologies), data formats and practices": https://doi.org/10.
About the report
X-ray Absorption Spectroscopy (XAS) research has expanded to become a set of widely used scientific methods with applications across Physics, Chemistry, Surface Science, Nanoscale Science, Biology, and Environmental and Earth Sciences. Over time, various scientific communities, research facilities, device providers, and software developers have created different formats and applications to store XAS data and describe it with metadata. These custom data formats serve their specific purposes, but they are not easily integrated or interoperable. Consequently, reusing XAS data from diverse sources—whether for further research, AI training, or the reproduction and replication of results—can be challenging. This is due to the diverse ways in which data is presented across domains and the limited availability of XAS data and metadata in public repositories. Using or combining these datasets often requires expert intervention and manual steps to map and process the data. Establishing commonly accepted standards for publishing XAS data is the first step in addressing these challenges.
This document provides a landscape analysis of current practices, standards, vocabularies, ontologies, schemas, and data formats used in the generation, curation, and publishing of X-Ray Absorption Spectroscopy (XAS) data, with a particular focus on efforts to create community standards that facilitate interoperability. We begin with an overview of current XAS techniques and their areas of application. Next, we discuss the development of custom formats for storing data and efforts to produce analysis techniques applicable regardless of data origin. We describe current efforts to generate standards that facilitate data interchange and integration, including various initiatives from research consortia aimed at creating interoperable formats, ontologies, and vocabularies. We conclude by observing the emerging consensus around using NXxas for multi-spectrum raw and processed data and XDI for single-spectrum data.
This landscape analysis is part of the CDIF-4-XAS project, which aims to pilot an improved model for XAS data interoperability and reusability across scientific disciplines through the Cross-Domain Interoperability Framework (CDIF) developed by CODATA as part of the WorldFAIR project. The CDIF-4-XAS project seeks to enable seamless integration of XAS data into data catalogues and analysis frameworks in a universally interoperable manner, making the data easier to reuse, compare, and incorporate into larger studies, or to use for training AI applications. We therefore conclude by introducing the next phase of work, which will use CDIF to develop interoperability profiles that enable the integration and mapping of NXxas and XDI data: the CDIF-4-XAS interoperability model.
About the OSCARS CDIF-4-XAS Project
The CDIF-4-XAS project, "Describing X-Ray Spectroscopy Data for Cross-Domain Use", will enable new science by making it easier to access, combine, and reuse XAS data across research infrastructures (RIs) and disciplines.
CDIF-4-XAS will enhance the interoperability and reusability of XAS data by applying the Cross-Domain Interoperability Framework (CDIF), a set of guidelines and practices for using domain-agnostic standards to support the interoperability and reusability of FAIR data, especially across domain and institutional boundaries. By embracing FAIR principles, the project aims to streamline the sharing of XAS data, thus enabling more efficient data integration across RIs and scientific domains, including life sciences, chemistry, and environmental sciences.
The CDIF-4-XAS project is funded as a cascading grant by the OSCARS (Open Science Clusters Action for Research and Society) project through its first open call.
Funded by the European Union, Grant Agreement 101129751, through a Third Party Project Agreement.