Mission and objectives
Data acquisition rates are growing rapidly and now reach the petabyte scale, creating intense demand for data curation capacity. The resources needed to build and maintain this capacity are often not sustainable, which shows that data handling must change. Facilities that handle Big Data are among the most extreme and advanced examples of data curation because they depend on cutting-edge concepts. The Task Group on Big Data Curation and Curation Sustainability will identify challenges and best practices in the curation of Big Data, consult on possible approaches and existing lessons learned, and supply recommendations for curating large amounts of data at large-scale data facilities. It will communicate closely with the International Data Policy Committee (IDPC) to exploit synergies and ensure knowledge transfer.
The Task Group also raises the topic of FAIRness and FAIR values for Big Data, since accessibility and reusability can conflict with making Big Data manageable through methods such as lossy compression. Lossless data curation guarantees reusability, but it demands significant financial and environmental resources. The Task Group will therefore ask whether all data must be saved and which data should take priority, with the aim of finding a compromise between data loss, curation capacity, scientific reproducibility, and global access to data.
Significance
The scientific merit and impact of the proposed Task Group will be demonstrated through the presentation of new best practices that will set the stage for initiatives in Big Data curation and sustainability. This contribution will advance methods and practices related to Big Data management and highlight the value and importance of that work. Furthermore, these guidelines will align with CODATA’s mission and its commitment to uphold the highest scientific standards.
Impact
Establishing a network of scientists from diverse disciplines will add significant value, fostering dialogue around diverse approaches to Big Data curation, often unexplored in their respective fields. This exchange of knowledge is expected to drive innovation, enhance the sustainability and impact of large scientific facilities, empower their stakeholders, and ultimately benefit society as a whole. Active collaboration in developing a common strategy will strengthen the Big Data community, resulting in forward-looking outcomes that extend beyond the immediate objectives of the Task Group.
The recommendations will improve the conditions for sustainable data curation at Big Data facilities and help improve and promote FAIRness in Big Data by achieving consensus across disciplines. By identifying best practices and recommendations for Big Data curation, we aim to give scientists the best means of overcoming their data acquisition limitations, so that they can work unhindered on the grand challenges facing humanity.
Planned activities and outputs for 2025-2027
- A policy brief titled “Latest Advancements and Challenges of Big Data Curation and Curation Sustainability”. This document will result from in-depth discussions among members of the Task Group, synthesizing insights from recent literature and best practices from leading institutions engaged in Big Data management. Experts from diverse fields (including astronomical observatories, particle accelerators, and large-scale photon facilities) will share their knowledge and exchange ideas to establish a coherent vision for future approaches to Big Data management.
- The Task Group will work as a single, structured team, moving together from topic to topic. The Task Group term will be divided into three phases:
- Phase 1 (Month 1-8): Development of a policy brief on “Latest Advancements and Challenges of Big Data Curation and Curation Sustainability”, based on a thorough review of up-to-date literature. Task Group members will compile literature and reports on Big Data curation, review them, and identify concepts, ideas, and outlined challenges. Once the findings have been gathered in a single document, they will be converted into the policy brief. Data policy recommendations will relate directly to the fields of science where Big Data is produced and managed, including astronomy, high-energy physics, and crystallography/material analysis.
- Phase 2 (Month 8-16): Dissemination of the policy brief to stakeholders, selected participation in high-level science diplomacy events, and direct communication with science policy partners of CODATA such as the Global Science Policy Unit of the International Science Council or the ICTP South American Institute for Fundamental Research (ICTP-SAIFR).
- Phase 3 (Month 16-24): The evaluation and consultation carried out in the second phase will yield critical insights and perspectives, especially from the science policy side. During the third phase, these insights will be incorporated into the policy brief to develop a set of guiding recommendations on Big Data Curation and Curation Sustainability.
Contacts
Co-chairs:
Andy Götz, ESRF and EOSC-A, France
Kamil Dziubek, Universität Wien, Austria
The TG Secretary:
Nadine Fischer, ESRF, France
Page last updated: 2026-02-13.