This is the fourteenth in the series of short statements from candidates in the coming CODATA Elections at the General Assembly to be held on 27-28 October 2023. Jianhui LI is a candidate for the role of CODATA President. He was nominated by China, WFCC, Task Group on FAIR Data for DRR.
Unleashing the Value of Data to Accelerate Transformative Science for Global Challenges: The Evolving Role of CODATA
“Just 12 percent of the 169 SDG targets are on track, while progress on 50 percent is weak and insufficient. Worst of all is the fact that progress has either stalled or even reversed on more than 30 percent of the goals.” Reported by UN Secretary-General António Guterres on the Midpoint Review of the 2030 Agenda for Sustainable Development (25 April 2023).
The release of the Midpoint Review of the 2030 Agenda for Sustainable Development casts a shadow of uncertainty over the prospects of human sustainability by the 2030 deadline. The discouraging figures alert us that “the world is far off track”, with a substantial distance still to be covered to meet our goals. Ongoing conflicts, the escalating impacts of climate change, and the lingering impacts of the COVID-19 pandemic further magnify this pressing challenge. The latest report of the Intergovernmental Panel on Climate Change (IPCC) states that global temperature has risen by 1.1°C above pre-industrial levels and will likely exceed the critical 1.5°C tipping point by 2035. Catastrophic and intensifying heat waves, droughts, floods, and wildfires occur more frequently than ever. Even worse, the world is presently in the midst of the most extensive species extinction event since the age of dinosaurs.
The unprecedented challenges facing humanity call for “socially robust” science and technology. Therefore, a revolutionary research paradigm, known as transformative science, is emerging to fully grasp the changing research scenarios, mitigate uncertainty, accelerate knowledge production, identify critical research turning points, and pave the way for innovation. Such transformative science boosts open science across domains and regions by leveraging the benefits of cutting-edge digital technologies such as big data, machine learning, and artificial intelligence. Data is at the forefront of all these technical applications. Thus, there is an urgent need for increased investment in data, science-based tools, and guiding policies. As open science becomes a “new normal” bridging the gap between science and society, collaborative efforts among global stakeholders should focus on promoting data flow and knowledge production within and beyond the scientific community and across the public-private interface.
CODATA, the Committee on Data of the International Science Council, endeavors to connect data and people to advance science and improve our world. Throughout its nearly sixty-year history, it has made substantial contributions to data policy, data science, data skills, FAIR data, open science, etc. The organization currently boasts a stable and efficient governance structure with over 50 members. Building upon this membership base, the Executive Committee, Task Groups, Working Groups, and Secretariat work collaboratively to maintain close connections with worldwide entities to uphold its mission’s continuity. Taking full advantage of its International Data Policy Committee (IDPC) and others, the organization promotes principles, policies, and practices on data integrity, ethics, FAIR data, and data diplomacy. Furthermore, robust international partnerships have been built on data issues, embracing collaborations with WDS, RDA, BIPM, DDI, etc. To promote data interconnectivity and interoperability within the scientific community, CODATA also exemplifies its leadership in the WorldFAIR project, the Global Open Science Cloud Initiative, etc. Moreover, CODATA builds a resilient and insightful platform for global data policies, resources, tools, skills, and domain best practices through collaborative activities, including international conferences, training workshops, and the Data Science Journal. These collective efforts demonstrate the substantial value of data across domains and regions, promote open data and knowledge sharing in the scientific community and society at large, and embrace diverse voices of data to narrow data gaps, thereby accelerating the progress of data-driven science.
The fundamental enabler of transformative science is a global data ecosystem that nurtures an open science environment where worldwide researchers can seamlessly collaborate to access, share, and analyze data and information to tackle pressing human challenges jointly. From my perspective, it should adhere to the principles of being Long-lasting, Inclusive, Knowledgeable, and Entrusted, abbreviated as “LIKE.” Such a LIKE data ecosystem is designed to ensure sustainability, allowing data to generate real and consistent value. Inclusivity encourages diverse participation from stakeholders across countries, regions, disciplines, and a broad range of social actors, ensuring the benefits of data are shared universally, inclusively, and equitably, thus “leaving no one behind.” This ecosystem also transcends the collection and curation of data to focus on generating meaningful information, knowledge, and wisdom, which is crucial for decision-making, scientific progress, and societal advancements. All these efforts should be built on trust, which is the fundamental element that backs up this framework. A LIKE data ecosystem should empower the transparent, trustworthy, and equitable use of data and information with ethical considerations.
The realization of such an ambitious vision requires considerable international and cross-disciplinary cooperation. To build a LIKE data ecosystem capable of effectively addressing the pressing challenges, CODATA plays an essential role in facilitating collective efforts among global stakeholders. It is imperative to cultivate stronger and more resilient cooperation networks across the globe. Working hand in hand with our international partners, we will collaborate to facilitate the seamless flow of global data, enable interoperability among various data platforms and commons, and contribute our share to making this world a better place.
Twenty Years of My CODATA Engagement
Serving as the Deputy Director of the International Research Center of Big Data for Sustainable Development Goals (CBAS, http://www.cbas.ac.cn/en/), I am honored to lead the SDG Big Data Platform (https://sdg.casearth.cn/en), a pioneering initiative dedicated to managing petabyte-scale scientific data on cloud infrastructures. Our primary mission is to harness the power of big data to propel SDG research and monitoring, specifically focusing on SDG 2, SDG 6, SDG 11, SDG 13, SDG 14, and SDG 15. This hands-on experience has showcased the transformative potential of big data, reinforcing my firm belief in its significant role in advancing the achievement of SDG objectives.
My aspiration for the CODATA Presidency is deeply rooted in my long-standing connection with CODATA and its community and my unwavering passion for open science and open data. My first CODATA engagement was at the CODATA Berlin Conference in 2004, and then I fulfilled my obligations in the next fifteen years as the CODATA-China Secretary General (2008-2018), CODATA Executive Committee Member (2014-2016; 2016-2018), and the Vice President (2018-2023).
Drawing upon the first-hand experience gained from scientific revolutions within developing countries, I have actively joined several boards within and beyond CODATA to amplify the voices of these countries. These include my engagement in the UNESCO Recommendation on Open Science (2021), the discussion on the Global Open Science Cloud Landscape (2021), the review of the Beijing Declaration on Research Data (2019), and the Open Data in a Big Data World (2015). I also led the national-level data policy research, contributing to the birth of the Chinese national rules and Measures for Managing Scientific Data (2018) and the regulations of the Chinese Academy of Sciences on the Management and Open Sharing of Scientific Data (2019).
As a computer scientist and engineering practitioner, I have led the development of several data infrastructures for open-science service delivery. I promoted the construction of CAS scientific data infrastructure and open data platforms (2009-2018) for big data analysis and large-scale data-intensive scientific research. I am in charge of the national research e-infrastructure in China (2018-), the CSTCloud (https://www.cstcloud.net), which provides integrated cloud solutions for life-cycle data curation in support of scientific discoveries. To break silos and facilitate collaborations among various research e-infrastructures spanning different domains and regions, I proposed the “Global Open Science Cloud” (GOSC) Initiative and obtained seed funding for it, and I have promoted it locally and internationally since 2019. The GOSC Initiative is an integral part of the ISC CODATA Decadal Program “Making Data Work for Cross-Domain Grand Challenges.” Thanks to the joint efforts by CODATA and CNIC, CAS, over 200 experts from more than 40 nations and regions are coming together to co-build a global open science environment to connect trusted research e-infrastructures for innovative scientific discoveries.
Last but not least, as a professor at the University of the Chinese Academy of Sciences (UCAS), I attach great importance to capacity building in empowering the future. Thus, I wholeheartedly contribute to the training programs under the CODATA umbrella. I initiated the International Training Workshop for Developing Countries on Scientific Data, jointly sponsored by CAS, CODATA, CODATA-China, and other prominent international partners, such as TWAS and IRDR. Seven workshops in the series have attracted over 230 participants from nearly 40 countries, with a focus on less represented groups, including youth, women, and delegates from developing nations. These training programs have helped CODATA extend its influence by attracting new national members and an ever-growing number of excellent CODATA alumni. Some trainees have grown into instructors in subsequent activities, and some are actively engaged as CODATA national representatives. To further cultivate a dynamic data-sharing culture in the local research community, I launched the first bilingual, multidisciplinary, open-access data journal in China, China Scientific Data, and the first open-data repository, Science Data Bank (ScienceDB, https://www.scidb.cn/en). Additionally, I initiated two annual conferences, the Annual National Scientific Data Conference (launched in 2014) and the International Symposium on Open Science Clouds (established in 2023), providing influential knowledge-sharing channels to connect the world for open science and open data dialogues.
Envisioning Future CODATA: My Tentative Roles in the Upcoming Presidency
We stand in a transformative era on the way toward achieving the SDGs outlined in the 2030 Agenda. The essential role and potential of research data are driving forces in this evolving world. To fulfill the ISC’s vision to “advance science as a global public good”, CODATA has prioritized four areas of data effort: “making data work for cross-domain grand challenges”, “improving data policy”, “advancing the science of data and data stewardship”, and “enhancing data skills”. I fully endorse these priority areas, and I believe they can be nurtured in a LIKE (long-lasting, inclusive, knowledgeable, entrusted) data ecosystem, which is fundamental to enabling transformative science for global challenges.
I am willing to lead the co-design and co-development of a practical roadmap to achieve all these goals above. Embracing insightful contributions from our members, representatives from the Executive Committee, Working Groups, Task Groups, and relevant stakeholders from around the globe, this roadmap shall align with mainstream global agendas, such as the Sendai Framework for Disaster Risk Reduction 2015-2030, the 2030 Agenda for Sustainable Development, and the Convention on Biological Diversity, etc. Innovative approaches, such as workstreams, could be adopted to converge diverse resources for productive service delivery. CODATA has established several task groups, working groups, and a series of flagship activities such as the WorldFAIR project, the Global Open Science Cloud Initiative, etc. However, the questions remain: How can we converge all these resources for rapid and robust responses in the changing world? How can we demonstrate the effectiveness of our data efforts in generating real value for society? It is imperative to explore innovative approaches, converging our resources to provide valuable and practical services in real-world scenarios. Looking into the future, it is highly valuable to invest our time and effort in exploring CODATA-branded solutions in global governance, especially in tackling global crises at hand and achieving sustainable development in the long run.
In addition, I will put more energy into the following actions:
(1) Promoting global collaborations on interconnectivity and interoperability for resilient data infrastructures
Cutting-edge digital technologies, such as artificial intelligence and federated learning, are powerful forces shaping a new scientific paradigm. Robust and reliable data infrastructures should be in place to support the application of such technologies and encourage alignment at different scales among potential international partners, thus helping to unleash the value of data. I will help nurture disciplinary, regional, and international collaborations across data infrastructures. Building on what we have accumulated through the WorldFAIR project, the GOSC Initiative, etc., such data infrastructures should fully implement the FAIR principles and be capable of achieving cross-domain and cross-boundary interoperability and of generating valuable knowledge through AI technology. I will work with global partners to promote open science advancements and international research collaborations.
(2) Expanding a diversified membership base towards a global collaboration network
I propose fully leveraging our membership mechanism, which entails enhanced connections with international organizations, expanding our current membership base, amplifying our members’ voices, and nurturing a talent pool of young researchers. Building on our membership base, CODATA will foster strategic collaborations with more international, national, industrial, and other like-minded entities, such as the United Nations and its affiliated organizations and ISC’s members. It is equally important to embrace more national-level presences in Central and Southeast Asia, Africa, and Latin America to address the current geographical imbalance. Meanwhile, the voices of current members should be amplified by setting up a regular communication mechanism for transparent knowledge sharing. For instance, CODATA’s National Committee Forum should continue serving as an important venue. Effective pathways should be established for members to be more directly and actively involved in CODATA’s strategic activities. CODATA will serve as a bridge, creating opportunities to enhance collaborations among its members. For example, the China-U.S. Roundtable on Scientific Data Cooperation (2006-2014) showcased successful practices.
(3) Enhanced capacity building towards cultural building
Implementing a well-designed data ecosystem cannot succeed without substantial capacity building and education in data literacy. Therefore, increased investment in capacity building should be prioritized, especially for young researchers and representatives from developing countries, particularly in the Global South. CODATA has conducted several flagship training activities, such as the CODATA-RDA Data Schools, DDI-CODATA Training Webinar Series, and the International Training Workshops co-hosted with CNIC, CAS. Building on this experience, a systematic framework for CODATA’s capacity building will be developed, including training resources, curriculum setting, evaluation criteria, etc. Guided by this framework, our members could jointly implement training activities to further enhance our global impact. With all these capacity-building efforts, we need to take one more step towards cultural building, conceptualizing and internalizing the culture of open science and open data throughout the overall research environment.
(4) Promoting sustainable growth and resource mobilization
As we look to the future, CODATA should take proactive steps to ensure its sustainable operation and growth, such as actively seeking diversified funding streams, optimizing resource utilization, and accelerating resource mobilization. Diverse funding sources may include international-level support from multiple sides, ensuring shared value and enhanced collaboration for open data beyond regional and national levels. To maximize our efforts, we could seek funding opportunities for initiatives that address global challenges, such as SDGs, climate change, disaster mitigation, etc. Moreover, to build up a more effective governing model, we may also encourage regional in-kind contributions from our member states and others. For instance, the development of the CODATA International Programme Office could be one of the practical solutions.
CODATA. 2019. The Beijing Declaration on Research Data. Available at: https://www.codata.org/uploads/Beijing%20Declaration-19-11-07-FINAL.pdf [Last accessed 9 October 2023].
CODATA. 2021. The Global Open Science Cloud. Available at: https://codata.org/initiatives/decadal-programme2/global-open-science-cloud/ [Last accessed 9 October 2023].
ISC. 2021. Science and society in transition. ISC action plan: 2022-2024. Available at: https://council.science/wp-content/uploads/2020/06/202110_ISC-Action-Plan_ONLINE.pdf [Last accessed 9 October 2023].
ISC, CODATA, IRDR, et al. 2020. Harnessing Data to Accelerate the Transition from Disaster Response to Recovery. Available at: https://council.science/wp-content/uploads/2020/06/Policy-Brief-Harnessing-Data.pdf [Last accessed 9 October 2023].
Science International. 2015. Open Data in a Big Data World. Paris: International Council for Science (ICSU), International Social Science Council (ISSC), The World Academy of Sciences (TWAS), InterAcademy Partnership (IAP). Available at: https://council.science/wp-content/uploads/2017/04/open-data-in-big-data-world_long.pdf [Last accessed 9 October 2023].
UN. 2015. The Sendai Framework for Disaster Risk Reduction 2015-2030, p. 13. Available at: https://www.undrr.org/implementing-sendai-framework/what-sendai-framework [Last accessed 8 October 2023].
UN. 2023a. The Sustainable Development Goals Report 2023: Special edition. Available at: https://unstats.un.org/sdgs/report/2023/The-Sustainable-Development-Goals-Report-2023.pdf [Last accessed 11 October 2023].
UN. 2023b. Times of crisis, times of change: Science for accelerating transformations to sustainable development. Available at: https://sdgs.un.org/sites/default/files/2023-09/FINAL%20GSDR%202023-Digital%20-110923_1.pdf [Last accessed 8 October 2023].
UNESCO. 2021. UNESCO Recommendation on Open Science. Available at: https://unesdoc.unesco.org/ark:/48223/pf0000379949 [Last accessed 8 October 2023].
Wilkinson M, Dumontier M, Aalbersberg I, et al. 2016. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018. DOI: https://doi.org/10.1038/sdata.2016.18.