Author Archives: codata_blog

Takashi Gojobori: Towards CODATA Initiatives for Data-centric Sciences and Technologies for “Big Data” and “Small Data”

In the statement below, in support of his candidacy as CODATA President, Takashi Gojobori presents his vision of CODATA initiatives for data-centric sciences and technologies for ‘Big Data’ and ‘Small Data’, which he describes as ‘beyond 50 years of CODATA’. It is also available as a pdf.

Vision for CODATA

Following its mission, CODATA works to improve the quality, reliability, management and accessibility of data of importance to all fields of science and technology. I would like to work, together with the elected EC members, to ensure that CODATA takes a strong initiative in dealing with Data-centric Sciences and Technologies for the so-called “Big Data” and “Small Data.” Taking into account the fact that CODATA will celebrate its 50th anniversary in 2016, I would like to try to make CODATA more visible on a global scale, in both developed and developing countries.

Challenging issues for CODATA

The most important challenge for CODATA in a practical sense is to carry out the CODATA Strategic Plan 2013-2018 as scheduled. Of course, we may need some flexibility, depending upon changes as time goes on. However, this strategic plan was established through extensive discussion and consensus among the CODATA members, so I would like to place the greatest emphasis on carrying out this Plan.

Taking into account the CODATA Strategic Plan 2013-2018, I think that the following issues can be addressed in CODATA, under the common recognition that data is playing a central role in science and technologies. In particular, so that we can take concrete action, I would like to work towards a situation in which most of the issues listed here are recognized as common understandings among the CODATA members.

Importance and significance of Data-centric Sciences and Technologies

I believe that CODATA can take a more active stance in promoting Data-centric Sciences and Technologies, recognizing their importance and significance.

Computing resource and networking for Data-centric Sciences and Technologies

There is no doubt that we should encourage all stakeholders to ensure effective use of computing resources and networking for Data-centric Sciences and Technologies.

Data explosion and care of “Small Data”

Although we know that the time of the data explosion called “Big Data” has come, we must also take greater care of “Small Data”, because they are also precious and indispensable in most cases.

Disaster and Data Management

We recognize that disaster issues demonstrate direct connections between society and data. Data management for disasters should continue in CODATA activities.

Environments and Data Management

In the development of economic activities, environmental factors have already become a central matter of debate between developing and developed countries. However, these debates should be conducted using scientifically justifiable data.

Data Standardization

It is obvious that data standardization is essential for ensuring global activities of science and technologies. Thus, CODATA should continue its global contributions to Data Standardization.

Data Visualization

Data Visualization is becoming more important, both to ensure further development of Data-centric Sciences and Technologies and to promote outreach activities to society.

Data citation and commercial publications

In collaboration with publishers, we need more detailed discussions on Data citation and commercial publications.

Data Consumption and Necessity of Archives

Some data are used up, much like consumables, whereas other data should be kept as archives. We need clearer concepts and definitions of those features of data. We have to discuss the issues of Data Consumption and Necessity of Archives.

Data literacy and Education of Data-centric Sciences

CODATA can promote its activities on Data literacy and Education of Data-centric Sciences and Technologies, particularly in the developing countries.

CODATA for Developing countries

CODATA contributes to developing countries by promoting educational activities. Moreover, economic development can be achieved through the formation of a data-oriented society, in which CODATA can be helpful in every aspect of the related domains.

CODATA and WDS

One of the most important tasks in the coming years is close collaboration with WDS. The liaison between CODATA and WDS can be placed on a firm footing.

Data Journal of CODATA

The present Data Journal of CODATA can be transformed into a more active and practical form, taking into account the development of open access journals, including those of well-known commercial publishers.

Discussion on CODATA Academy of Data Science

We should continue the discussion on the CODATA Academy of Data Science.

CODATA as Practicing Organization

CODATA can play a role in policy-making and in promoting the implementation of policy through various organizations and opportunities.

What I would like to try to achieve in the coming 4 years.

  • I would like to increase the global visibility of CODATA by taking the initiative in Data-centric Sciences and Technologies for the so-called “Big Data” and “Small Data”.
  • I would like to carry out the CODATA Strategic Plan 2013-2018 as scheduled.
  • I would like to deepen the close collaboration between CODATA and WDS.
  • I would like to conduct a 50th anniversary symposium to reinforce the vision of CODATA.
  • I would like to promote the activities of the Task Groups and Working Groups of CODATA.

In short, I strongly believe that the reason for CODATA’s existence is to help foster and advance science and technology through developing and sharing knowledge about data and the activities that work with data.

That is all.

Kostiantyn Yefremov: Statement in Support of Candidacy for CODATA Executive Committee

I have accepted the proposal of CODATA TG PASTD to be nominated as a member of the CODATA Executive Committee with respect and gratitude. I consider this an opportunity to be more fully involved in the initiatives of CODATA, which are aimed at resolving global threats.

Let me briefly introduce myself. I am a specialist in Computer Science and have been the head of the World Data Center for Geoinformatics and Sustainable Development (WDC-Ukraine) for 7 years. I also have an MSc in Public Administration.

My active participation in the activities of CODATA started in 2008, when I joined the Executive Board of the Ukrainian National Committee for CODATA and, as Deputy Chair of the local organizing committee, took part in organizing the 21st International CODATA Conference, which took place in Kyiv, Ukraine. Since 2010 I have been participating in the activities of CODATA TG PASTD (especially in long-term data preservation and open access in Ukraine, and open access to data for Sustainable Development).

I am also responsible for organizing and coordinating interdisciplinary research initiated by the Ukrainian National Committee for CODATA and WDC-Ukraine, related to the need for system-level harmonization and analysis of heterogeneous data. I would like to highlight several projects aimed at the study, quantitative evaluation and analysis of sustainable development processes of countries and of Ukrainian regions in the context of the quality and safety of human life.

My primary scientific and professional work is connected with developing approaches and methods for the integration of data and applications, and for designing and developing problem-oriented distributed systems and databases.

If elected to the Executive Committee, I will direct my efforts to developing and promoting methods and approaches to create an open and permanently available information resource of scientific data related to sustainable development.

David Patterson: Statement in Support of Candidacy

This is the fourteenth in the series of short statements from candidates in the forthcoming CODATA Elections. David Patterson is a new candidate seeking election to the CODATA Executive Committee.  He is nominated by the International Union of Biological Sciences (IUBS).

IUBS is delighted to have the opportunity to seek representation of the International Union of Biological Sciences on the CODATA Executive Committee. The new IUBS agenda is to promote ‘Unified Biology’, and we expect that biodiversity informatics will play a key role in achieving this goal. The development of infrastructure requires stable long-term funding, and community-wide participation to provide the community as a whole with free, open, and relevant services. A close association with CODATA through David Patterson’s participation in the Executive Committee will allow IUBS to call upon the expertise within CODATA to help guide the emergence of a unifying infrastructure for digital information, to strengthen the interactions within the bio cluster of ICSU, and to help promote the biodiversity sciences within CODATA.

Taxonomy is the traditional discipline within biology that catalogues our understanding of the diversity of life, and forms the framework around which we organize much of our thinking about different types of organisms. One of David’s aims is to embed taxonomic logic and content in an open and free infrastructure that will be able to index distributed content about species in heterogeneous data environments and to organize that information as a first step to its analysis. He led the implementation team for the Encyclopedia of Life, a large scale project that successfully demonstrated how taxonomic expertise can be called upon to create an organizational framework for digital information on all forms of life.

He has been a member of the International Committee on Bionomenclature, is currently a member of the executive of IUBS, and a commissioner for the International Commission for Zoological Nomenclature. His academic career includes institutions in England, Australia and the US, and he has been involved in research projects relating to taxonomy, evolution, ecology and informatics in those countries and in European-funded projects. He was closely associated with the drafting and release of the Bouchout Declaration (bouchoutdeclaration.org), which provides an opportunity for individuals and institutions in the biodiversity sciences to declare their support for open and free access to biodiversity information. He is currently active in the Research Data Alliance and in the development of the Biodiversity Data Interest Group.

David Patterson’s full CV is available here.

Der-Tsai (DT) Lee: Statement in Support of Candidacy for Re-Election to CODATA Executive Committee

This is the thirteenth in the series of short statements from candidates in the forthcoming CODATA Elections. As a current CODATA Executive Committee member, DT Lee is seeking re-election. He is nominated by the CODATA Committee of Academia Sinica, Taipei.

I welcome this precious opportunity, representing CODATA Academia Sinica Taipei, to stand for re-election to the CODATA Executive Committee. I have been personally involved with CODATA since 2007, serving as the Chair of CODATA, Academia Sinica, Taipei, until 2012, when I was elected to be a Member of the CODATA Executive Committee.

With the support of the General Assembly, following the 23rd CODATA Conference on “Open Data and Information for a Changing Planet” held in Taipei in 2012, I was elected to serve on the Executive Committee. In the past two years as an EC Member, I have been involved in all dimensions of CODATA’s mission and served as the liaison to the Global Roads Task Group and to the Early Career Data Professionals (ECDP) Working Group on topics related to Data Science Capacity Building. I have encouraged the Global Roads Task Group to broaden usage scenarios of global ROADS Datasets, and supported ECDP discussions and development, addressing Data Science Capacity Building from CODATA’s perspective on a global level.

These activities are based on my past experience involving cultural heritage data preservation and dissemination of the Taiwan e-Learning and Digital Archives Program (TELDAP) for 12 years and interactions with international counterparts in North America (USA and Canada), Europe (Germany, Netherlands, Croatia and Ukraine), and South Africa.

Promoting New CODATA Initiatives in Cultural Heritage Data and Agriculture Data

With the expanded efficiency and agile leadership of the Secretariat, CODATA now has deeper engagement with the global scientific and data science communities. Inspired by the ICSU Future Earth global research platform, if re-elected to serve on the Executive Committee, I would like to promote, connect and push forward, among other initiatives of CODATA, two major challenging tasks involving global cultural heritage data and agriculture data.

Taiwan started to invest in digital preservation of cultural heritage and the creative industries in 2002. During the decade-long national initiative, I served as the Deputy Program Director of TELDAP, sponsored by the National Science Council (now Ministry of Science and Technology). We realized the power of information and communication technology (ICT) and utilized the Internet and Web to preserve and disseminate our cultural heritage and revitalize the past. Joining respected European and Canadian colleagues in the Canadian Heritage Information Network, Digital Cultural Content Forum & Culturemondo Network, Taiwan was one of the earliest participants in the global data commons, with abundant collections and technologies. The global trend of “Linked Open Data” goes one step further and creates the “Deep / Data Web”. Cultural data such as those collected in Europeana.eu, the Library of Congress National Digital Library Program of the USA, TELDAP of Taiwan, and international museum networks would re-shape the cultural landscape just as the Internet has done for us. Active minds, with their creativity, professional training and collaboration experience in cultural institutions, would be an important new strength to instil into the CODATA community and would enhance CODATA’s visibility.

Agriculture data, in my view, is another essential dimension through which CODATA can connect with more disciplines and involve many more countries. In some of the past Asia Pacific Advanced Network (APAN) meetings, Japanese colleagues made efforts to demonstrate and provide an insightful, holistic vision to integrate agriculture, food and environment. In densely populated Asia, more governments and international organizations are seeking to connect to one another with the help of the latest ICT developments. How could CODATA play a more important role in engaging and connecting these efforts? Right now I am leading a Taiwanese agriculture-focused comprehensive university, and I have sensed this urgent need for global connection to change our world from the root and address the issues of sustainable development.

Promoting Open Source Software

I have been promoting the open source software movement with the establishment of an Open Source Software Foundry (OSSF) at the Academia Sinica, and open cultural data policy at the governmental level in Taiwan since 2002. We will invite Jose Murilo (@josemurilo), a senior officer of the Brazilian Ministry of Culture, in 2015 to share and exchange more than 10 years of experience in promoting Free Culture policy. Over more than a decade, the progress of science has been driving global societies in many directions. As a witness and practitioner myself, I was elected Member of The World Academy of Sciences (TWAS) (2008), was awarded the Humboldt Research Award (2007), was appointed Ambassador Scientist (2010) by the Alexander von Humboldt Foundation, Germany, and was a recipient of the Distinguished Alumni Educator Award, University of Illinois, Urbana-Champaign, Illinois, USA (2014). The awards connect me to broader communities, where I’d like to contribute and give back to global society with a new data science vision.

Der-Tsai Lee’s Personal Profile

Dr. Der-Tsai Lee is Academician and Distinguished Research Fellow of the Institute of Information Science, Academia Sinica, and is currently on leave from Academia Sinica to serve in the capacity of President of National Chung Hsing University, Taiwan. He was the Chair and a Standing Committee Member of the CODATA Taiwan National Committee from 2007 until 2012, when he was elected to be an Executive Committee Member of CODATA, International Council for Science.

He has held other positions as Distinguished Research Chair Professor, Department of Computer Science & Information Engineering, National Taiwan University, and Distinguished Chair Professor, Department of Computer Science and Engineering, National Chung Hsing University, Taiwan.

Recently he has led National Chung Hsing University, internationally well-known in agricultural science, to develop interdisciplinary collaborative research on agricultural big data. The integration of agriculture-related environmental data, economic data, census and policy data, as well as biodiversity and genomic data in Taiwan, for example, could be a solid base for pursuing Asia-wide international collaboration, where China, Japan, Thailand, Malaysia and other Asian countries have already held various conferences dealing with issues concerning sustainable agriculture in the past decade.

The decade-long national initiative on digital cultural heritage, TELDAP, headquartered at Academia Sinica, is a clear example of interdisciplinary collaboration between ICT and the humanities and social sciences, encompassing digital archives and e-cultural data. His leadership has set an example of collaboration among computer scientists and experts in natural science, humanities and social sciences. Serving as the liaison to the gROADS Task Group of CODATA, Dr. Lee got to know more about their work in geographic information science, and suggested that the Task Group include a new member, Prof. Ming-Der Yang, Director of the Center for Environmental Restoration and Disaster Reduction, National Chung Hsing University. With this addition, Taiwan’s environmental data related to natural disasters like earthquakes and typhoons, and technologies for sustainable development for disaster reduction, can be integrated. These developments and endeavours are relevant to CODATA’s Strategic Plan 2013-18.

Dr. Lee helped establish the Research Center for Information Technology Innovation (CITI) of Academia Sinica, with a mission to facilitate collaboration across disciplines and between academia and industry. He continues to serve as a Standing Committee Member of the CODATA National Committee, Academia Sinica, and forges the dialogue between IRDR, digital culture, science commons, and biodiversity resources, with a vision to encourage Linked Open Data enabled knowledge aggregation, analytics and data discovery.

As a globally well-respected senior scientist with an international career trajectory, Dr. Lee would be able to involve and encourage stronger networking, collaboration and mutual support among CODATA National Members or International Scientific Unions and early career data professionals, in data policy, data citation, and data science and technologies. A more detailed biography of Dr. Lee can be found at http://www.codata.org/about-codata/executive-committee

Anil Kumar: Statement in Favour of Candidacy for CODATA Executive Committee

This is the twelfth in the series of short statements from candidates in the forthcoming CODATA Elections. Anil Kumar is a new candidate seeking election to the CODATA Executive Committee, although he has been strongly involved as the Convenor of the National Organising Committee of SciDataCon 2014. He is primarily nominated by the Indian CODATA National Committee.

Let me introduce myself. I am a physical chemist by qualification and the Chairman of the Physical and Materials Chemistry Division of the CSIR-National Chemical Laboratory, situated in Pune, in central western India. I am also a member of the Indian National CODATA Committee constituted by the Indian National Science Academy (INSA), New Delhi.

My major activities are related to the physico-chemical properties of materials, liquids, melts and liquid crystals. I have devoted several years of my career to developing empirical methods to correlate and predict these and thermodynamic properties. I strongly believe that the accuracy and dependability of the data are crucial to CODATA activities. Back in 1998, I delivered an invited talk on the trustworthiness of thermodynamic data of materials at the International CODATA Conference held in New Delhi, India. I would like to devote my time and efforts to improving and developing newer methods to judge the quality of data in the area of the chemistry of materials.

If I am elected to the Executive Committee, I will intensify my efforts in developing methodologies, and awareness among those collecting data, to provide a reliable data bank on materials.

Alena Rybkina: Statement in Favour of Candidacy for CODATA Executive Committee

This is the eleventh in the series of short statements from candidates in the forthcoming CODATA Elections. Alena Rybkina is a new candidate seeking election to the CODATA Executive Committee, although she has been centrally involved in the CODATA Task Group on Earth and Space Science Data Interoperability.  She is nominated by the CODATA National Committee for Russia.

Alena Rybkina is chief of the Innovation Technologies Sector of the Geophysical Centre of the Russian Academy of Sciences (GC RAS). She is a young but internationally recognized specialist in the implementation of modern information and visualization technologies in scientific research and the industrial domain. Important goals of her activity are studies of data technologies and the development of spherical projection systems aimed at efficient analysis, demonstration and popularization in data research and management.

Alena is actively involved in the operations of the CODATA Task Group “Earth and Space Science Data Interoperability”. She co-authored the “Atlas of the Earth’s Magnetic Field”, which was one of the outstanding Task Group achievements in 2013. She is experienced in the organization of international and national events devoted to the promotion of data science in Russia and other countries. In particular, she was the principal organizer of the conferences “Electronic Geophysical Year: State of the Art and Results” in 2009, Pereslavl-Zalessky (http://egy-russia.gcras.ru/index_new_e.html), “Artificial Intelligence in the Earth’s Magnetic Field Study. INTERMAGNET Russian Segment” in 2011, Uglich (http://uglich2011.gcras.ru/index_e.html) and “Geophysical Observatories, Multifunctional GIS and Data Mining” in 2013, Kaluga (http://kaluga2013.gcras.ru/index_eng.html). She takes part in numerous international projects, including those developed by the International Institute for Applied Systems Analysis (IIASA, Laxenburg, Austria).

Alena is a geologist currently working on paleoenvironmental reconstructions and studies of the Earth’s magnetic field. She has taken part in geological expeditions in Russia, Ukraine, France and Italy to collect paleomagnetic data and to establish correlations between changes in magnetic data and global astronomical cycles.

As an active young researcher, she could become an effective member of the CODATA Executive Committee, focusing on the organization and structuring of CODATA research projects, bringing her experience in geoscience data management, and involving young scientists as she did for the Task Group.

Bonnie Carroll: Statement of Interest, Candidacy for Executive Committee

This is the tenth in the series of short statements from candidates in the forthcoming CODATA Elections. Bonnie Carroll has served on the CODATA Executive Committee since 2012 and is seeking re-election. She is nominated by the US CODATA National Committee.

My first international CODATA meeting was the 1985 meeting in Jerusalem. Since then I have been involved with both International CODATA and the U.S. National Committee for CODATA. I’ve held positions within CODATA, including the program committee, symposium coordinator, speaker, U.S. National Representative and Co-Chair of the Data Citation Standards and Practices Task Group. In addition, it has been my honour to serve on the Executive Committee to International CODATA for the past two years.

Through all these years I have watched the importance of data as an asset grow in recognition and significance. Today in the fields of science we live in a data intensive world. For the last 40 years, CODATA has been an international resource and focal point for policy, standards, and practices in good data management. Now there are many organizations that have entered the field and deal with aspects of the data management lifecycle. I have been involved with several of these other organizations. In the US, I have been executive director for the White House Office of Science and Technology Policy Interagency WG on Digital Data; am the long-standing executive director of the federal interagency CENDI group, which addresses federal information S&T policy issues and programs; have been the executive secretary for the US delegation to the Global Biodiversity Information Facility; a member of the Board on Research Data and Information at the National Academy of Sciences; and many other US and international research data and information organisations and activities as the CEO of a private sector information management and consulting organisation, International Information Associates (IIa).

It is critical for CODATA to be both a leader of and a partner with these other organisations as we work to improve the stewardship of our data resources. We have only to look at the important example of our growing partnership with the ICSU World Data System and the establishment of the SciDataCon conference. As a member of the Executive Committee, I have been an active participant in both our strategic and operational deliberations. We need to continue to work on the Task Group structure and selection process to ensure that we are covering the important topics in data management. I believe that we should encourage more active involvement of the ICSU Scientific Unions. And we need to work hard to learn the lessons of the first SciDataCon, so that we make it the preeminent international data conference. If I am elected to serve another term, I will be committed to furthering CODATA as an effective and vital leader towards the future of data management.

Kassim Mwitondi: Statement in Support of Election to CODATA Executive Committee

This is the ninth in the series of short statements from candidates in the forthcoming CODATA Elections. Kassim Mwitondi is a new candidate seeking election to the CODATA Executive Committee. He is nominated by the OCTOPUS Task Group, of which he is a co-chair.

My quest to become an ordinary member of the CODATA Executive Committee is motivated by a series of events and personal experiences, particularly in the last couple of decades. For instance, it is now widely acknowledged that global challenges such as climate change, food security and terrorism can only be addressed through globally co-ordinated initiatives. Across the globe, data scientists have woken up to the need to develop novel analytical frameworks for coping with the dynamics of modern-day, highly voluminous, multi-faceted data. However, cohesive strategies for capturing, tracking and modelling such data are still in their infancy. Thus, one of my main motivations in applying for a place on the Committee is to get involved in CODATA’s long-term interdisciplinary initiatives to address global issues through data-driven research in a spatio-temporal context. As the current chair of OCTOPUS, a Task Group that has embarked upon mining space and terrestrial data for improved human livelihood, I am well acquainted with ICSU-CODATA-WDS activities. For over fifteen years I have established strong interdisciplinary teaching, research and consulting relationships with colleagues across all continents. By joining the CODATA Executive Committee, I will bring not only a wealth of interdisciplinary skills in dealing with various phenomena affecting human livelihood through data modelling, but also extensive multi-cultural skills necessary for widening CODATA’s scope into new regions.

I have always perceived the Middle East and Africa as the missing link in the core activities of CODATA and WDS, and so it is my vision to familiarise young scientists and researchers in those regions with CODATA and WDS core activities via OCTOPUS. It is my hope that such a vision will both provide capacity building and help fulfil CODATA’s initiatives: bridging the global scientific data digital divide and forging new frontiers in Data Science and Technology. I have already established strong working relationships with institutions and funding bodies such as the Qatar National Research Fund, through its flagship programme NPRP, and the Wellcome Trust, through its recently launched programme DELTAS Africa. To cater for region-specific needs, OCTOPUS has now split its research focus into two main streams: modelling of space-terrestrial phenomena and modelling of socio-economic and cultural dynamics. One reason for this strategic move has been the fact that the consequences of globalisation and urban life constitute a complex system, the conceptualisation of which requires equally intricate data solution models. Human activities, whether physical or non-physical, urban or rural, generate large volumes of data that can, using data acquisition and modelling techniques, be harnessed and converted into knowledge. In parts of the world, capturing, interpreting and monitoring dynamic interactions among urban data attributes relating to, say, diseases, socio-economic status, education, gender, crime, life style, diets, migration, urbanisation, globalisation, stress, pollution and many others have greater priority.

I obtained a PhD in Statistical Data Mining from the School of Mathematics of the University of Leeds in 2003, and I also hold an MSc in Informatics from Sofia (1991) and an MSc in Finance from Strathclyde (1997). I am a member of several professional bodies and I am on the editorial boards of several international journals and data repositories. My research interests are in developing enhanced methods for the extraction of knowledge from multi-faceted data related to various phenomena that affect human livelihood, which fits in well with my vision above. I have published extensively in peer-reviewed journals and presented at national and international conferences across the globe. I was one of the first researchers to express interest in ICSU-ROA’s Health and Wellbeing Programme a few years ago, with a concept paper on developing centralised adaptive data mining applications to uncover patterns, interactions and dynamics of health issues across the African continent. Between 2008 and 2011 I was part of an international consortium that characterised, documented and archived distributional properties of clay soil chemicals across the African continent.

Tim Dye: Statement in Support of Candidacy

This is the eighth in the series of short statements from candidates in the forthcoming CODATA Elections. Tim Dye is a new candidate seeking election to the the Executive Committee.  He is nominated by IUAES, the International Union of Anthropological and Ethnological Sciences.

The “human” aspect of data is frequently absent or minimized in technological and scientific communities, though we all directly experience the primacy of critical relationships at the local, national, and international levels that enable the sharing of ideas, methods, and resources. I represent the International Union of Anthropological and Ethnological Sciences (IUAES) on CODATA, and, as such, bring a two-fold emphasis on 1) the human relationship with data and 2) the value of curating, analyzing, and using unstructured (qualitative) data.

While we who work in informatics and data science often focus predominantly on technical and mechanical aspects of capturing, analyzing, and disseminating data, my own interests relate to the human relationships that surround this work – such as the political, social, and cultural aspects of data, data diplomacy, and frameworks for community ownership of data (for example, among indigenous populations). As a medical anthropologist, I am committed to exploring areas where the ethical generation and use of data (particularly but not only scientific data) can help improve the human condition around the world, respecting equity and justice.

Without deliberate inclusion of issues surrounding the human aspects of data within data-related policymaking, technological innovation, and global scientific organizations, we exacerbate inequity and waste of resources. The very existence of CODATA represents acknowledgement that international and inter-organizational cooperation is valued and central to global conversations around science and technology, and I believe that the continued infusion of social science and community perspectives within CODATA’s framework augments its ability to create cooperation and global value. These central values of CODATA align well with my own, and with the International Union of Anthropological and Ethnological Sciences.

Geoffrey Boulton: Science in a Data-Intensive Age

In the statement below, in support of his candidacy as CODATA President, Geoffrey Boulton presents his vision of Science in a Data-Intensive Age and the challenges and agenda this sets for CODATA.  This text is also available as a pdf.

The last two decades have seen unprecedented growth in the capacity to acquire, store, manipulate and instantaneously transmit data. It is a world historical event that is changing the lives of individuals, societies and economies. It has major implications for science, research and learning that are far more profound and pervasive than those of the earlier, analogous revolution in data storage and human communication, that of Gutenberg’s invention of the printing press in the 1430s.

These developments offer profound challenges to science, and because of this to CODATA in adapting its historic role as the principal coordinator of data initiatives at the international level to focus on new, data-intensive ways of doing science.

Much of the challenge comes from so-called “big data”, which are “big” because of the volume that systems must ingest, process and disseminate; because of their diversity and complexity; and because of the rate at which data streams in or out of the systems that handle them. Terabyte-sized data sets are now common in Earth and space sciences, physics and genomics etc. The exploitation of these opportunities depends on the development and use of many technical solutions.

The data explosion and our capacity to combine, integrate and analyse large, varied and complex datasets offer powerful new ways of unravelling complexity, improving forecasts of system behaviour and detecting patterns in phenomena that have hitherto been beyond our capacity to resolve. It is the Google way of doing science. Such data-intensive science will at least complement the classical approach of hypothesis-theory-test. Some even argue that it will replace it. In any case it requires that we understand the mathematical and statistical basis of data manipulation. It is essential to develop new tools and new techniques to exploit this understanding, and to adopt new habits of working that have an ethos of open access to data in order to facilitate re-use, re-combination and re-purposing. Openness also facilitates more effective dissemination of scientific concepts and the evidence for them, in society and in education. It has the potential to change the social dynamics of science, and contribute towards the evolution of science as a public enterprise, rather than one conducted behind closed laboratory doors.

The Challenges that CODATA Must Address

These issues naturally pose major questions about the way science is done and also define the major issues to which CODATA should apply itself:

  • How do we understand the deep mathematical basis of “data science” and how can we articulate this with clarity for scientists, technologists, businesses, policymakers and the public? What does it mean to be a scientist and researcher in a digital age?
  • How can we maintain the open data principle in a data-rich world to ensure that the data underlying scientific concepts are open to scrutiny and replication or invalidation?  (The danger is that data manipulation takes place within a black box that is not open to scrutiny, in which case, science ceases to be science).
  • How can we incentivize and enable the data sharing, re-use and cooperation required to efficiently use multiple data sources and to address global challenges effectively?
  • How should we exploit the capacities of machines to use learning-based algorithms to match patterns and to interpret complex information in ways that are accessible to human cognition and thereby aid and not by-pass human creativity?
  • How is cyber-security to be maximized and how is personal privacy to be respected in the use of data for research?
  • How should the interface between publicly-funded science and rapidly advancing commercial data science be stimulated and exploited?
  • How are universities, institutes and other places where science is done best encouraged to develop proactive and creative management of their data and their support for data-intensive science?
  • How should principles and processes of data science be embedded in scientific education and training?
  • What opportunities does the digital age offer for more inclusive, democratic ways of producing scientific knowledge and in playing a transformative role in society?

A bold vision for CODATA

These major tasks require boldness, vision and organization. They are so central to the issues of data use and integrity that the credibility and relevance of CODATA would be seriously in question were it not to address them. The priorities for CODATA should be the analysis, articulation and communication of answers to these high-level questions, and engagement with the national and international bodies that have the capacity to implement them. CODATA should continue to collaborate with bodies such as the Research Data Alliance (RDA) – which works to build solutions to promote data interoperability – and with the World Data System (WDS) – which ensures that data are managed and made available for the long term – with the objective of identifying and advocating those international norms and standards for which there is a need.

CODATA should be a forum to advance understanding of data-intensive science and to advocate solutions to questions that this developing science raises. It should be a means of stimulating a response to them within national science systems. It is well placed to do these things through the structures of the International Council for Science (ICSU), to which it reports, with its two orthogonal axes of membership:

  • The scientific unions in ICSU represent international science communities and articulate the principles and priorities of their disciplines. CODATA should work with them to promote understanding and change and to advocate good practice from those that have adapted to the data-intensive challenge.
  • The national representatives in ICSU are well placed to influence national scientific structures, priorities and education in data-intensive science. The work of scientists and their institutions is embedded in national systems of organization and funding to which the development of national data-intensive science needs to adapt. Collaboration with national members of ICSU will be a key in ensuring that national needs are respected whilst meshing with the international nature of science.

It will be the role of the President, working with the Executive Director and key committees, to ensure that the above priorities are deeply embedded in the activities of CODATA.