3rd LEARN Workshop, Helsinki, June 2016

Open Data in a Big Data World

The 3rd LEARN (Leaders Activating Research Networks) workshop on Research Data Management, “Make research data management policies work”, was held in Helsinki on Tuesday 28th June. I was invited wearing my CODATA hat (as Editor-in-Chief for the Data Science Journal) to give the closing keynote about the Science International Accord “Open Data in a Big Data World”.

The problem with doing closing talks is that so much of what I wanted to say had pretty much already been said by someone during the course of the day – sometimes even by me during the breakout sessions! Still, it was a really interesting workshop, with excellent discussion (despite the pall that Brexit cast over the coffee and lunchtime conversation – but that’s a topic for another time).

There were three breakout sessions, timed so that you could attend two of them.

I started with Group 3: Making possible and encouraging the reuse of data: incentives needed. This is my day job – taking data in from researchers, making it understandable and reusable, and figuring out ways to give them credit and rewards for doing so. And my group has been doing this for more than 2 decades, so I’m afraid I might have gone off on a bit of a rant. Regardless, we covered a lot, though mainly the old chestnuts of the promotion and tenure system being fixated on publications as the main academic output, the requirements for standards (especially for metadata – acknowledging just how difficult it would be to come up with a universal metadata standard applicable to all research data), and the fact that repositories can control (to a certain extent) the technology, but culture change still needs to happen. Though there were some positives on the culture change – I noted that journals are now pushing DOIs for data, and this has had an impact on people coming to us to get DOIs.

The next breakout group I went to was Group 1: Research Data services planning, implementation and governance. What surprised me in this session (maybe it shouldn’t have) was just how far advanced the UK is when it comes to research data management policies and the like, in comparison to other countries. This did mean that my UK colleagues and I got quizzed a fair bit about our experiences, which made sense. I had a bit of a different perspective from most of the other attendees – being a discipline-specific repository means that we can pick and choose what data we take in, unlike institutional repositories, which have to be more general. On being asked about what other services we provide, I did manage to name-drop JASMIN, in the context of a UK infrastructure for data analysis and storage.

I think the key driver in the UK for getting research data management policies working was the Research Councils, and their policies, but also their willingness to stump up the cash to fund the work. A big push on institutional repositories was EPSRC’s putting the onus on research institutions to manage EPSRC-funded research data. But the increasing importance of data, and people’s increased interest in it, is coming from a wide range of drivers – funders, policies, journals, repositories, etc.

I understand that the talks and notes from the breakouts will be put up on the workshop website, but they’re not up as of the time of me writing this. You can find the slides from my talk here.

CODATA Mourns Former President, David Abir

David Abir (1922-2016), CODATA President 1990-1994

The CODATA community recently learnt with regret of the death of David Abir, former CODATA President.

David Abir was elected CODATA President at the 1990 General Assembly in Columbus, Ohio, USA. He served as President for four years. He was an aeronautical engineer with a special interest in the properties of engineering materials and fluid dynamics.

‘During his Presidency,’ observes John Rumble, former President of CODATA, ‘Abir worked assiduously to ensure inclusiveness for all CODATA members and preparing CODATA to be a key participant in the emerging global information and connectivity revolution.’

In CODATA activities, he was prominent in the ‘Industrial Data Commission’:

This Commission was established in 1984 to guide the Executive Committee on the data needs of industry. It was chaired by Jack Westbrook, a US physicist who had a long career as an industrial materials expert for General Electric, and included members who worked in the chemical, aeronautics, metallurgical, and other industries. Among other activities, the Commission conducted in 1985 an International Workshop on Materials Data Systems for Engineering in Schluchsee, Germany, that addressed industrial needs and recommended starting several new CODATA activities. One of its recommendations was to set up a Task Group on Materials Database Standards, later broadened to include other aspects of database management.

The CODATA Industrial Data Commission in the 1980s: David Abir is seated in the centre.

David Lide, who was an active member of CODATA during Abir’s era, writes: ‘I worked with David Abir for many years, both before and during his term as CODATA President. He made many contributions to CODATA, especially by encouraging more emphasis on data of engineering importance, in keeping with his career as an aeronautical engineer. Moreover, he was a voice of reason in the sometimes contentious debates over the directions that CODATA should take.’

Gordon Wood, also an active member during Abir’s era, writes: ‘Having first met David Abir at the CODATA Conference in Jerusalem in 1984, we worked closely together from 1990-94 as President and Secretary General respectively. David took his responsibilities very seriously and consistently sought what was best for CODATA’s future, both scientifically and organizationally. I will remember David as a gentleman and colleague.’

David Abir at the CODATA General Assembly in 1990

The Israeli National CODATA Committee wishes to point out some additional activities and achievements of Prof. Abir:

Prof. David Abir (1922-2016) contributed during his long career as scientist and engineer to numerous activities for the benefit of his country and the scientific community. Starting in 1943, he served as a chief instructor of the aero club of Palestine; later he served in the Israeli Air Force, when it was formed in 1948; he was one of the founders of the Faculty of Aeronautical Engineering at the Technion, the Israel Institute of Technology, where he was also the Dean of Faculty in 1962-64. He joined Tel-Aviv University in 1972, and was Associate Dean of the Faculty of Engineering (1972-1980). David Abir was a Fellow of the Royal Aeronautical Society, London, and a Fellow of The Institution of Mechanical Engineers, London, both since 1965.

Abir was deputy chairman of the Israel Space Agency, Ministry of Science and Technology (1983-87), and its director general in 1985-87. He was an active member of the Committee on Space Research (COSPAR) of ICSU. Until his retirement he served as the chairman of the National Committee for Space Research (1972-2005) and as chairman of the Israeli National CODATA Committee. Abir continued to serve the scientific community for many years, even after his formal retirement.

Reuven Granot, a member of the Israeli CODATA Committee, writes: ‘As a member of The National CODATA Committee, I worked with Prof. David Abir during the last three decades. His kindness, openness to help and his willingness to share his enormous experience and knowledge with others were outstanding. I shall remember David as a close friend and a scientific leader to follow.’

The CODATA Executive Committee and Officers wish to express their profound regret at the passing of this distinguished, highly-esteemed colleague and friend. Our condolences go to David’s family and friends.


Data Diplomacy: Political and Social Dimensions of Data Collection and Data Sharing

This post is by Angela Murillo, a member of the CODATA Early Career Data Professionals Group, and Doctoral Candidate and Research Associate at the Metadata Research Center of the School of Information and Library Science at The University of North Carolina at Chapel Hill.

WUN – Data Diplomacy Workshop engages scientists to consider diplomacy in relation to scientific data.

On October 28-29th, 2015, I had the opportunity to attend the WUN (Worldwide Universities Network) Data Diplomacy workshop titled “Data Diplomacy: Political and Social Dimensions of Data Collection and Data Sharing”, organized by the University of Rochester and the Worldwide Universities Network. This workshop gathered a group of scientists and diplomats representing various disciplines who spent two days sharing experiences, expertise, and ideas with regard to data diplomacy. I was fortunate to attend as a representative of the CODATA Early Career Data Professionals Group.

The workshop was held at the New York Academy of Sciences. For two days, we discussed the many aspects important to data diplomacy including:

  • What is data diplomacy?
  • Data Democratization
  • Data Governance and Regulation
  • Data Diplomacy for Data Curation
  • Data Sharing and Standards
  • Science Diplomacy

Dr. Timothy D. Dye, from the University of Rochester School of Medicine & Dentistry, organized and led the workshop. Additionally, Dr. Jane Gateway from the University of Rochester Office of Global Engagement assisted with leading the program. Ten scientists and diplomats from around the world attended, representing various disciplines including:

  • Political Science and International Relations
  • Computer Science and Information Science
  • Earth Science
  • Epidemiology

Participants of the WUN Data Diplomacy Workshop at the New York Academy of Sciences.

Through the discussion, we were able to establish 1) drivers for data diplomacy and 2) working definitions of data diplomacy. Drivers for data diplomacy included a need to understand the barriers and opportunities of the complex data that is now available, as well as understanding how data is a driver of local and global acts, and how it allows for new relationships. Additionally, we established some working definitions of data diplomacy including: 1) “data diplomacy creates, supports, or maintains technical or social relationships to mitigate barriers to action among stakeholders by enabling the use of data for societal benefit” and 2) “data diplomacy incorporates skills derived from diplomacy and data science with stakeholder needs, and recognizes that data itself is now an agent”. These definitions and attributes are being developed as we work on several publications in relation to Data Diplomacy.

Lastly, we developed a work plan which includes research to 1) describe the basic concepts of data diplomacy, 2) describe governance, standards, practices, and frameworks relevant to data diplomacy, 3) identify application of local diplomacy, and 4) create a set of use cases and identify empirical evidence of data diplomacy.

This workshop was a wonderful opportunity to contribute to an important discussion and a new and evolving area of study: the intersection of diplomacy, data science, and stakeholders. Those of us who attended continue to move forward with our work plan, and hope to be able to report on publications as soon as they are available.

CODATA TG Anthropometry and Special Populations activities at AHFE 2015

Anthropometry is the science of measuring body dimensions, and has evolved over the last decades from taking linear measures to 3D data capture and processing.

The objective of the CODATA Task Group on Anthropometric Data and Engineering (the previous incarnation of the current TG) was to promote the dissemination and development of knowledge in anthropometry, to contribute to the improvement of the health, safety and well-being of all people. A major achievement of that TG to this end was to assist the establishment of the WEAR (World Engineering Anthropometry Resource) project.

The 2015 Applied Human Factors and Ergonomics (AHFE) Conference in Las Vegas, USA was held under the auspices of 25 distinguished international Boards consisting of 583 members from 43 countries. The conference included 223 parallel sessions, with 2988 submissions from researchers in 64 countries, working in academia, industry and government. There were 1420 paper presentations and 185 posters included in the conference proceedings. AHFE 2015 was attended by over 1500 participants.

The CODATA Task Group on Anthropometry, Fit and Accommodation for Special Populations met in conjunction with this conference and discussed the following issues:

In addition, presentations were given by:
Dr. Chang Shu, ‘Data processing and analysis for the 2012 Canadian Forces 3D anthropometric survey’.

This paper has been published in the HFES proceedings as Chang Shu, Pengcheng Xi, Allan Keefe, ‘Data processing and analysis for the 2012 Canadian Forces 3D anthropometric survey’, Procedia Manufacturing (6th International Conference on Applied Human Factors and Ergonomics (AHFE 2015) and the Affiliated Conferences, AHFE 2015): http://dx.doi.org/10.1016/j.promfg.2015.07.813

Kathleen Robinette & Daisy Veitch
Chairs of the CODATA Task Group on Anthropometric Data and Engineering

Call for Papers – Data Science Journal

The Data Science Journal is a peer-reviewed, open access, electronic journal dedicated to the advancement of data science and its application in policies, practices and management of Open Data.

We are currently soliciting submissions for papers on a wide range of data science topics, across the whole range of computational, natural and social science, and the humanities. The scope of the journal includes descriptions of data systems, their implementations and their publication, applications, infrastructures, software, legal, reproducibility and transparency issues, and the availability and usability of complex datasets, with a particular focus on the principles, policies and practices for data.

All data is in scope, whether born digital or converted from other sources, and all research disciplines are covered. Data is a cross-domain, cross-discipline topic, with common issues, regardless of the domain it serves. The Data Science Journal publishes a variety of article types (research papers, practice papers, review articles and essays). The Data Science Journal also publishes data articles, describing datasets or data compilations, if the potential for reuse of the data is significant or if considerable efforts were required in compilation. Similarly, the Data Science Journal also publishes descriptions of online simulation, database, and other experiments, partnering with digital repositories on ‘meta articles’ or ‘overlay articles’, which link to and allow visualisation of the data, thereby adding an entirely new dimension to the communication and exchange of data research results and educational materials.

For further information, and to submit a manuscript, please visit http://datascience.codata.org/

CODATA Collection in Zenodo: Recent Reports

For a little while now, CODATA has been using Zenodo as a repository for our most important reports, statements and some presentations.

Zenodo is an openly-available digital repository ‘launched within the OpenAIREplus project as part of a Europe-wide research infrastructure.’  See the About and FAQs for further information.  Like many innovative parts of the data infrastructure, Zenodo is still developing a sustainability model: we certainly hope that it is around for the long term.

Zenodo has a clean and attractive interface and it is easy to use.  Above all, we like it because it allows the creation of Communities or Collections, assigns DOIs and provides Altmetrics.

CODATA Reports in Zenodo

Below is a list of the recent CODATA publications in Zenodo.

The Value of Open Data Sharing: A CODATA White Paper for the Group on Earth Observations: http://dx.doi.org/10.5281/zenodo.33830

This White Paper was prepared for the GEO-XII Plenary in Mexico City by the GEO Participating Organization CODATA (the ICSU Committee on Data for Science and Technology). Through showcasing diverse benefits of open Earth observations data, the paper is designed to facilitate the process of transitioning from restricted data policies to more open policies for government data. This document was submitted to GEO-XII for its information and use. Supplementary case studies are most welcome.

CODATA Report: Current Best Practice for Research Data Management Policies: http://dx.doi.org/10.5281/zenodo.27872

Report on Current Best Practice for Research Data Management Policies commissioned from CODATA by the Danish e-Infrastructure Cooperation and the Danish Digital Library and submitted in May 2014.

CODATA Recommended Values of the Fundamental Physical Constants: 2014 http://dx.doi.org/10.5281/zenodo.22826

This document gives the 2014 self-consistent set of values of the constants and conversion factors of physics and chemistry recommended by the Committee on Data for Science and Technology (CODATA). These values are based on a least-squares adjustment that takes into account all data available up to 31 December 2014. The recommended values may also be found at http://physics.nist.gov/cuu/Constants/index.html

CODATA Recommended Values of the Fundamental Physical Constants: 2014 – Summary http://dx.doi.org/10.5281/zenodo.22827

This paper provides a brief summary of the work of the CODATA Task Group on Fundamental Physical Constants to produce the 2014 CODATA Recommended Values of the Fundamental Physical Constants.

CODATA Data Sharing Principles in Developing Countries: http://dx.doi.org/10.5281/zenodo.22117

The ‘Data Sharing Principles in Developing Countries’, or ‘Nairobi Data Sharing Principles’ were developed by participants of the CODATA Workshop on Open Data for Science and Sustainability in Developing Countries held on 6-8 August 2014 at UNESCO in the United Nations Offices in Nairobi, Kenya.

CODATA Uniform Description System for Materials on the Nanoscale v1.0: http://dx.doi.org/10.5281/zenodo.20688

Uniform Description System for Materials on the Nanoscale v1.0, prepared by the CODATA-VAMAS Working Group On the Description of Nanomaterials.

RDA Plenary 6: ‘Allez les filles’!

This post is by Elizabeth Griffin, chair of the CODATA Data at Risk Task Group, and co-chair of the related RDA Interest Group on Data Rescue.

RDA Plenary 6 took place in balmy late September in the centre of Paris, within the confines of CNAM (Conservatoire National des Arts et Métiers) and just down the road from the République. Inaugurated in 1794, CNAM took over a deserted Priory and formally opened in 1802.

As well as housing a museum of innovations relevant to science and industry, it also (and primarily) serves as an adult educational centre, with emphases on practical training in science and engineering on the one hand and management and social sciences on the other. Whether it rose adequately to the challenge of a sudden influx of nearly 600 RDA delegates is more subjective.

A marquee in the central court provided what should have been a good meeting point for meal-time discussions, but the wooden floor and harsh surrounds offered abysmal acoustics, encouraging many to retreat outside and perch on the stone window-sills (sunshine permitting). Whether delegates succeeded in locating and reaching the right meeting-rooms (either conventional classrooms or formal tiered theatres) allocated to their sessions depended on perseverance as well as physical fitness, offering a learning curve as steep as the flights of stairs. But none of that affected the characteristic-RDA zest of the meeting, which included the (now regular) admix of numerous IG, WG, BoF and formal plenary sessions.

What signs of the Liberté, Egalité, Fraternité, suggested by such proximity to the République? My French dictionary (1988 edition) offers no equivalent to ‘sorority’ or ‘sisterhood’, and a hard-hitting talk by Dame Wendy Hall to a Women-in-RDA breakfast meeting hinted that the emergence of women in both science and practical society is still Work in Progress. Women are climbing the scientific career-ladders, but not as quickly as some of the pointers anticipate. Where, then, is France?

Yet evidence of Progress abounded, even at the opening session: an impressive Keynote presentation by Barbara Ryan (Earth Observations, Geneva), “Unleashing the Power of Earth Observations – Together”, encouraged collaboration not competition, while the Minister of State for Digital Technology in the French Ministry of Economy, Industry and Digital Technology, who made a flying visit to give a welcome address, proved – much to our surprise – to be a young working mother. Allez les filles !

Women carry quite a load between them at plenaries, this one being no exception. It would be instructive to see a gender disaggregation of Chairs, and to look for any bias between (for instance) those addressing social and educational matters on the one hand, and science, engineering and data management on the other.

Do women have the Liberté to select their chosen and fulfilling careers, or is true Egalité still a dream for a future that must coin a new word to represent the “togetherness” of our scientific efforts? And does it matter? Yes, in some ways the matter is crucial.

Organizations, whether for research or for data management, share a common structure that is inevitably somewhat vertical, and in which actual power lies with those who are designers and builders. We don’t intentionally spend resources duplicating what is already extant; life is too short, and we accept and apply what is provided, be it a computer system, a hierarchy of data management, an educational set-up or a scheme for raising funds.

Even apart from some recognized differences in thought processes between “male” and “female” mind-types, other biologically-related forces simply cannot be ignored, like body language (oh for uniform attire!) or a reputed capacity for multi-tasking. And if there is one place that needs to embrace multi-tasking it is surely the RDA. Greater aggregation, or clustering, of the over-many topics that are now registered as Interest Groups would enhance the complementary aspects of a number of rather similar-looking sessions, while more plenaries would result in greater emphasis on the broader picture and thence on the raison d’être of the RDA.

Global Data Activities for the Study of Solar Terrestrial Variability

This post is by Alena Rybkina, a member of the CODATA Executive Committee, of the CODATA Early Career Data Professionals Group, and a participant in the Task Group on Earth and Space Science Data Interoperability.

“On 28-30 September 2015, the joint workshop of the Scientific Committee on Solar-Terrestrial Physics (SCOSTEP) and the ICSU World Data System (ICSU-WDS) ‘Global Data Activities for the Study of Solar-Terrestrial Variability’ was held in the National Institute of Information and Communications Technology (NICT), in Tokyo, Japan.

The workshop mainly focused on data issues and data analysis. More than 60 participants shared their experience in data activities in the field of Earth-Space interoperability. Of the 49 presentations, 21 were related to event data analysis, 4 to data science and 9 to data systems.


SCOSTEP is an ICSU Interdisciplinary Body tasked with the responsibility to organize long-term scientific programs in solar terrestrial physics. Variability of the Sun and Its Terrestrial Impact (VarSITI) is that program for the period 2014–2018. The VarSITI program will strive for international collaboration in data analysis, modeling, and theory to understand how the solar variability affects Earth.

VarSITI will have four scientific elements that address solar terrestrial problems, keeping the current low solar activity as the common thread: SEE (Solar evolution and Extrema), MiniMax24/ISEST (International Study of Earth-affecting Solar Transients), SPeCIMEN (Specification and Prediction of the Coupled Inner-Magnetospheric Environment), and ROSMIC (Role Of the Sun and the Middle atmosphere/thermosphere/ionosphere In Climate) (SCOSTEP-VarSITI Brochure 2013).

The Global Data Activities for the Study of Solar Terrestrial Variability workshop opened with a panel discussion on VarSITI’s data challenges.

The discussions raised important topics:

1. Data needs: what other databases should be used or built to encourage Sun-Earth interdisciplinary research?
2. Data access: what efforts are required to make VarSITI generated/needed data open?
3. Data quality: what efforts are required to make VarSITI generated data reusable? Unify data and metadata formats?
4. Data legacy: where will VarSITI data be preserved? How should WDS treat VarSITI project observation data satisfying/not satisfying the WDS criteria?

The cross-cutting point of the discussion was data accessibility and understandability for the community and for researchers from neighboring scientific fields. Members of the Panel noted that CODATA’s involvement in the VarSITI project plays an important role, since the adoption of high-level principles is essential when it comes to data management.

The workshop concluded with the signing of a Memorandum of Understanding between SCOSTEP, ICSU and WDS. SCOSTEP became a member of WDS.


As a member of the CODATA Executive Committee and the Task Group on Earth and Space Science Data Interoperability, I represented CODATA at this workshop. I gave an invited talk about CODATA activities during the plenary session of the workshop and at the seminar that followed at the National Institute of Polar Research (NIPR). Special attention was paid to the activities of the CODATA TG “Earth and Space science data interoperability”, as it is closely related to the subject of the workshop. Another topic of interest among participants was Data Citation. Japan was one of the first countries to use DOIs (Digital Object Identifiers) to persistently identify observational data. That activity is among the top priorities of the data community, and their experience plays an important role for future investigations. I will therefore play a role in connecting SCOSTEP and VarSITI with international activities to promote data citation.”

‘Cranking it out’: CODATA and RDA Plenary 6

The Sixth Plenary Meeting of the Research Data Alliance was held last week (23-25 September 2015) at CNAM, the Conservatoire national des arts et métiers.  This was the largest RDA plenary so far with roughly 700 participants.

A highlight for me was Barbara Ryan’s keynote and the impromptu and well-informed talk given by the French Minister of State for Digital Affairs, Axelle Lemaire.

This was a somewhat different Plenary for me than previous ones.  RDA Plenaries have a chaotic, ‘unconference’ feel, with a lot of ‘Birds of a Feather’ sessions to collect together interested parties and Interest Groups working out their activities, as well as Working Groups reporting on their progress.  It is an exciting model, with huge potential, but it also requires perseverance and patience; for people both to stand up for their ideas and to be willing to collaborate and compromise.  These are good things, of course.  Watching them evolve is fulfilling, fascinating and frustrating in equal measure, as it must be.

This year was different because I am heavily involved in three very active and productive Groups and I had to focus on these and other meetings, at the expense of engaging with BOFs and other TGs or WGs. I’m sure the cycle will swing round again, but this was a Plenary to be more focused and productive.

The CODATA community was very actively involved and there are a number of joint Working Groups and Interest Groups.  Here are the key activities:

RDA-WDS Interest Group on Cost Recovery for Data Repositories

Along with Ingrid Dillo of DANS and the WDS Scientific Committee, and Anita de Waard of Elsevier Research, I am one of the co-chairs of this co-branded RDA-WDS Interest Group. The focus of the activity is to analyse current income streams for data repositories, to understand how they are changing and what new sources of income may be available if the repository needs to evolve its business model to ensure sustainability. In-depth interviews were conducted with some 25 data repositories and the findings of those interviews have now been written up into a draft report.

In a shared session with the RDA Interest Group on Domain Repositories, we presented the key findings of the report which include a landscape analysis of the types of income streams and a typology of the overall business models encountered.  Participants of the session then conducted a SWOT analysis of these business models.  The results of this will be published soon.

CODATA-RDA Interest Group on Legal Interoperability

As Mark Parsons said in his introductory talk, the CODATA-RDA Interest Group has been cranking out the work! At the Fifth Plenary in San Diego, we presented a definition and a set of high level ‘Legal Interoperability Principles for Research Data’. Legal interoperability is an attribute which is important for the reuse of research data.

Legal interoperability occurs among multiple datasets when:

  • use conditions are clearly and readily determinable for each of the datasets,
  • the legal use conditions imposed on each dataset allow creation and use of combined or derivative products, and
  • users may legally access and use each dataset without seeking authorization from data rights holders on a case-by-case basis, assuming that the accumulated conditions of use for each and all of the datasets are met.

The principles identify the primary issues to be addressed to achieve such legal interoperability. Over the last six months, the group has been developing a set of practical implementation guidelines for these principles. The crank has been turned by many hands through a regular series of online calls. The session discussed the draft document and laid out our schedule to deliver a completed set of Implementation Guidelines for the Principles on the Legal Interoperability of Research Data.

CODATA-RDA Working Group on Data Science Summer Schools

This activity starts from the premise that contemporary research – particularly when addressing the most significant, transdisciplinary research challenges – cannot be done effectively without a range of skills relating to data. This includes the principles and practice of Open Science and research data management and curation, the use of a range of data platforms and infrastructures, large scale analysis, statistics, visualisation and modelling techniques, software development and annotation and more. We define ‘Research Data Science’ as the ensemble of these skills.

The CODATA-RDA Working Group on Data Science Summer Schools aims to address this recognised need for a means to develop additional skills through a scalable and consistent series of short courses or ‘Summer Schools’. The model builds on existing CODATA activities, and brings together partners with expertise and reusable materials to create a coherent whole that is more than the sum of its parts. The partners include Software Carpentry, Data Carpentry and the Digital Curation Centre.

The first introductory Research Data Science Summer School will be hosted by the International Centre for Theoretical Physics (ICTP), in Trieste, Italy, 1-12 August 2016.  The ICTP is generously providing accommodation and board for up to 120 participants.  Travel funding for 30-40 students has been secured from ICTP, TWAS and CODATA and discussions are ongoing with other sponsors and funders.

The session discussed the approach and included a call for further collaborators and funding.

In addition to these three Groups, with which I am directly involved as co-chair or facilitator, there were three other activities with connections to CODATA Task Groups.


The CODATA DAR TG-RDA Data Rescue IG held a joint session, entitled ‘The Data Corridor: You, The Past, and The Future’.

From the session description:

The RDA/CODATA Data Rescue Group plans to publish a book on “Data Rescue”.  It will set the scene, describe and expand the rationale, define the benchmarks, and present selected Case Studies.  Which Case Studies to include can be a topic for open discussion at this session.  An added intention is that a (possibly annual) Newsletter (online) will be issued to update the Case Studies and report new ones.

The RDA/CODATA Data Rescue Group has also been charged with presenting “Guidelines” for rescuing data.  Activities involved in the recovery, digitizing, preserving of originals, plus the dissemination and archiving of the digital versions, all need to be included in those Guidelines.  The Group will hold a Workshop on Data Rescue in Boulder in the Fall of 2016, where topics for the Guidelines will also be addressed.  However, the Workshop will probably appeal more to data producers while the RDA community tends to include more data managers.  This session will therefore seek input from aspects of data management regarding contributions to those Guidelines.

CODATA TG on Science and the Management of Physical Objects in the Digital Era

The CODATA TG on Science and the Management of Physical Objects in the Digital Era held a ‘Birds of a Feather Session’ on ‘Persistently Linking Physical Samples with Data and Publications: A Matter of Reproducible Research’ with a view to establishing an RDA Interest Group.  The objectives of the BOF were:

  1. To identify as many as possible of the systems, both domain specific and cross domain, that are being developed to manage physical objects and the data and publications derived from them.
  2. To facilitate international cooperation to develop harmonized approaches and best practices for physical object identification and digital curation.
  3. To build linkages between object repositories and museums, digital data repositories, scientific publications, and science communities.
  4. To enable the facilitation of object and sample identification infrastructure both at the national and international levels.

CODATA-RDA Interest Group on Materials Data, Infrastructure & Interoperability

Last but not least, the CODATA-RDA Materials Data, Infrastructure and Interoperability Interest Group held a session to establish a Working Group on the International Materials Resource Registry (IMRR).

Historic launch of the Global Partnership for Sustainable Development Data

This post is provided by Xiaogang (Marshall) Ma, a core member of the CODATA Early Career Data Professionals Group (ECDP). He was the winner of one of the inaugural World Data System Stewardship Awards at SciDataCon 2014. Marshall is an Associate Research Scientist at Rensselaer Polytechnic Institute, specialising in Semantic eScience and Data Science. Check out his RPI Homepage here.

An informational email in early September from Simon Hodson, the CODATA Executive Director, caught my attention. His email was about the high-level political launch for the Global Partnership for Sustainable Development Data. I was interested because I have worked on Open Data over the past few years and the experience shows that Open Data is far more than a purely technical issue. I was excited to see that there would be such an event initiated by political partners and focusing on social impacts. I am grateful for the support from the CODATA Early Career Data Professionals Working Group, which made it possible for me to head to New York City to attend the forum in person on September 28th.

The forum was held in the Jade Room of the Waldorf Astoria hotel, and lasted for three hours from 2 to 5PM, with a tight but well-organized schedule of about 10 lightning talks, four panels and about 30 commitment introductions from the partners. The panels and lightning talks focused on why open data is needed, how to make data open and, especially, the value of Open Data for the 17 Global Goals for Sustainable Development and the social impact that the data can generate. I was happy to see that success stories of Open geospatial data were mentioned several times in the lightning talks and the panels. For example, delegates from the World Resources Institute presented Global Forest Watch-Fires, which provides near-real-time information from various sources that can enable people to respond promptly before a fire runs out of control. During the partner introductions, I heard more exciting news about the actions that stakeholders in governments, academia, industry and non-profit organizations are going to take to support the joint efforts of the Global Partnership for Sustainable Development Data. For example, the Children’s Investment Fund Foundation will invest $20m to improve data on coverage of nutrition interventions and other key indicators by 2020 in several countries; DigitalGlobe commits to provide three countries with evaluation licenses to their BaseMap service as well as training sessions for human resources; Planet Labs commits $60 million in geospatial imagery to support the global community; and the William and Flora Hewlett Foundation is proposing to commit about $3m to the start-up support of the secretariat for a Global Partnership for Sustainable Development Data. A list of the current partners is accessible on the partnership’s website here.

[Image from globalgoals.org]

The Global Partnership for Sustainable Development Data has a long-term vision for 2030: a world in which everyone is able to engage in solving the world’s greatest problems by (1) Effectively Using Data and (2) Fostering Trust and Accountability in the Sharing of Data. The pioneering partners in this effort have already committed to deliver more than 100 data-driven projects worldwide to pave the way for the vision for 2030. For the first year, the partnership will work together to achieve these goals: (1) Improve the Effective Use of Data, (2) Fill Key Data Gaps, (3) Expand Data Literacy and Capacity, (4) Increase Openness and Leverage of Existing Data, and (5) Mobilize Political Will and Resources.

The forum was chaired by Prof. Sanjeev Khagram, with over 200 attendees from various backgrounds. The diversity of the attendees was partly reflected in the result of an online poll during the forum, which asked participants to choose the goal for which better data will make the most difference (see the result in the photo below).

During the reception after the forum, I had a brief chat with Prof. Khagram about CODATA and the Early Career Data Professionals Working Group, as well as potential collaborations. He informed me that the partnership is open and invites broad participation to address the sustainable development goals. Prof. Khagram also mentioned that a bigger event, the World Data Forum, will take place in 2016. I also had the opportunity to catch up with Dr. Bob Chen from CIESIN, Columbia University about recent activities. It seems that ‘climate change’ is the topic of focus for several conferences in 2015, such as the International Scientific Conference, the Research Data Alliance Sixth Plenary Meeting and the United Nations Climate Change Conference, and Paris is the host city for all three events.

The report A World That Counts: Mobilising The Data Revolution for Sustainable Development, prepared by the United Nations Secretary-General’s Independent Expert Advisory Group on a Data Revolution for Sustainable Development, provides more background information about the Global Partnership for Sustainable Development Data.