Promoted panel: FAIR Data Stewardship – training and career opportunities

The Emerging Profession of Data Stewardship:

FAIR is about establishing a partnership between humans and machines. Although FAIR presents numerous technical challenges, the automation of semantically rich data operations is not a technology problem alone. Human users, especially the domain experts whose knowledge must somehow be made machine-actionable, will always be indispensable in the creation of FAIR data and services. At the interface of this new human-machine partnership is the data steward, who has the skills and experience to bridge the purely technical universe of computer operations and the nuanced, dynamic and often vague domains of human knowledge. The data steward must navigate complex areas ranging from information technology to policy and legal requirements to highly specialized domain topics.

This panel will explore who exactly these data stewards are, what their competencies are, and what systemic changes must be made to ensure attractive professional pathways for the special people who are key to the human-machine partnerships of the future.

Speakers and abstracts

Marta Teperek

Since June 2020, Marta Teperek has been the Head of Research Data Services at TU Delft Library and the Director of 4TU.ResearchData. Between 2017 and 2020 she served as Data Stewardship Coordinator at TU Delft, where she built a team of data stewards, appointed at each faculty to provide disciplinary data management support. Prior to joining TU Delft, she led the establishment and management of data support services at the University of Cambridge (2015-2017). While at Cambridge, she also initiated and oversaw the Data Champions programme and the Open Research Pilot. Marta Teperek is a researcher by training and completed her PhD in molecular biology and genomics at the University of Cambridge in 2014.

Abstract

In order to ensure widespread benefits of the European Open Science Cloud (EOSC), improvements in FAIR practices are necessary across all research disciplines. However, while the adoption of FAIR principles is increasing, researchers still know little about FAIR. In addition, recommendations, standards and expectations regarding FAIR are based largely on the experiences and expertise of successful and engaged communities. While we cannot simply wait for all the disciplines and groups to find FAIR on their own, it is essential that policies and expectations truly reflect community needs and practices and that research communities are supported by disciplinary data stewards to implement them.

In this talk I will refer to the report by the FAIR in Practice Task Force of the European Open Science Cloud FAIR Working Group, “Six Recommendations for Implementation of FAIR Practice”, and the specific recommendations it makes with regard to the essential role of data stewards in putting FAIR principles into practice.

Yann Le Franc

Yann Le Franc is the CEO and Scientific Director of e‐Science Data Factory S.A.SU., Paris. He obtained a PhD in Neurosciences and Pharmacology in 2004. After a postdoctoral experience in the US, he worked on data management issues for Neurosciences in the context of the International Neuroinformatics Coordinating Facility (INCF), where he was the coordinator of the INCF Belgian node. During this experience, he became an international expert on the Semantic Web and ontology design. He chaired the EUDAT Semantic Working Group, and he is the co‐chairman of the RDA Vocabulary and Semantic Service Interest Group. More recently, he has been contributing to the FAIRification and standardization of semantic artefacts in the context of FAIRsFAIR and OntoCommons, and has become the technical coordinator of EOSC‐Pillar as a contractor for the French National Research Data Archive and Computing Center.

Abstract: terms4FAIRskills

terms4FAIRskills is a collaborative project (https://terms4fairskills.github.io/) that aims to create a formalised terminology to describe the competencies, skills and knowledge associated with making and keeping data FAIR. Our goal is to enable stakeholders involved in FAIRification to retrieve specific training materials so that they can improve their knowledge of the FAIR process and increase their FAIR skills and competencies. This will lead to increasing awareness of, and capability in, skills for FAIR data. This project is supported by the EOSC co-creation fund to develop a more formal version of the terminology which is aligned where appropriate with the FAIR Semantics recommendations from the FAIRsFAIR project and leverages best practice from the OBO Foundry.

In this presentation, we introduce a prototype version of this formalised terminology to support the enrichment of training content. The initial concepts defined in this terminology are built upon previous work such as FAIR4S, Edison and the ZonMw/ELIXIR training competencies. This first version of the ontology will be used to annotate training resources from several initiatives, including ELIXIR and the RDA/CODATA School of Research Data Science, using the Semaphora Chrome plugin derived from the EOSC semantic annotation service, B2NOTE. This process will allow us to test and enrich our ontology through the annotation of a first simple set of training resources.

This open-source semantic artefact will be a key component for the training community to enable the interoperability and FAIRness of training content across a large number of initiatives. It will provide a unique tool for universities and institutions to aggregate training materials in order to define new curricula to train researchers, data stewards and decision makers on the various aspects of FAIR data.

Valentina Pasquale

Valentina works as a Research Data Management Specialist at Istituto Italiano di Tecnologia (Genova, IT), where she coordinates the set-up and development of Research Data Services to support scientists in data stewardship and open science. Since December 2019, she has been co-chair of the Data Stewardship Competence Centers Implementation Network (DSCC IN) in GO FAIR. Valentina has a background in Bioengineering and holds a PhD in Humanoid Technologies from the University of Genova and IIT. She worked for more than 10 years in Neuroscience research before specializing in data management.

Abstract: DSCC-IN contribution to the dissemination of FAIR data stewardship knowledge across countries

The main objective of the GO FAIR Data Stewardship Competence Centers Implementation Network (DSCC-IN) is to provide data stewardship centers with a cooperation structure to foster convergence and sharing of FAIR data stewardship knowledge across countries. The network involves around 40 members from all over the world, whose capabilities, needs, and expectations are highly heterogeneous. DSCC-IN's commitment to the production of FAIR data stewardship guidelines is realized through its participation in the GO FAIR 3-point FAIRification framework (3PFF) working groups, a wider initiative whose main goal is to develop methods, tools and documentation around the 3PFF. The role of DSCC-IN will be to facilitate the sharing and adoption of the 3PFF in different national contexts. At the same time, DSCC-IN members collaborate on engaging researchers and delivering data stewardship training, and have proposed an international project on how to combine research and data stewardship training in a comprehensive programme for the data stewards of tomorrow.

Erik Flikkenschild

Educated as an electronic engineer. Focus on deployment computer science. From 1980 employee of the central IT department of the LUMC, started as a system manager of the BAZIS Hospital Information System. With special attention to privacy and security, participated in EU programs (AIM / Seismed / CEN256 wg6, with prof. A.R. Bakker and Dr. C.P. Louwerse). Till 2013 IT manager (IT infrastructure management and development). Team member of the LUMC Personal Medicine program Cura Rata (2012, with prof. Daan Hommes). From 2013 Information Manager with focus on Research innovation. First task was to develop a FAIR LUMC Strategy study research (with prof. Jeanine Houwing and prof. Barend Mons). Member of Research IT Program implementation core team (2014 – 2018). Co-writer of the national NFU-Data4Lifesciences program and member of the operational board (till 2019). For 10 years active within the biobank world (BBMRI) and the Parelsnoer Institute (PSI) in the role of national ICT coordinator and Security Officer. Since 2013 member of the national ZonMw commission Goed gebruik Geneesmiddelen. From 2017 team member of the Surf VRE architecture team developing a national reference architecture framework and glossary (work package leader). Since 2017 chairman of the national working group (SIG) trusted data connections, which is tightly connected to the national programs LCRDM and Data4Lifesciences (became part of Health-RI). From 2019 focus on IT deployment of FAIR principles; team membership: Data Competence Centres (GO FAIR involvement). Coordinator of FAIR Data Points deployment (VODAN IN Clusters, FAIR Data Point) in collaboration with Dr. Erik Schultes (3-point framework for FAIRification). VODAN Africa LUMC participation (with prof. Miriam van Reisen).

Abstract: Experience with establishing a national network of skilled Data Steward workforce; Erik Flikkenschild, LUMC

The LUMC implemented an institutional FAIR Data Point (FDP) in August 2020, using the VODAN in a Box toolkit. The challenge was to involve and train IT experts and data scientists in FAIR methodology. The Human Genetics department already had experience with FAIR methodology and supported the implementation of the FAIR principles (technical metadata). We used the VODAN FAIR Implementation Profile (FIP), version 1, and updated this FIP with our FDP choices. In parallel, VODAN Africa had also implemented VODAN in a Box. We created a joint (VODAN Africa-LUMC) Implementation Network of experts (DCC) and succeeded in connecting all the FDPs (demo intercontinental query). The talk reports on how and when we collaborate and exchange knowledge, and on the practical implementation tips we provided (hurdles to avoid).
Collaboration with national expert workforces: FAIR deployment requires connecting with the local culture and methodology. The benefits of an architectural approach (incorporating the FAIR methodology within the TOGAF methodology) are briefly introduced.

Take away messages:
1. IT data stewards can implement FAIR Data Stations with the VODAN toolbox (FIP questionnaire). A precondition is knowledge transfer from (M4M) trained FAIRification data stewards, in a workshop approach;
2. A first step in connecting personal data requires the involvement of enterprise architects, preferably organized in a national DCC context;
3. Getting things done starts in small taskforces reporting in a regular plenary meeting.

Shanmugasundaram Venkataraman

After several years spent curating a model organism database and performing biomedical imaging, Venkat moved into the field of research data management. For the last few years, he has been at the Digital Curation Centre, where he serves as lead trainer and research data specialist, runs in-person and virtual workshops, provides occasional consultancy services and has been involved in creating online training materials, including a recent MOOC on developing RDM services. He is also a co-chair of the CODATA/RDA Schools of Research Data Science. In both roles, he has taught many students and researchers around the world RDM best practice, open research, FAIR principles and related subjects. As well as teaching, he is involved in consortia and organisations such as OpenAIRE, FOSTER, FAIR4Health and RDA. In the latter, he also helped establish, and is a co-chair of, the RDA WG “Raising FAIRness in Health Data and Health Research Performing Organisations”.

Abstract:

The FAIR principles, although relatively new, have been embraced at an impressive and ever-increasing rate. Those who champion their benefits have now also been able to identify the gaps that need to be filled in advocating them further. One such gap is data stewardship, with the realisation that this is an FTE in itself and that there is a need to train individuals to meet the requirements. The SRDS’ existing core curriculum, aimed at early career researchers, was identified as a starting point to train data stewards but required further development in itself. Born out of this was the data stewardship strand, which focuses on building sustainable capacity and includes extensive new material. The strand emphasises a service provider's perspective and can be deployed either as a standalone module or as part of a wider RDM and open research curriculum, but is ideally complemented by the SRDS core curriculum. The data stewardship strand was developed through funding from FAIRsFAIR and successfully piloted in 2019. It will be taught twice more in 2020, and the aim is to deploy it in the future directly through the SRDS and through franchising. This presentation will take the viewer through the curriculum in further detail.