This post was written by Rebecca Lawrence, Managing Director of F1000. She was a session organiser at the CODATA 2019 Conference in Beijing, China.
Scholarly research is a global enterprise, often requiring researchers to collaborate with relevant experts across the world. The data generated during research is a valuable commodity, and researchers are increasingly required to share the data created during their research endeavour.
However, institutional and funder data sharing policies are typically developed at organisational or national level. Meanwhile, researchers are often funded by several research agencies, work as part of a collaboration and/or produce a myriad of data-related outputs. This makes adherence to the different data sharing policies complex, burdensome and time-consuming. Furthermore, the practical task of sharing data – in reputable repositories, with adequate metadata, in usable formats, at economic cost – can be onerous and in some cases prohibitive for researchers. There also remain real cultural, social and economic barriers that prevent many researchers from sharing their research data openly and in a timely manner.
The session at CODATA Beijing brought together experts representing many different parts of the world to discuss how we can work better together towards harmonisation of policies and incentives for researchers to share their data so that we can fully realise the benefits of making research data more available. Panellists were Jean-Claude Burgelman (European Commission), Rebecca Lawrence (F1000; chair), Xiaoxuan Li (Chinese Academy of Sciences), Erik Schultes (Dutch Techcentre for Life Sciences), Daisy Selematsela (University of South Africa (UNISA) Library and Information Services) and Nick Shockey (SPARC).
There was considerable discussion around making sure that we learn from the mistakes made on the road towards Open Access (OA) rather than repeat them. There was agreement that it is especially important that data sharing is built on a Commons and not left to be built by a small number of closed commercial entities. This matters all the more because ownership of data can be far more powerful and restrictive than ownership/copyright of publications. There was also agreement that much of the slow progress towards OA has been caused by confusing and lenient policies. With data sharing we have the opportunity to put clear and stringent policies in place early on, so that a middle ground does not emerge in a similar vein to the ‘hybrid’ approach now prevalent in article publishing.
There was also much discussion about the significant, and growing, gap in data sharing practices between the global north and the global south. We need to talk not just about the positives that data sharing can bring but also (and maybe especially in the global south) about the risks of doing nothing. We need to consider carefully the impact on equity and inclusion before we build data sharing systems, not afterwards, when it is too late to make a tangible difference. This is especially important when considering the business model for these infrastructures, to ensure that approaches don’t develop that will cause further imbalance and inequity.
We should not underestimate the importance of ensuring that researchers understand not only what data sharing is about but also why it is important and the positive impact it can have on them and their research. We need to debunk the many myths about it and recognise that doing so is going to take considerable time and effort.
Having said this, a significant shift towards greater data sharing will only come about if the rewards and incentives system starts to fully recognise the value of such activities and shifts away from the traditional sole focus on publications in high-impact venues. However, during such a shift, we also need to be alert to potential unintended negative consequences of any new system or approach.
Such a shift in incentives can be achieved not only by dangling ‘carrots’ but also by reducing researcher burden. For example, the increasing use of automated workflows and data capture was highlighted as potentially creating an incentive for researchers to share their data by removing the considerable time burden in capturing and curating the data and associated methods, whilst also increasing the likely level of FAIR-compliance of the resulting data.
There was a sense from the panel that, like the early Internet, the infrastructures currently being built globally to support data sharing are going to become another revolutionary global infrastructure that can ultimately be used by everyone. As then, we don’t yet know what this infrastructure may become or what it may enable. But once we can demonstrate the possibilities and potential of open data policies and their implementation by working through a few pilots on a multinational scale, this will mobilise the international community very quickly and help us to bring about the crucial alignment needed around core elements of data sharing policy and implementation.