Category Archives: Data Science Journal

Posts relating to the Data Science Journal

3rd LEARN Workshop, Helsinki, June 2016

This post is a syndicated copy of the original and was written by Sarah Callaghan, Editor-in-Chief of the Data Science Journal.

Open Data in a Big Data World

The 3rd LEARN (Leaders Activating Research Networks) workshop on Research Data Management, “Make research data management policies work”, was held in Helsinki on Tuesday 28th June. I was invited wearing my CODATA hat (as Editor-in-Chief for the Data Science Journal) to give the closing keynote about the Science International Accord “Open Data in a Big Data World”.

The problem with doing closing talks is that so much of what I wanted to say had pretty much already been said by someone during the course of the day – sometimes even by me during the breakout sessions! Still, it was a really interesting workshop, with excellent discussion (despite the pall that Brexit cast over the coffee and lunchtime conversation – but that’s a topic for another time).

There were three breakout sessions on offer, and the timings meant that you could attend two of them.

I started with Group 3: Making possible and encouraging the reuse of data: incentives needed. This is my day job – taking data in from researchers, making it understandable and reusable, and figuring out ways to give them credit and rewards for doing so. And my group has been doing this for more than 2 decades, so I’m afraid I might have gone off on a bit of a rant. Regardless, we covered a lot, though mainly the old chestnuts of the promotion and tenure system being fixated on publications as the main academic output, the requirements for standards (especially for metadata – acknowledging just how difficult it would be to come up with a universal metadata standard applicable to all research data), and the fact that repositories can control (to a certain extent) the technology, but culture change still needs to happen. Though there were some positives on the culture change – I noted that journals are now pushing DOIs for data, and this has had an impact on people coming to us to get DOIs.

The next breakout group I went to was Group 1: Research Data services planning, implementation and governance. What surprised me in this session (maybe it shouldn’t have) was just how far advanced the UK is when it comes to research data management policies and the like, in comparison to other countries. This did mean that my UK colleagues and I got quizzed a fair bit about our experiences, which made sense. I had a bit of a different perspective from most of the other attendees – being a discipline-specific repository means that we can pick and choose what data we take in, unlike institutional repositories, which have to be more general. On being asked about what other services we provide, I did manage to name-drop JASMIN, in the context of a UK infrastructure for data analysis and storage.

I think the key driver in the UK for getting research data management policies working was the Research Councils, and their policies, but also their willingness to stump up the cash to fund the work. A big push on institutional repositories was EPSRC’s putting the onus on research institutions to manage EPSRC-funded research data. But the increasing importance of data, and people’s increased interest in it, is coming from a wide range of drivers – funders, policies, journals, repositories, etc.

I understand that the talks and notes from the breakouts will be put up on the workshop website, but they’re not up at the time of writing. You can find the slides from my talk here.

Call for Papers – Data Science Journal

The Data Science Journal is a peer-reviewed, open access, electronic journal dedicated to the advancement of data science and its application in policies, practices and management of Open Data.

We are currently soliciting submissions for papers on a wide range of data science topics, across the whole range of computational, natural and social science, and the humanities. The scope of the journal includes descriptions of data systems, their implementations and their publication, applications, infrastructures, software, legal, reproducibility and transparency issues, and the availability and usability of complex datasets, with a particular focus on the principles, policies and practices for data.

All data is in scope, whether born digital or converted from other sources, and all research disciplines are covered. Data is a cross-domain, cross-discipline topic, with common issues, regardless of the domain it serves. The Data Science Journal publishes a variety of article types (research papers, practice papers, review articles and essays). The Data Science Journal also publishes data articles, describing datasets or data compilations, if the potential for reuse of the data is significant or if considerable efforts were required in compilation. Similarly, the Data Science Journal also publishes descriptions of online simulation, database, and other experiments, partnering with digital repositories on ‘meta articles’ or ‘overlay articles’, which link to and allow visualisation of the data, thereby adding an entirely new dimension to the communication and exchange of data research results and educational materials.

For further information, and to submit a manuscript, please visit

Introducing the new Data Science Journal Editorial Board

This post comes from Sarah Callaghan, new Editor-in-Chief of the Data Science Journal, recently relaunched with Ubiquity Press.

It is my great pleasure to be able to introduce the new editorial team for the Data Science Journal. We have gathered an exceptional team, with members from all around the world, covering data science topics as diverse as data stewardship, databases, large scale data facilities, data visualisation, geospatial aspects of data, semantics, data policy and much, much more. Our editorial board members also bring expertise in research fields such as (but not limited to) Earth sciences, libraries, scientific computing, public health, humanities, mathematics, genomics, computational biology, physics and statistics.

It can be slightly nerve-racking when putting a call out for nominations for editors for a newly re-launched journal – what if no one applies? Thankfully, this wasn’t the case for us, and we received nearly 50 applications, which is a great sign of the feeling in the data science community that this journal is needed and wanted. Many of the applicants I already know through their active engagement in the CODATA and other research data communities, and I am very much looking forward to working with all of the editorial team in the future.

I would also like to take the opportunity to thank the previous members of the Data Science Journal editorial board, in particular the previous Editors-in-Chief, Shuichi Iwata and John Rumble, for their past work.

Introducing myself

I was honoured to be asked to take on the role of Editor-in-Chief of the Data Science Journal earlier this year. My scientific background is in radio propagation, where I created, managed and archived long term, irreproducible datasets, with all the aggravation that goes with that work. I then changed roles and became a data and project manager for the Centre for Environmental Data Analysis (CEDA) at STFC Rutherford Appleton, UK – poacher turning gamekeeper, so to speak.

My main research interests are in data citation and publication. Simply put, I want to change the research culture so that publishing data, and getting credit for it, is the norm rather than the exception. (And yes, I do know how difficult that particular culture change is likely to be.)

In the past I have managed several data citation and publication projects, including the Jisc funded OJIMS and PREPARDE projects, and the NERC Data Citation and Publication project. I was co-chair of the CODATA-ICSTI Task Group on Data Citation (before being co-opted to the CODATA Executive Committee) and am currently a co-chair of the RDA/WDS Working Group on Publishing Data Bibliometrics.

In my day job, I currently project manage several large-scale projects, including the EU FP7 project CLIPC. My formal publication list can be found here, and I also blog informally about data topics here.

My aim is to make the Data Science Journal the primary journal for high quality academic publications in data science, providing a focus and discussion space for the wider community. I know that with the support of the editorial team, we will make this happen!