Category Archives: Data Science Journal

Posts relating to the data science journal

Data Science Journal Special Collection for SciDataCon 2016

Data Science Journal is pleased to announce that it will be publishing the high-profile special collection of papers from SciDataCon 2016.

Authors with papers accepted for presentation at SciDataCon are also invited to submit their full papers to the Data Science Journal.  Submissions should be made at

Please note the following:

  • The deadline for submissions to be part of the SciDataCon 2016 special collection is 30 September.
  • Even though abstracts were peer-reviewed and accepted as part of the conference process, the full paper will be peer-reviewed to ensure quality.
  • Given the number of papers expected, we are unable to waive the Article Processing Charge (APC) for all papers. However, the Data Science Journal is very competitive and has a progressive waiver policy for those unable to pay the APC. Please contact the Editor-in-Chief before submitting your article if you would like to request a waiver. Editorial decisions are made independently of the ability to pay the APC.

SciDataCon 2016 and

Advancing the Frontiers of Data in Research

SciDataCon 2016 seeks to advance the frontiers of data in all areas of research. This means addressing a range of fundamental and urgent issues around the ‘Data Revolution’ and the recent data-driven transformation of research, as well as the responses to these issues in the conduct of research.

SciDataCon 2016 is motivated by the conviction that the most significant contemporary research challenges—and in particular those reaching across traditional disciplines—cannot be properly addressed without paying attention to issues relating to data.  These issues include policy frameworks, data quality and interoperability, long-term stewardship of data, and the research skills, technologies, and infrastructures required by increasingly data-intensive research.  They also include frontier challenges for data science: for example, fundamental research questions relating to data integration, analysis of complex systems and models, epistemology and ethics in relation to Big Data, and so on.

The transformative effect of the data revolution needs to be examined from the perspective of all fields of research, and its relationship to broader societal developments and to data-driven innovation needs to be scrutinised.  Taken together, these issues form a multi-faceted challenge which cannot be tackled without expertise drawn from many disciplines and diverse roles in the research enterprise.  Furthermore, the transformations around data in research are essentially international and the response must be genuinely global.  SciDataCon is the international conference for research into these issues.

SciDataCon 2016 will take place on 11-13 September 2016 at the Sheraton Denver Downtown Hotel, Denver, Colorado, USA.  It is part of International Data Week, 11-16 September 2016, convened by CODATA, the ICSU World Data System and the Research Data Alliance.

3rd LEARN Workshop, Helsinki, June 2016

This post is a syndicated copy of the one at and was written by Sarah Callaghan, Editor-in-Chief of the Data Science Journal

Open Data in a Big Data World

The 3rd LEARN (Leaders Activating Research Networks) workshop on Research Data Management, “Make research data management policies work”, was held in Helsinki on Tuesday 28th June. I was invited wearing my CODATA hat (as Editor-in-Chief for the Data Science Journal) to give the closing keynote about the Science International Accord “Open Data in a Big Data World”.

The problem with doing closing talks is that so much of what I wanted to say had pretty much already been said by someone during the course of the day – sometimes even by me during the breakout sessions! Still, it was a really interesting workshop, with excellent discussion (despite the pall that Brexit cast over the coffee and lunchtime conversation – but that’s a topic for another time).

There were three breakout sessions, and the timings meant that you could attend two of them.

I started with Group 3: Making possible and encouraging the reuse of data: incentives needed. This is my day job – taking data in from researchers, making it understandable and reusable, and figuring out ways to give them credit and rewards for doing so. And my group has been doing this for more than two decades, so I’m afraid I might have gone off on a bit of a rant. Regardless, we covered a lot, though mainly the old chestnuts of the promotion and tenure system being fixated on publications as the main academic output, the requirements for standards (especially for metadata – acknowledging just how difficult it would be to come up with a universal metadata standard applicable to all research data), and the fact that repositories can control (to a certain extent) the technology, but culture change still needs to happen. There were some positives on the culture change, though – I noted that journals are now pushing DOIs for data, and this has had an impact on people coming to us to get DOIs.

The next breakout group I went to was Group 1: Research Data services planning, implementation and governance. What surprised me in this session (maybe it shouldn’t have) was just how far advanced the UK is when it comes to research data management policies and the like, in comparison to other countries. This did mean that my UK colleagues and I got quizzed a fair bit about our experiences, which made sense. I had a bit of a different perspective from most of the other attendees – being a discipline-specific repository means that we can pick and choose what data we take in, unlike institutional repositories, which have to be more general. On being asked about what other services we provide, I did manage to name-drop JASMIN, in the context of a UK infrastructure for data analysis and storage.

I think the key driver in the UK for getting research data management policies working was the Research Councils, and their policies, but also their willingness to stump up the cash to fund the work. A big push on institutional repositories was EPSRC’s putting the onus on research institutions to manage EPSRC-funded research data. But the increasing importance of data, and people’s increased interest in it, is coming from a wide range of drivers – funders, policies, journals, repositories, etc.

I understand that the talks and notes from the breakouts will be put up on the workshop website, but they’re not up at the time of writing. You can find the slides from my talk here.

Call for Papers – Data Science Journal

The Data Science Journal is a peer-reviewed, open access, electronic journal dedicated to the advancement of data science and its application in policies, practices and management of Open Data.

We are currently soliciting submissions for papers on a wide range of data science topics, across the whole range of computational, natural and social science, and the humanities. The scope of the journal includes descriptions of data systems, their implementations and their publication, applications, infrastructures, software, legal, reproducibility and transparency issues, and the availability and usability of complex datasets, with a particular focus on the principles, policies and practices for data.

All data is in scope, whether born digital or converted from other sources, and all research disciplines are covered. Data is a cross-domain, cross-discipline topic, with common issues, regardless of the domain it serves. The Data Science Journal publishes a variety of article types (research papers, practice papers, review articles and essays). The Data Science Journal also publishes data articles, describing datasets or data compilations, if the potential for reuse of the data is significant or if considerable efforts were required in compilation. Similarly, the Data Science Journal also publishes descriptions of online simulation, database, and other experiments, partnering with digital repositories on ‘meta articles’ or ‘overlay articles’, which link to and allow visualisation of the data, thereby adding an entirely new dimension to the communication and exchange of data research results and educational materials.

For further information, and to submit a manuscript, please visit

Introducing the new Data Science Journal Editorial Board

This post comes from Sarah Callaghan, new Editor-in-Chief of the Data Science Journal, recently relaunched with Ubiquity Press.

It is my great pleasure to be able to introduce the new editorial team for the Data Science Journal. We have gathered an exceptional team, with members from all around the world, covering data science topics as diverse as data stewardship, databases, large scale data facilities, data visualisation, geospatial aspects of data, semantics, data policy and much, much more. Our editorial board members also bring expertise in research fields such as (but not limited to) Earth sciences, libraries, scientific computing, public health, humanities, mathematics, genomics, computational biology, physics and statistics.

It can be slightly nerve-racking when putting a call out for nominations for editors for a newly re-launched journal – what if no one applies? Thankfully, this wasn’t the case for us, and we received nearly 50 applications, which is a great sign of the feeling in the data science community that this journal is needed and wanted. Many of the applicants I already know through their active engagement in the CODATA and other research data communities, and I am very much looking forward to working with all of the editorial team in the future.

I would also like to take the opportunity to thank the previous members of the Data Science Journal editorial board, in particular the previous Editors-in-Chief, Shuichi Iwata and John Rumble, for their past work.

Introducing myself

I was honoured to be asked to take on the role of Editor-in-Chief of the Data Science Journal earlier this year. My scientific background is in radio propagation, where I created, managed and archived long-term, irreproducible datasets, with all the aggravation that goes with that work. I then changed roles and became a data and project manager for the Centre for Environmental Data Analysis (CEDA) at STFC Rutherford Appleton, UK – poacher turning gamekeeper, so to speak.

My main research interests are in data citation and publication. Simply put, I want to change the research culture so that publishing data, and getting credit for it, is the norm rather than the exception. (And yes, I do know how difficult that particular culture change is likely to be.)

In the past I have managed several data citation and publication projects, including the Jisc funded OJIMS and PREPARDE projects, and the NERC Data Citation and Publication project. I was co-chair of the CODATA-ICSTI Task Group on Data Citation (before being co-opted to the CODATA Executive Committee) and am currently a co-chair of the RDA/WDS Working Group on Publishing Data Bibliometrics.

In my day job, I currently project manage several large-scale projects, including the EU FP7 project CLIPC.  My formal publication list can be found here, and I also blog informally about data topics here.

My aim is to make the Data Science Journal the primary journal for high quality academic publications in data science, providing a focus and discussion space for the wider community. I know that with the support of the editorial team, we will make this happen!