Category Archives: Data Science Journal

Posts relating to the data science journal

September 2019: Publications in the Data Science Journal

Title: Data Sharing at Scale: A Heuristic for Affirming Data Cultures
Author: Lindsay Poirier, Brandon Costelloe-Kuehn
URL: http://doi.org/10.5334/dsj-2019-048
Title: Building Infrastructure for African Human Genomic Data Management
Author: Ziyaad Parker, Suresh Maslamoney, Ayton Meintjes, Gerrit Botha, Sumir Panji, Scott Hazelhurst, Nicola Mulder
URL: http://doi.org/10.5334/dsj-2019-047
Title: Analysis of Rainfall and Temperature Data Using Ensemble Empirical Mode Decomposition
Author: Willard Zvarevashe, Symala Krishnannair, Venkataraman Sivakumar
URL: http://doi.org/10.5334/dsj-2019-046
Title: Policy Needs to Go Hand in Hand with Practice: The Learning and Listening Approach to Data Management
Author: Maria Cruz, Nicolas Dintzner, Alastair Dunning, Annemiek van der Kuil, Esther Plomp, Marta Teperek, Yasemin Turkyilmaz-van der Velden, Anke Versteeg
URL: http://doi.org/10.5334/dsj-2019-045
Title: The Australian Research Data Commons 
Author: Michelle Barker, Ross Wilkinson, Andrew Treloar
URL: http://doi.org/10.5334/dsj-2019-044
Title: The Impact of Targeted Data Management Training for Field Research Projects – A Case Study
Author: Jonathan L. Petters, George C. Brooks, Jennifer A. Smith, Carola A. Haas
URL: http://doi.org/10.5334/dsj-2019-043

July 2019: Publications in the Data Science Journal

Title: Real Estate Evaluation Model Based on Genetic Algorithm Optimized Neural Network
Author: Yan Sun
URL: http://doi.org/10.5334/dsj-2019-036
Title: Abnormal Pattern Prediction: Detecting Fraudulent Insurance Property Claims with Semi-Supervised Machine-Learning
Author: Sebastián M. Palacio
URL: http://doi.org/10.5334/dsj-2019-035
Title: A Regional Project in Support of the SADC Cyber-Infrastructure Framework Implementation: Weather and Climate
Author: Mary-Jane Morongwa Bopape, Happy Marumo Sithole, Tshiamo Motshegwa, Edward Rakate, Francois Engelbrecht, Emma Archer, Anneline Morgan, Lwando Ndimeni, Joel Botai
URL: http://doi.org/10.5334/dsj-2019-034
Title: Designing Transnational Hydroclimatological Observation Networks and Data Sharing Policies in West Africa
Author: Seyni Salack, Aymar Bossa, Jan Bliefernicht, Sina Berger, Yacouba Yira, Kamil A. Sanoussi, Samuel Guug, Dominicus Heinzeller, Adolphe S. Avocanh, Barro Hamadou, Symphorien Meda, Belko A. Diallo, Igor B. Bado, Inoussa A. Saley, Elidaa K. Daku, Namo Z. Lawson, Aida Ganaba, Safiétou Sanfo, Koufanou Hien, Arone Aduna, Gero Steup, Bernd Diekkrüger, Moussa Waongo, Antonio Rogmann, Ralf Kunkel, John P. A. Lamers, Mouhamadou B. Sylla, Harald Kunstmann, Boubacar Barry, Laurent G. Sedogo, Christian Jaminon, Paul Vlek, Jimmy Adegoke, Moumini Savadogo
URL: http://doi.org/10.5334/dsj-2019-033
Title: An Automated Machine Learning Based Decision Support System to Predict Hotel Booking Cancellations
Author: Nuno Antonio, Ana de Almeida, Luis Nunes
URL: http://doi.org/10.5334/dsj-2019-032
Title: Indigenous Data Governance: Strategies from United States Native Nations 
Author: Stephanie Russo Carroll, Desi Rodriguez-Lonebear, Andrew Martinez
URL: http://doi.org/10.5334/dsj-2019-031
Title: Building Open Access to Research (OAR) Data Infrastructure at NIST
Author: Gretchen Greene, Raymond Plante, Robert Hanisch
URL: http://doi.org/10.5334/dsj-2019-030
Title: The Landscape of Rights and Licensing Initiatives for Data Sharing
Author: Sam Grabus, Jane Greenberg
URL: http://doi.org/10.5334/dsj-2019-029
Title: Data Sharing Practices among Researchers at South African Universities
Author: Siviwe Bangani, Mathew Moyo
URL: http://doi.org/10.5334/dsj-2019-028

Publishing an article in CODATA Data Science Journal

This article was first published by Ms. Neema Mduma at https://neylicious.github.io/ml/2019/05/11/paper.html. Neema is an alumna of the CODATA-RDA School of Research Data Science.

In early 2017, I was privileged to work as a researcher in the Dropwall project (by Rose Funja), which was one of the winning projects of the Data for Local Impact Innovation Challenge (DLIIC). The main focus of the project was to develop a tool to help fight dropout among secondary school girls. The findings from this project show a high rate of dropout among secondary school students, particularly girls, and coincide with reports from other studies which show that school dropout is a big challenge in developing countries. Machine learning techniques for addressing this problem have gained much attention in recent years. However, most of the work has been carried out in developed countries; only a handful of studies in developing countries have applied machine learning to school dropout while taking the local context and the data imbalance problem into account. This motivated me to continue working (in my PhD) on school dropout using machine learning.

In August 2018, I attended the CODATA-RDA Research Data Science Summer School, held at the Abdus Salam International Centre for Theoretical Physics (ICTP) in Trieste, Italy. The aim was to build competence in data analysis and security for participants from all disciplines and backgrounds, from the sciences to the humanities. The level of engagement and interaction between participants and instructors was outstanding. We were introduced (by the Executive Director of CODATA, Dr. Simon Hodson) to various opportunities, such as the CODATA Data Science Journal, where I later managed to publish the breathtaking findings from the Dropwall project in a paper titled A Survey of Machine Learning Approaches and Techniques for Student Dropout Prediction.

June 2019: Publications in the Data Science Journal

Title: Developing a Model Guidelines Addressing Legal Impediments to Open Access to Publicly Funded Research Data in Malaysia
Author: Haswira Nor Mohamad Hashim
URL: http://doi.org/10.5334/dsj-2019-027
Title: Proposed Guideline for Minimum Information Stroke Research and Clinical Data Reporting
Author: Judit Kumuthini, Lyndon Zass, Melek Chaouch, Michael Thompson, Paul Olowoyo, Mamana Mbiyavanga, Faniyan Moyinoluwalogo, Gordon Wells, Victornia Nembeware, Nicola J. Mulder, Mayowa Owolabi
URL: http://doi.org/10.5334/dsj-2019-026
Title: A Column Styled Composable Schema Matcher for Semantic Data-Types
Author: Xiaofeng Liao, Jordy Bottelier, Zhiming Zhao
URL: http://doi.org/10.5334/dsj-2019-025
Title: Importance and Incorporation of User Feedback in Earth Science Data Stewardship
Author: Hampapuram Ramapriyan, Jeanne Behnke
URL: http://doi.org/10.5334/dsj-2019-024
Title: Establishing, Developing, and Sustaining a Community of Data Champions
Author: James L. Savage, Lauren Cadwallader
URL: http://doi.org/10.5334/dsj-2019-023
Title: The Definition of Reuse
Author: Stephanie van de Sandt, Sünje Dallmeier-Tiessen, Artemis Lavasa, Vivien Petras
URL: http://doi.org/10.5334/dsj-2019-022
Title: Geoscientists’ Perspectives on Cyberinfrastructure Needs: A Collection of User Scenarios
Author: Karen I. Stocks, Sam Schramski, Arika Virapongse, Lisa Kempler
URL: http://doi.org/10.5334/dsj-2019-021
Title: Data Distribution Centre Support for the IPCC Sixth Assessment
Author: Martina Stockhause, Martin Juckes, Robert Chen, Wilfran Moufouma Okia, Anna Pirani, Tim Waterfield, Xiaoshi Xing, Rorie Edmunds
URL: http://doi.org/10.5334/dsj-2019-020

May 2019: Publications in the Data Science Journal

Title: Interdisciplinary Comparison of Scientific Impact of Publications Using the Citation-Ratio
Author: Arthur R. Bos, Sandrine Nitza
URL: http://doi.org/10.5334/dsj-2019-019
Title: Diversity of Woody Species in Djamde Wildlife Reserve, Northern Togo, West Africa
Author: Tchagou Awitazi, Raoufou Radji, Kotchikpa Okoumassou
URL: http://doi.org/10.5334/dsj-2019-018
Title: A Generic Research Data Infrastructure for Long Tail Research Data Management
Author: Atif Latif, Fidan Limani, Klaus Tochtermann
URL: http://doi.org/10.5334/dsj-2019-017
Title: Time Series Prediction Model of Grey Wolf Optimized Echo State Network
Author: Huiqing Wang, Yingying Bai, Chun Li, Zhirong Guo, Jianhui Zhang
URL: http://doi.org/10.5334/dsj-2019-016
Title: Fostering Data Sharing in Multidisciplinary Research Communities: A Case Study in the Geospatial Domain
Author: Martina Zilioli, Simone Lanucara, Alessandro Oggioni, Cristiano Fugazza, Paola Carrara
URL: http://doi.org/10.5334/dsj-2019-015

April 2019: Publications in the Data Science Journal

Title: A Survey of Machine Learning Approaches and Techniques for Student Dropout Prediction
Author: Neema Mduma, Khamisi Kalegele, Dina Machuve
URL: http://doi.org/10.5334/dsj-2019-014
Title: GeoSimMR: A MapReduce Algorithm for Detecting Communities based on Distance and Interest in Social Networks
Author: Zaher Al Aghbari, Mohammed Bahutair, Ibrahim Kamel
URL: http://doi.org/10.5334/dsj-2019-013
Title: Building an International Consensus on Multi-Disciplinary Metadata Standards: A CODATA Case History in Nanotechnology
Author: John Rumble, John Broome, Simon Hodson
URL: http://doi.org/10.5334/dsj-2019-012

CODATA is pleased to announce Mark Parsons as the new Editor-in-Chief of the Data Science Journal

In his blog post, Mark writes: ‘I am especially interested in helping DSJ build its niche as an influential journal of the ‘science of data’ in the sense that CODATA described it decades ago. We need more fora that encourage dialog across research and practice to understand all the issues around the socio-technical work necessary for data to be findable, accessible, interoperable, reusable, ethical, secure, etc.’ …

‘I have been a member of the DSJ editorial board since the journal moved to Ubiquity Press, and I have been impressed at how Sarah Callaghan and other editors have worked to increase the journal’s quality. I want to continue this momentum. I want to further bolster the review quality and also raise the possibility of open reviews. The nature of DSJ is that it often attracts submissions and requires reviews from practitioners who have much less of a mandate to publish than researchers. I believe practitioners should be encouraged to contribute (with research as well as practice papers), so we should do what we can to recognize and model excellent contributions in this area. …

‘Thanks to Sarah’s great work, DSJ has a bright future as submissions continue to increase in number and quality. DSJ was ahead of its time when it was founded in the 1990s. I am eager to explore how it can continue to push important conversations forward. I welcome all your ideas. Please tell me what you think. Better yet, tell the community through a submission to DSJ!’

Read more at http://codata.org/blog/2019/04/29/mark-parsons-joins-codata-as-editor-in-chief-data-science-journal/

Mark replaces Sarah Callaghan, who has served since 2015, when the Data Science Journal was moved to its current platform with Ubiquity Press.

Sarah writes:

‘In my four year tenure, I am very proud of the fact that 135 papers have been published, along with 6 Special Collections with another 5 Special Collections in the pipeline. The journal has grown more popular and is steadily publishing research that is more impactful as time goes on, and this is a testament to the hard work of all involved – including our reviewers and authors.

‘It is time for me to hand over the role of EiC to another, and it is with no small amount of sadness that I do so. Being EiC has been incredibly rewarding (and occasionally infuriating) and I have learned a great deal from it. I am very pleased to know that Mark Parsons is taking over the role, and know that the journal will be in safe, knowledgeable hands.

‘It only remains for me to say my farewells and thank yous. Thank you to the authors, without whom there would be no articles to publish. A thousand thank yous to all my editors, reviewers, colleagues and friends – your efforts on behalf of the journal are deeply, deeply appreciated, as is your wisdom and expertise. I wish you all the very best for the future, and look forward to reading more excellent papers published in the DSJ!’

Read more at http://codata.org/blog/2019/04/29/so-long-and-thanks-for-all-the-fish-a-farewell-from-outgoing-data-science-journal-editor-in-chief-sarah-callaghan/

Growing the Conversation on the Science of Data

Image CC-BY-NC Laura Molloy @LM_HATII from the art intervention series ‘Humans of Data’ http://codata.org/blog/category/humans-of-data/

Mark Parsons joins CODATA as Editor-in-Chief, Data Science Journal

I am honored and excited to take on the role of Editor in Chief for the Data Science Journal.

I have had a bit of history with DSJ. One of my earliest peer-reviewed papers was published with Ruth Duerr in DSJ (Parsons and Duerr 2005). I vividly remember hurrying to make revisions in Costa Rica before heading offline for several weeks. I’d still like to meet one of the reviewers (perhaps I have) who made really helpful comments on how to organize and present the paper to get my points across in a more rigorous and impactful way. I was a data practitioner, not a researcher, and was largely unschooled in formal scientific writing. The guidance was most valuable, and the paper still gets cited now and again.

Years later, Peter Fox and I published what was one of my most controversial and influential papers (Parsons and Fox 2013). This time, DSJ allowed me to publish after an unconventional public review process involving reams of open review comments from more than two dozen people.

In short, DSJ has been a catalyst for my career. So I am eager to help foster the journal’s growth and influence and maybe help a few more data scientists along their way.

I am especially interested in helping DSJ build its niche as an influential journal of the ‘science of data’ in the sense that CODATA described it decades ago. We need more fora that encourage dialog across research and practice to understand all the issues around the socio-technical work necessary for data to be findable, accessible, interoperable, reusable, ethical, secure, etc.

I have been a member of the DSJ editorial board since the journal moved to Ubiquity Press, and I have been impressed at how Sarah Callaghan and other editors have worked to increase the journal’s quality. I want to continue this momentum. I want to further bolster the review quality and also raise the possibility of open reviews. The nature of DSJ is that it often attracts submissions and requires reviews from practitioners who have much less of a mandate to publish than researchers. I believe practitioners should be encouraged to contribute (with research as well as practice papers), so we should do what we can to recognize and model excellent contributions in this area.

While improving the content of DSJ, we should also continue to modernize its presentation. We need to actively consider machine-readable papers and content negotiation for both the papers and the metadata. Much like at its founding, DSJ needs to advance the whole concept of scholarly communication.

Thanks to Sarah’s great work, DSJ has a bright future as submissions continue to increase in number and quality. DSJ was ahead of its time when it was founded in the 1990s. I am eager to explore how it can continue to push important conversations forward. I welcome all your ideas. Please tell me what you think. Better yet, tell the community through a submission to DSJ!

‘So Long, and Thanks for All the Fish’: a farewell from outgoing Data Science Journal Editor-in-Chief, Sarah Callaghan

Back in early 2015, I was approached at a coffee break at a conference, and invited to take on the role of Editor-in-Chief of the Data Science Journal. This was a little bit of a surprise, I will confess, as my previous academic journal experience had been as an associate editor, along with some projects working on data citation and data publishing. The opportunity was too good to resist, however, and with the support of my employer, CEDA, I was very pleased to take on the role.

My tenure as EiC also coincided with the move of the journal to its current platform on Ubiquity Press, and with it came the need to appoint a new editorial board, develop a new scope and guidance, collate a new reviewer database, and handle the other minutiae of re-launching an academic journal. All these things were achieved with the help of my colleagues in the editorial board and section editors, along with the help and support of the Ubiquity Press staff and the CODATA Executive Committee.

In my four year tenure, I am very proud of the fact that 135 papers have been published, along with 6 Special Collections with another 5 Special Collections in the pipeline. The journal has grown more popular and is steadily publishing research that is more impactful as time goes on [https://www.scimagojr.com/journalsearch.php?q=4700152809&tip=sid], and this is a testament to the hard work of all involved – including our reviewers and authors.

It is time for me to hand over the role of EiC to another, and it is with no small amount of sadness that I do so. Being EiC has been incredibly rewarding (and occasionally infuriating) and I have learned a great deal from it. I am very pleased to know that Mark Parsons is taking over the role, and know that the journal will be in safe, knowledgeable hands.

It only remains for me to say my farewells and thank yous. Thank you to the authors, without whom there would be no articles to publish. A thousand thank yous to all my editors, reviewers, colleagues and friends – your efforts on behalf of the journal are deeply, deeply appreciated, as is your wisdom and expertise. I wish you all the very best for the future, and look forward to reading more excellent papers published in the DSJ!

Sarah

February – March, 2019 Publications in the Data Science Journal and new Special Collections

Title: Research of LOB Data Compression and Read-Write Efficiency in Oracle Database
Author: Jianjun Wang, Yingang Zhao, Gaochuan Liu
URL: http://doi.org/10.5334/dsj-2019-008
Title: Bringing Citations and Usage Metrics Together to Make Data Count
Author: Helena Cousijn, Patricia Feeney, Daniella Lowenberg, Eleonora Presani, Natasha Simons
URL: http://doi.org/10.5334/dsj-2019-009
Title: The Time Efficiency Gain in Sharing and Reuse of Research Data
Author: Tessa E. Pronk
URL: http://doi.org/10.5334/dsj-2019-010
Title: Intelligent Infrastructure, Ubiquitous Mobility, and Smart Libraries – Innovate for the Future
Author: Yi Shen
URL: http://doi.org/10.5334/dsj-2019-011

Call for Nominations and Applications: Editor-in-Chief, Data Science Journal, Deadline 14 April

The Data Science Journal is currently accepting nominations and applications to become the Editor-in-Chief of the journal: https://datascience.codata.org/

Applications can be made through the Google form at https://goo.gl/forms/ey60x1N2jO9YM1rY2

The deadline for applications is 12 midnight GMT on Sun 14 April.

Articles are appearing in two new Special Collections in the Data Science Journal.

Göttingen-CODATA RDM Symposium 2018

This special collection contains selected papers from the Göttingen-CODATA RDM Symposium 2018: the critical role of university RDM infrastructure in transforming data to knowledge: https://datascience.codata.org/collections/special/gottingen-codata-rdm-symposium/

Guest editors:
  • Simon Hodson
  • Jan Brase
  • Michael Witt
  • Liz Lyon
  • Devika P. Madalli

Research Data Alliance Results

This collection contains papers documenting research results and outcomes stemming from the Research Data Alliance (RDA) community and efforts: https://datascience.codata.org/collections/special/research-data-alliance-results/

Guest editors:

  • Leonardo Candela, Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo”, Italian National Research Council, Pisa, Italy
  • Donatella Castelli, Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo”, Italian National Research Council, Pisa, Italy
  • Emma Lazzeri, Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo”, Italian National Research Council, Pisa, Italy
  • Paolo Manghi, Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo”, Italian National Research Council, Pisa, Italy

January Publications in the Data Science Journal and new Special Collections


Articles published in January 2019

Title: Text Mining and Data Information Analysis for Network Public Opinion
Author: Yan Hu
URL: http://doi.org/10.5334/dsj-2019-007
Title: Expanding the Research Data Management Service Portfolio at Bielefeld University According to the Three-pillar Principle Towards Data FAIRness
Author: Jochen Schirrwagen, Philipp Cimiano, Vidya Ayer, Christian Pietsch, Cord Wiljes, Johanna Vompras, Dirk Pieper
URL: http://doi.org/10.5334/dsj-2019-006
Title: Supporting the Interdisciplinary, Long-Term Research Project ‘Patterns in Soil-Vegetation-Atmosphere-Systems’ by Data Management Services
Author: Constanze Curdt
URL: http://doi.org/10.5334/dsj-2019-005
Title: Implementing in the VAMDC the New Paradigms for Data Citation from the Research Data Alliance
Author: Carlo Maria Zwölf, Nicolas Moreau, Yaye-Awa Ba, Marie-Lise Dubernet
URL: http://doi.org/10.5334/dsj-2019-004
Title: Data Discovery Paradigms: User Requirements and Recommendations for Data Repositories
Author: Mingfang Wu, Fotis Psomopoulos, Siri Jodha Khalsa, Anita de Waard
URL: http://doi.org/10.5334/dsj-2019-003
Title: Additions to the Last Millennium Reanalysis Multi-Proxy Database
Author: David M. Anderson, Robert Tardif, Kaleb Horlick, Michael P. Erb, Gregory J. Hakim, David Noone, Walter A. Perkins, Eric Steig
URL: http://doi.org/10.5334/dsj-2019-002
Title: Understanding Human Mobility Patterns in a Developing Country Using Mobile Phone Data
Author: Merkebe Getachew Demissie, Santi Phithakkitnukoon, Lina Kattan, Ali Farhan
URL: http://doi.org/10.5334/dsj-2019-001