
Report on 11th RDA conference: “From data to knowledge”

The 11th RDA Plenary took place in Berlin, Germany, from 19th to 23rd March 2018. The meeting brought together a range of experts, including researchers, data scientists, knowledge experts and practitioners. From the proceedings it is clear that a number of stakeholders have been, and continue to be, engaged in advancing a number of projects and initiatives in the open data space. It is also evident that data is driving many economies, including those of developing countries.

In the recent past there has been increased demand for data-driven, empowering techniques, such as knowledge of fertilizer application to help farmers and the delivery of real-time, mobile data and information. These demands require intelligent systems that can learn, adapt and automatically act on data. In addition, there is increasing recognition of the detailed digital reflection of the physical world we live in: most agricultural systems have to be fed with digital information, which requires information to be made available in digital format. Furthermore, new and emerging means of collaboration require dynamic connections among people, processes, devices and services. These trends signal to stakeholders working in the data space that today’s agricultural performance relies on connecting agencies and farmers. These requirements can be met through innovative use of data, for instance by creating educational programs for farmers on best farming practices using open data from research systems.

In this regard, many speakers acknowledged the efforts that Research Data Alliance (RDA) members continue to put into moving the data agenda forward. During the pre-meeting and the main meeting, numerous data-driven opportunities were mentioned. As part of the way forward, it was emphasized that the development of interfaces between research, industry actors, governments, society and users (e.g. farmers) can be enhanced through the use of data. Moreover, the need for more ICT infrastructure to enable collaborative research practices and timely access to information was underscored. The meeting noted an explosion in demand for information, innovation and the turning of data into insight.

One of the interesting new dimensions is the move from experience-based evidence to data-driven evidence, which is a result of increased computer intelligence over human intelligence, as shown in the graph below.

While human intelligence depends on experience, computer intelligence depends heavily on data.

The pre-meeting agreed on key deliverables that must be undertaken to address some of the challenges facing data management and the open data debate. These also form the action plan for the future and include:

Capacity Building for researchers, data scientists and information experts in the following areas
- Digitization of data collection
- Rescue of historical data
- Skills on data science
- Techniques for data analysis and visualization
- Copyright & Database rights
- Advocacy for open data and information sharing policies
Capacity Building for practitioners and users
- On-farm application
- Interpretation of Good Agricultural Practices (GAP) farming advisories
Development of online platforms including mobile applications
- Set up an open data and information repository for data project

Key players promoting open data principles, specifically GODAN Action, CODATA and FAO, agreed to work together through partnership and collaborative frameworks in all data initiatives and projects, starting with joint sessions at the upcoming IDW2018 conference. There was also a proposal for an RDA education and training working group. GODAN Action also announced its next capacity building program, scheduled to start in April 2018. The first program was very successful, with participants from all over the globe.

The CODATA Task Group on Agriculture presented its main use cases: farmer data and innovation, work on weather data through establishing a community of practice (CoP), and last-mile capacity building. The outcomes have been ICT innovation use cases such as the agro-weather tool and other online data platforms, the strengthening of users’ adaptive capacities, and the turning of data into insights, as summarized below.

Areas in which the CODATA ATG has made good progress include Research Informatics concepts, through developing ICT innovations, tools and systems that turn data into insights. Looking forward, there are plans for training on data science and for the development of a big data platform using artificial intelligence, machine learning and data mining. One of the main achievements is the use of ICT in the management and application of weather data through knowledge hub portals and mobile applications. These efforts have opened up the agricultural research data space, particularly the downscaling and interpretation of datasets into context- and location-specific information.

Scoping Machine-Actionable DMPs (maDMPs)


This blog post was written by Ina Smith, Project Manager: African Open Science Platform, Academy of Science of South Africa (ASSAf), DOAJ Ambassador, Southern Africa Region, LIASA Librarian of the Year 2016

maDMPs are a vehicle for reporting on the intentions and outcomes of a research project that enable information exchange across relevant parties and systems. They contain an inventory of key information about a project and its outputs (not just data), with a change history that stakeholders can query for updated information about the project over its lifetime. The basic framework requires common data models for exchanging information, currently under development in the RDA DMP Common Standards WG, as well as a shared ecosystem of services that send notifications and act on behalf of humans. Other components of the vision include machine-actionable policies, persistent identifiers (PIDs) (e.g., ORCID iDs, funder IDs, forthcoming Org IDs, RRIDs for biomedical resources, protocols.io, IGSNs for geosamples, etc), and the removal of barriers for information sharing.
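
To make the idea of an inventory with a queryable change history more concrete, here is a minimal sketch in Python. It is an illustration only and does not follow the common data model under development in the RDA DMP Common Standards WG; the class names, field names and the identifiers used (the all-zero ORCID iD and the `10.0000` DOIs) are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Output:
    """One project output (data, software, protocol, ...) and its PID."""
    title: str
    kind: str
    pid: str  # persistent identifier; the values used below are placeholders


@dataclass
class MaDMP:
    """Hypothetical, minimal maDMP record (not the RDA common data model)."""
    project: str
    contributor_ids: list = field(default_factory=list)  # e.g. ORCID iDs
    outputs: list = field(default_factory=list)          # inventory of outputs
    history: list = field(default_factory=list)          # change log entries

    def add_output(self, output: Output, when: Optional[datetime] = None) -> None:
        """Add an output to the inventory and record the change."""
        self.outputs.append(output)
        self.history.append({
            "when": when or datetime.now(timezone.utc),
            "change": f"added {output.kind}: {output.pid}",
        })

    def changes_since(self, moment: datetime) -> list:
        """Return descriptions of changes made at or after `moment`."""
        return [h["change"] for h in self.history if h["when"] >= moment]


# Placeholder project and identifiers, for illustration only.
plan = MaDMP(project="Example project", contributor_ids=["0000-0000-0000-0000"])
plan.add_output(Output("Survey data", "dataset", "doi:10.0000/example.1"),
                when=datetime(2018, 6, 1, tzinfo=timezone.utc))
plan.add_output(Output("Analysis scripts", "software", "doi:10.0000/example.2"),
                when=datetime(2018, 7, 1, tzinfo=timezone.utc))

# A stakeholder asks what has changed since mid-June.
recent = plan.changes_since(datetime(2018, 6, 15, tzinfo=timezone.utc))
print(recent)
```

In a real ecosystem the change log would be exposed through a service that other systems can query or subscribe to, rather than held in memory as it is here.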


Event: National Academies Report Release – Open Science By Design: Realizing a Vision for 21st Century Research, July 17 at 17:30 UTC

On July 17th, the National Academies of Sciences, Engineering, and Medicine’s Board on Research Data and Information will host a public release of a consensus study report, Open Science by Design: Realizing a Vision for 21st Century Research.

Wide access to scientific research results has proven to be an important tool for accelerating scientific progress. An ad hoc committee under the Board on Research Data and Information (BRDI) conducted a study on the challenges of broadening access to the results of scientific research, described as “open science.” Ongoing advances in information technologies provide researchers with opportunities to share and access scientific articles, research data, and methodology. Transitioning to more open science should enable increased transparency and reliability, facilitate more effective collaboration, accelerate the pace of discovery, and foster broader and more equitable access to scientific knowledge and the research process. The committee produced a consensus report with findings and recommendations that address these issues, with a focus on solutions that move the research enterprise toward open science.

Remote participation is available via Zoom https://nasem.zoom.us/j/691897124


FOSTER’s draft Open Science training courses are now open for public consultation

FOSTER Plus (Fostering the practical implementation of Open Science in Horizon 2020 and beyond) is a 2-year, EU-funded project, carried out by 11 partners across 6 countries. The primary aim is to contribute to a real and lasting shift in the behaviour of European researchers to ensure that Open Science (OS) becomes the norm. To this end, a key objective for the FOSTER project has been to develop a set of ten training courses targeted towards early career researchers.

The draft courses are now available for public consultation at https://www.fosteropenscience.eu/toolkit. We are seeking feedback from the community on how they could be improved. Please fill in the evaluation form by July 31st and provide as much information as possible to help us build courses that fit your needs. Please note that the course quiz functionality is still in development, so you won’t see any feedback or results yet, but please feel free to suggest other quiz questions. The form can be accessed at: https://docs.google.com/forms/d/e/1FAIpQLSfjmfA0lqN09Lt8in4o7bW_IVBPEXnR6fzCvpeC0o3Hyvt72g/viewform

We aimed to reuse existing training content wherever possible and have worked closely with our discipline-specific partners to ensure that pointers to discipline-specific tools and resources have been included. Ten draft courses covering a range of Open Science topics, such as open access, research data, open source software and open peer review, have been developed over the first year of the project. To find out more about the course development, you can read a FOSTER blog post: https://www.fosteropenscience.eu/node/1953.

For more on the FOSTER Plus project or to access the FOSTER portal, please see https://www.fosteropenscience.eu/.

Now accepting applications for OWSD fellowship for Early Career Women Scientists

OWSD is now officially accepting applications for the new fellowship for Early Career Women Scientists (ECWS). Simply click on ‘Apply now’ to begin your application. You may also save your application at any time to continue working on it later; click ‘Resume’ to continue. Please note that application materials will be accepted only through the online system.

As a reminder, this fellowship is a prestigious award of up to USD 50,000, generously provided by Canada’s IDRC/CRDI, and is offered to women scientists from Science and Technology Lagging Countries (STLCs) who have completed their PhDs in Science, Technology, Engineering and Mathematics (STEM) subjects and are employed at an academic or scientific research institute in one of the eligible countries. ECWS fellows will be supported for two years to continue their research at an international level while based at their home institutes, to build up research groups that will attract international visitors, and to link with industry.
Though applications must be submitted in English, all information about the programme is also available in French on the website.

The full Call for Applications is available in English and French.

The deadline for completed online applications is 31 August 2018.

Questions may be sent to earlycareer [at] owsd.net

Outcome of the Urban Data Science Summer School – CEPT University, Ahmedabad, India

This post was written by Shaily Gandhi, who is currently pursuing a PhD in Geomatics from CEPT University, India. Shaily attended the CODATA-RDA School of Research Data Science, hosted at ICTP, near Trieste, Italy.

Urban Data Science is a course that grew out of a collaboration begun at the CODATA Research Data workshop in Trieste in 2017. The course was hosted by CEPT University, Ahmedabad, India, from May 14 to May 19, 2018, to address the poor use of available open data in decision making, with a focus on urban issues. It was designed to get students started with the basic components of data science in a short span of 6 days. The aim was to give insight into open urban data sets and how they can be combined with other freely available sources of data. These data sets allow a deeper understanding of urban areas and their problems, giving students firmer control over possible bias so that they can analyse these situations and propose solutions for overcoming them.

The course was carefully designed for students from different backgrounds, such as planning, architecture, civil engineering, geomatics and other disciplines, at both bachelor’s and master’s level and from both IT and non-IT backgrounds. The lessons on the basics of R were prepared using material from the Software Carpentry lessons Programming with R and R for Reproducible Scientific Analysis. The concepts were retained and the lessons redesigned to focus on urban problems and analysis. The school began with setting a study objective, using techniques to develop a research concept and plan the area of study, bearing in mind the type of data available from open data sets for urban research (continuous, discrete, ordinal or nominal) and the different stages of statistical analysis that can be conducted in order to produce results. Knowledge of research methodologies and the use of statistical software to support data analysis was one of the vital goals of the course. The statistical software package “R” was used, as it has become a very powerful and useful tool for data cleaning, management, statistical analysis and graphical visualization. When mastered, it is user friendly and can reduce the time and effort required of researchers, students and professionals.
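
The four data types mentioned above (nominal, ordinal, discrete and continuous) determine which summary statistics make sense. The course itself used R; the short sketch below uses Python's standard `statistics` module instead, purely as an illustration. The variable names and values are invented and are not taken from the course material or from any of the open data sets.

```python
import statistics


def summarise(values, level):
    """Return a summary statistic suited to the measurement level.

    nominal    -> mode (most frequent category)
    ordinal    -> median of the ranks (order matters, distances do not)
    discrete   -> mean of the counts
    continuous -> mean of the measurements
    """
    if level == "nominal":
        return statistics.mode(values)
    if level == "ordinal":
        # Ordinal values are assumed to be encoded as integer ranks.
        return statistics.median_low(values)
    if level in ("discrete", "continuous"):
        return statistics.mean(values)
    raise ValueError(f"unknown measurement level: {level}")


# Invented example values for four hypothetical urban variables.
land_use = ["residential", "commercial", "residential", "industrial"]  # nominal
congestion_rank = [1, 2, 2, 3, 3]   # ordinal: 1 = low, 2 = medium, 3 = high
households = [2, 4, 4, 6]           # discrete: counts per building
pm25 = [35.5, 40.0, 44.5]           # continuous: measured concentration

print(summarise(land_use, "nominal"))
print(summarise(congestion_rank, "ordinal"))
print(summarise(households, "discrete"))
print(summarise(pm25, "continuous"))
```

Deciding the measurement level before choosing an analysis is the same planning step described above; the mapping from level to statistic shown here is a common textbook convention, not a rule taken from the course.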

Innovative teaching techniques, such as mixing theory and practicals with group work, were followed in this course, as the diverse student group required special attention to keep the whole class at the same pace. Although the course was intense, running from 9 am to 7 pm, it was very motivating to see the students following the topics and keeping up with the pace of the instructor. Daily feedback was taken from the students to inform the tutors’ decisions on class activities. The course was modified daily, with more group activities and practicals, based on the feedback received. Continuous constructive comments from the students made the course more effective, as the tutors were able to achieve the desired output by adapting the teaching method to the students’ requirements. This process of understanding the capability of the students was well appreciated.

The second aim of this course was to transform traditional teaching techniques into a newer form in which students are energetically and innovatively involved in improving the way the course is taught and, in the process, enhance their proficiency in solving data-driven case studies in practice. At the end of the course it was a great pleasure to receive the outputs of the case studies, which applied data science to urban studies. One of the outstanding studies examined traffic violations in Montgomery County, using data from the Public Safety department on the government portal data.montgomerycountymd.gov. Another student used open data to study crime in the city of Chicago. Another study monitored the trend in border-crossing vehicles in the USA, which showed an interesting pattern. A study of air quality for major urban states of the US indicated that the majority of the US is affected by medium concentrations of PM (particulate matter). Many more interesting topics were studied, which gave a very good insight into the students’ understanding of data science. Students analyzed and interpreted the spatial behavior of the urban data with geospatial as well as graph analysis.

Harnessing the power of the digital revolution: Data- and computation-driven research for Environmental Hazards (Session 191)

Please consider submitting an abstract to Harnessing the power of the digital revolution: Data- and computation-driven research for Environmental Hazards (Session 191), offered under the auspices of SciDataCon 2018, which will take place on 5-8 November 2018 in Gaborone, Botswana, as part of International Data Week, convened by CODATA, the ICSU World Data System and the Research Data Alliance. The deadline to submit an abstract is Monday, 25 June: https://www.scidatacon.org/IDW2018/submit/

Important Dates:

Abstract submission: 25 June 2018

Conference: 5-8 November 2018 in Gaborone, Botswana

Conveners:

Carolynne Hultquist, Guido Cervone, Jenni Evans

The Pennsylvania State University, University Park, PA

Submissions may be made at https://www.scidatacon.org/IDW2018/submit/ – the deadline is 23:59 UTC on Monday 25 June. Please contact the organisers if you intend to submit a paper to this or any other session and it may be possible to allow a slight extension.

Session 191 – Call for Abstracts:

The study and understanding of environmental hazards is of crucial importance for the survival of our society and future generations. A single event can claim thousands of lives, cause billions of dollars of damage to infrastructure, and destroy the environment. Nowadays, the risks are the greatest they have been in human history due to the development of mega-cities, dams, nuclear power plants, and other high-risk facilities. We can face these risks by harnessing environmental data to better understand physical properties of hazards, predict the impacts, assess risk to human interests, respond to hazards, and evacuate effectively.

The digital revolution has changed the face of science to take advantage of information technologies and computation that are now part of everyday life. Environmental hazard data are more accessible than ever for research using technologies such as remote sensing, smart sensors, and models. At the same time, an explosion in the use of social media generated new data streams of information that can provide actionable data during emergency situations. These changes led to the collection of unprecedented massive amounts of data about people and their daily interaction with the world. The generation of data is faster than our ability to analyze them, and this is quickly leading towards a data-rich but knowledge-poor environment.

A major challenge is harnessing relevant environmental data for use in computing applications that increase the societal value of the data and can provide assessment of the direct impact of decisions. Methods for increasing the value of data may involve such topics as large-scale data analysis, data mining, data integration, visualization, data exploration or representation. A particular area of interest is handling the challenges of dealing with real-time data and computation to direct changes in such applications as disasters, smart cities, and smart grids.

This session welcomes interdisciplinary environmental research papers at the frontier of the digital revolution in data science and technologies. In addition to environmental science and natural hazards fields, research relevant to this session is likely to come from the fields of geography, meteorology, computing, engineering, health, economics, urban studies, management, policy, etc.

https://www.scidatacon.org/IDW2018/sessions/191/

SciDataCon 2018: The Digital Frontiers of Global Science

SciDataCon 2018 will address the theme of ‘The Digital Frontiers of Global Science’. In a hyperconnected world where the internet is pervasive and web technologies are driving major changes in our lives, research has become more than ever before digital and international. Furthermore, the major societal and scientific challenges facing humanity in this digital age are profoundly global in character, requiring the participation of researchers from all countries and disciplines. The data revolution is also a major source of the scientific opportunities to address these issues but to realize these potentials the frontiers of science, data analysis and stewardship must be advanced. Likewise, the data revolution must be inclusive, benefitting all, and harnessing all energies: no parts of the world and no disciplines should be left behind.

SciDataCon 2018 seeks to explore the digital frontiers of global science by bringing together research and practice papers from a wide range of perspectives. The scope is explicitly broad and inclusive, addressing all aspects of the role of data in research.

The high-level themes of the 2018 edition are:

the digital frontiers of global science;
a global and inclusive data revolution;
applications, progress and challenges of data intensive research;
data infrastructure and enabling practices for international and collaborative research.

An expanded overview of these themes is provided here.

Call for Abstracts – ‘Measuring the Impact of Data Citation Practices in Research’ – SciDataCon, part of International Data Week

The organisers of a session on ‘Measuring the Impact of Data Citation Practices in Research’ at SciDataCon, part of International Data Week, invite the submission of abstracts.

We invite researchers and organisations that are looking at the impact of data citation to consider contributing to this session.

Session Title: Measuring the Impact of Data Citation Practices in Research

Data citation has been advocated across and within many research enterprises globally. Individual researchers have adopted data citation as part of their work, and an increasing number of publishers and funders are now encouraging or requiring some level of data citation. The benefits of data citation are clear: besides increasing the visibility of data resources and improving the integrity of research and publications, there is a general trend of acknowledgment and accreditation being associated with data citation. Researchers are beginning to see the citation of their data as being as important as the citation of their other outputs. While the benefits extend beyond reuse and accreditation, there is little insight into the real impact of data citation. A number of questions have to be addressed; for example, what metrics can be used to measure the impact of data citation, and how should impact be measured?

Information about submissions for SciDataCon can be found at Submit Abstracts for Papers and Posters: https://www.scidatacon.org/IDW2018/submit/

For further information contact Anwar Vahed, CSIR <avahed@csir.co.za>

Submit Abstracts for Papers and Posters: https://www.scidatacon.org/IDW2018/submit/

Call for Papers and Posters: https://www.scidatacon.org/conference/IDW2018/call_for_papers/

Provisionally Accepted Sessions: https://www.scidatacon.org/IDW2018/sessions/

Themes and Scope of SciDataCon: https://www.scidatacon.org/conference/IDW2018/conference_themes_and_scope/

International Data Week comprises the next Plenary Meeting of the Research Data Alliance and the SciDataCon conference on all aspects of the role of data in research. It is taking place in Gaborone, Botswana, 5-8 November 2018.

The deadline for abstract submissions is 25 June.

Assessment of Data Management Practices of the Citizen Science and Crowdsourcing Communities: deadline for SciDataCon abstracts on June 25

This is a reminder that the deadline for submitting abstracts for this session on the validation, curation and management of citizen science data at SciDataCon is Monday, June 25. The organisers would welcome help in recruiting papers, especially from African citizen science groups. Given costs, it may be most pragmatic to focus on southern African (or South African) groups, but citizen science groups or researchers from all over the world are welcome to participate. To submit, go to: https://www.scidatacon.org/IDW2018/sessions/.

Assessment of Data Management Practices of the Citizen Science and Crowdsourcing Communities

Alex de Sherbinin and Anne Bowser

The objectives of the CODATA–WDS Task Group on citizen science data are to better understand the ecosystem of data-generating citizen science, crowdsourcing, and volunteered geographic information (VGI) projects so as to characterize the potential and challenges of these developments for science as a whole, and data science in particular. Through interviews with principals involved in 50 projects, the task group has assessed the methods and approaches for validating various streams of citizen science data, the mechanisms for cleaning and curating the data, and systems in place for the long-term management, documentation and dissemination of those data. This presentation reports on results of this assessment, and provides recommendations to the citizen science / crowdsourcing community on data quality and management practices.

Turning FAIR Data into Reality – Report and Action Plan Consultation until 5 August

The European Commission’s Expert Group on FAIR Data, chaired by Simon Hodson, CODATA Executive Director, published the interim report ‘Turning FAIR Data into Reality’ and the interim ‘FAIR Data Action Plan’ on 11 June 2018 at the Second EOSC Summit in Brussels.

Interim Report and Action Plan

The interim report and Action Plan are available from the Zenodo repository with the DOI-URLs below:

Interim FAIR Data Report: https://doi.org/10.5281/zenodo.1285272
Interim FAIR Data Action Plan: https://doi.org/10.5281/zenodo.1285290

Consultation until 5 August

Consultation is being conducted on the interim report and Action Plan until 5 August 2018. A commentable version of the report is available on Google Drive. Structured comments on the Action Plan and specific recommendations and actions may be made via a dedicated GitHub repository.

Comment on Report: http://bit.ly/interim_FAIR_report
Comment on Action Plan: https://github.com/FAIR-Data-EG/action-plan

The Expert Group will conduct webinars to support and facilitate the consultation, and these will be announced in due course.

About the Expert Group and the Report

Rec. 3: A model for FAIR Data Objects
Implementing FAIR requires a model for FAIR Data Objects which by definition have a PID linked to different types of essential metadata, including provenance and licencing. The use of community standards and sharing of code is also fundamental for interoperability and reuse.
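
As a rough illustration of the recommendation above, a FAIR Data Object can be pictured as a PID linked to several kinds of metadata. The Python sketch below is not the model proposed in the report: the class, the `REQUIRED_METADATA` tuple (limited to the two kinds named in the recommendation) and the placeholder DOI are assumptions made for this example.

```python
from dataclasses import dataclass, field

# Only the two metadata kinds named in the recommendation are checked here;
# a real model would cover more. This tuple is an assumption of the sketch.
REQUIRED_METADATA = ("provenance", "licence")


@dataclass
class FairDataObject:
    """Hypothetical FAIR Data Object: a PID linked to typed metadata."""
    pid: str
    metadata: dict = field(default_factory=dict)

    def missing_metadata(self) -> list:
        """List the required metadata kinds that are absent or empty."""
        return [kind for kind in REQUIRED_METADATA if not self.metadata.get(kind)]


# Placeholder identifier and metadata values, for illustration only.
obj = FairDataObject(
    pid="doi:10.0000/example.3",
    metadata={"provenance": "Collected in an example survey, 2018"},
)
print(obj.missing_metadata())
```

A check of this kind could in principle be run automatically by a repository at deposit time, which is one way the automated workflows mentioned in the report might use such a model.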

It is recognised that FAIR data (data that are Findable, Accessible, Interoperable and Reusable) play an essential role in the objectives of Open Science to improve and accelerate scientific research, to increase the engagement of society, and to contribute significantly to economic growth. Accordingly, ‘the Open Science agenda contains the ambition to make FAIR data sharing the default for scientific research by 2020.’ The overall objective of the European Commission Expert Group on Turning FAIR data into reality is to help operationalise and facilitate the achievement of this goal.

Rec. 4: Components of a FAIR data ecosystem
The realisation of FAIR data relies on, at minimum, the following essential components: policies, DMPs, identifiers, standards and repositories. There need to be registries cataloguing each component of the ecosystem and automated workflows between them.

To this end, this report examines the FAIR data principles, considers other supporting concepts and discusses the changes necessary, as well as existing activities and stakeholders to make these interventions. Recommendations and actions are presented as an Action Plan for consideration by the Commission, Member States and leading stakeholders in the research and data communities.

It might have been possible to take a data centric point of view and to work through the FAIR principles slavishly or systematically (depending on your point of view) asking what needs to be done to achieve each one. The Expert Group decided at an early point that this would not be the most effective approach to our task. Rather we felt it was important to take a holistic and systemic approach and to describe the broader range of changes required to achieve FAIR data. It is hoped that what has emerged will be at one and the same time an Action Plan that will be immediately useful and a longer standing survey and discussion, providing a discursive framework for ongoing considerations of how to make FAIR data a reality.

Consultation is open on the interim report and Action Plan and we actively invite constructive feedback. Does the Action Plan highlight the correct priorities? Are the recommendations sound and the actions tangible and achievable? Are they presented in a way that will helpfully guide the stakeholders mentioned? Is the Action Plan sufficiently grounded in the discussions and arguments of the broader report? Given the way this particular piece of marble has already been cut and carved, what still needs to be done to make a polished statue emerge?

Consultation on the interim report was launched at the EOSC summit on 11 June 2018 and initiated by means of a workshop at that meeting. It will be pursued by online means and by webinars until 5 August. A final version of the Report and Action Plan will be published at the Austrian Presidency event on 23 November.

The group has conducted its work by means of face-to-face and virtual meetings and a lot of asynchronous, collaborative work with the text. All members of the group have contributed substantively and substantially to the text. We hope that we have harnessed the strength and collective wisdom of the Expert Group, while minimising the flaws of group authorship. Our approach has been discursive and we have endeavoured to explore the arguments relating to FAIR in detail to identify the key steps needed for implementation. This is an iterative process and the final version of the report will present a more condensed argument.

The group has been chaired by Simon Hodson, CODATA Executive Director, with Sarah Jones, Associate Director of the Digital Curation Centre, as Rapporteur; but in effect the two have acted as co-chairs.

Membership of the Expert Group

Sandra Collins, National Library of Ireland
Françoise Genova, Observatoire Astronomique de Strasbourg
Natalie Harrower, Digital Repository of Ireland
Simon Hodson, CODATA, Chair of the Group
Sarah Jones, Digital Curation Centre, Rapporteur
Leif Laaksonen, CSC-IT Center for Science
Daniel Mietchen, Data Science Institute, University of Virginia
Rūta Petrauskaité, Vytautas Magnus University
Peter Wittenburg, Max Planck Computing and Data Facility
