Author Archives: codata_blog

DAMDID 2018 Workshop on FAIR Data and EOSC Announcement

The acronyms FAIR Data and EOSC, which emerged within the European Union, have achieved global awareness. Now it is time to start a global discussion and to integrate views and experience across disciplinary and geographic boundaries. The FAIR principles require data to be Findable, Accessible, Interoperable and Re-usable. While there seem to be practicable solutions for making digital objects (files, cloud objects, structures in SQL or NoSQL databases, spreadsheets, etc.) findable and accessible by assigning persistent identifiers and rich metadata to them, ways to increase interoperability and improve re-usability are not as evident. Different data organisations, data models and semantic spaces need to be explicitly described and mapped, which is known to be very time-consuming at present, in particular since the data universe is changing rapidly. Conditions for re-use need to be made explicit, and these are also difficult to formulate. Moreover, all specifications need to be available in machine-actionable form to enable automatic processing, which is the only way to scale up processing in data-rich science in the future. A new culture of exchanging data and new methods are required to make FAIR data a reality.
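As a rough illustration of what "machine-actionable" can mean in practice, the following minimal Python sketch checks a metadata record for fields that support each of the four FAIR facets. The field names, the grouping of fields by facet, and the example record (including its identifier and URL) are assumptions made for this illustration; they are not taken from the FAIR principles themselves or from any particular metadata standard.

```python
import json

# Hypothetical field names, grouped by FAIR facet for illustration only;
# they are not drawn from any specific metadata standard.
REQUIRED_FIELDS = {
    "findable": ["identifier", "title", "keywords"],
    "accessible": ["access_url", "access_protocol"],
    "interoperable": ["format", "vocabulary"],
    "reusable": ["license", "provenance"],
}


def fair_gaps(record: dict) -> dict:
    """Return, per FAIR facet, the fields that are missing or empty."""
    gaps = {}
    for facet, fields in REQUIRED_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[facet] = missing
    return gaps


# Made-up example record; the identifier and URL are placeholders.
record = {
    "identifier": "doi:10.1234/example",
    "title": "Example soil moisture measurements",
    "keywords": ["soil", "moisture"],
    "access_url": "https://repository.example.org/datasets/42",
    "access_protocol": "HTTPS",
    "format": "text/csv",
    "vocabulary": "",  # no shared vocabulary declared yet
    "license": "CC-BY-4.0",
    # "provenance" is missing entirely
}

if __name__ == "__main__":
    # Report which facets still lack supporting metadata.
    print(json.dumps(fair_gaps(record), indent=2))
```

In this sketch the record is findable and accessible, while the empty vocabulary and the missing provenance are flagged under interoperability and re-usability, which mirrors the point that these two facets are the harder ones to satisfy.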

In addition, the European Commission started the European Open Science Cloud (EOSC) initiative to develop the eco-system of infrastructures that will help to realise a FAIR data domain for data intensive sciences. It needs to include salient infrastructure components and integrate different services as they are being developed by research disciplines, by existing infrastructure initiatives as well as by computer science driven initiatives in a highly interoperable manner. Obviously, EOSC is describing a process that is agile enough to adapt to the dynamic changes in our data universe and not a blueprint for a top-down designed infrastructure.

This workshop will bring together a number of key experts who have been actively leading the agendas with respect to FAIR and EOSC and who come from different research disciplines, including computer science, to share their views and expectations about FAIR and EOSC. In particular, ways to achieve a higher degree of interoperability and re-usability will be discussed. The workshop will also be an occasion for experts from different regions to discuss the state and future of the FAIR action plan and the EOSC initiative, bring in their views and enrichments, and discuss active participation.

Therefore, this workshop will be a milestone not only in creating more awareness about FAIR and EOSC, but also in refining the current plans. In particular, we hope to clarify what Open Science means for countries outside the EU, as well as the perspectives of the EC FAIR/EOSC initiative for involving all interested stakeholders beyond the EU.

For further information on FAIR and EOSC we refer to the following documents:

Venue

Lomonosov Moscow State University
October 9, 2018

Organizers

Peter Wittenburg, Leonid Kalinichenko

Program

A tentative version of the workshop program is available.

Data + Technology can improve African Agriculture

Many smallholder farmers start and manage their agribusiness ventures on a trial-and-error basis. They hit the ground with gusto but with no concrete information on farming, which leads to costly blunders. That problem may now be on its way to being solved. KALRO (Kenya Agricultural & Livestock Research Organisation) has launched 14 agribusiness apps to help farmers make informed choices as they undertake their agribusiness ventures.

The apps were launched during the just-concluded East Africa Farmers Digital conference, where KALRO’s ICT Director Boniface Akuku said the organisation has realised that most farmers run their projects without expert knowledge. To address that problem, the research body is on a mission to digitise the agricultural value chain.

“Ours is to ensure we give farmers research-based information to ensure they run their projects with success. Have you noticed that most large-scale farmers are doing well but small-scale ones are struggling? The reason being while the big players are operating using data, the smallholder farmers are operating blindly because they have no access to key information on how to farm.”

The organisation therefore hopes to solve that problem and ensure farmers have access to key information through ICT tools, said Akuku. The apps give step-by-step information on how to manage avocado, banana, garlic, spider flower and cassava farms. They touch on critical aspects from planting to harvesting and marketing. The other apps address fall armyworm reporting and mapping, maize varieties resistant to grey leaf spot disease, and maize lethal necrosis disease control.

Read more at:

Report on 11th RDA conference: “From data to knowledge”

The 11th RDA Plenary took place in Berlin, Germany, from 189th to 23rd March 2018. The meeting brought together a wide range of experts, including researchers, data scientists, knowledge experts and practitioners. From the proceedings it is clear that a number of stakeholders have been, and continue to be, engaged in advancing a number of projects and initiatives in the open data space. It is also evident that data is driving many economies, including those of developing countries.

In the recent past there has been increased demand for data-driven empowering techniques, such as knowledge on fertilizer application to help farmers, and for the delivery of real-time and mobile data and information. These demands require intelligent systems that can learn, adapt and automatically act on data. In addition, there is increasing recognition of a detailed digital reflection of the physical world we live in, where most agricultural systems have to be fed with digital information, which requires information to be made available in digital format. Furthermore, new and emerging means of collaboration require dynamic connections among people, processes, devices and services. These trends are signs that stakeholders working in the data space should be aware that today’s agricultural performance relies on connecting agencies and farmers. These requirements can be met through innovative use of data, for instance by creating educational programs for farmers on best farming practices using open data from research systems.

In this regard, many speakers acknowledged the efforts that Research Data Alliance (RDA) members have continued to put into moving the data agenda forward. During the pre-meeting and the main meeting, numerous data-driven opportunities were mentioned. As part of the way forward, it was emphasized that the development of interfaces between research, industry actors, governments, society and users (e.g. farmers) can be enhanced through the use of data. Moreover, the need for more ICT infrastructure to enable collaborative research practices and timely access to information was underscored. The meeting noted an explosion in demand for information, innovation and data being turned into insight.

One interesting new dimension is the move from experience-based evidence to data-driven evidence, which is a result of increased computer intelligence over human intelligence, as shown in the graph below.

While human intelligence depends on experience, computer intelligence depends heavily on data.

The pre-meeting agreed on key deliverables that must be undertaken to address some of the challenges facing data management and the open data debate. These deliverables also form the action plan for the future, and they include:

  1. Capacity Building for researchers, data scientists and information experts in the following areas
    • Digitization of data collection
    • Rescue of historical data
    • Skills on data science
    • Techniques for data analysis and visualization
    • Copyright & Database rights
    • Advocate for open data and information sharing policies
  2. Capacity Building for practitioners and users
    • On-farm application
    • Interpretation of Good Agricultural Practices (GAP) farming advisories
  3. Development of online platforms including mobile applications
    • Set up an open data and information repository for data project

Key players promoting open data principles, specifically GODAN Action, CODATA and FAO, agreed to work together through partnership and collaborative frameworks in all data initiatives and projects, starting with joint sessions at the upcoming IDW2018 conference. There was also a proposal for an RDA education and training working group. GODAN Action also announced its next capacity building program, scheduled to start in April 2018. The first program was very successful, with participants from all over the globe.

The CODATA Task Group on Agriculture presented its main use cases: farmer data and innovation, work on weather data through establishing a community of practice (COP), and last-mile capacity building. The outcomes have been ICT innovation use cases such as the agro-weather tool and other online data platforms, strengthened adaptive capacities of users, and data turned into insights, as summarized below.

Areas in which the CODATA ATG has made good progress include Research Informatics concepts, through developing ICT innovations, tools and systems that turn data into insights. Looking forward, there are plans for training on data science and for the development of a Big Data platform using artificial intelligence, machine learning and data mining. One of the main achievements is the use of ICT in the management and application of weather data through knowledge hub portals and mobile applications. These efforts have opened up the agricultural research data space, particularly the downscaling and interpretation of datasets into context- and location-specific information.

Scoping Machine-Actionable DMPs (maDMPs)

This blog post was written by Ina Smith, Project Manager: African Open Science Platform, Academy of Science of South Africa (ASSAf), DOAJ Ambassador, Southern Africa Region, LIASA Librarian of the Year 2016

maDMPs are a vehicle for reporting on the intentions and outcomes of a research project that enable information exchange across relevant parties and systems. They contain an inventory of key information about a project and its outputs (not just data), with a change history that stakeholders can query for updated information about the project over its lifetime. The basic framework requires common data models for exchanging information, currently under development in the RDA DMP Common Standards WG, as well as a shared ecosystem of services that send notifications and act on behalf of humans. Other components of the vision include machine-actionable policies, persistent identifiers (PIDs) (e.g., ORCID iDs, funder IDs, forthcoming Org IDs, RRIDs for biomedical resources, protocols.io, IGSNs for geosamples, etc.), and the removal of barriers for information sharing.
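To make the idea of an inventory with a queryable change history more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the class names, field names and example values (including the placeholder ORCID iD and DOI) are hypothetical, and the structure is not the data model being developed by the RDA DMP Common Standards WG.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Change:
    """One entry in the plan's change history (hypothetical structure)."""
    when: date
    field_name: str
    new_value: str


@dataclass
class MaDMP:
    """Toy machine-actionable DMP: an inventory of outputs plus a history."""
    project_title: str
    contact_orcid: str  # PID for the contact person (placeholder value below)
    outputs: dict = field(default_factory=dict)  # output name -> status or PID
    history: list = field(default_factory=list)

    def update_output(self, name: str, value: str, when: date) -> None:
        """Record the new state of an output and log the change."""
        self.outputs[name] = value
        self.history.append(Change(when, f"outputs.{name}", value))

    def changes_since(self, since: date) -> list:
        """Return all logged changes made on or after the given date."""
        return [change for change in self.history if change.when >= since]


# Placeholder identifiers, not real PIDs.
plan = MaDMP("Example project", "0000-0000-0000-0000")
plan.update_output("dataset", "planned", date(2018, 1, 15))
plan.update_output("dataset", "doi:10.1234/example", date(2018, 6, 1))

if __name__ == "__main__":
    # A funder or repository service could poll for recent updates like this.
    for change in plan.changes_since(date(2018, 3, 1)):
        print(change.when.isoformat(), change.field_name, change.new_value)
```

A stakeholder system would typically ask only for what changed since its last check, which is why the sketch exposes `changes_since` rather than returning the full history each time.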

Read full blog post

Event: National Academies Report Release – Open Science By Design: Realizing a Vision for 21st Century Research, July 17 at 17:30 UTC

On July 17th, the National Academies of Sciences, Engineering, and Medicine’s Board on Research Data and Information will host a public release of a consensus study report, Open Science by Design: Realizing a Vision for 21st Century Research.

Wide access to scientific research results has proven to be an important tool for accelerating scientific progress. An ad hoc committee under the Board on Research Data and Information (BRDI) conducted a study on the challenges of broadening access to the results of scientific research, described as “open science.” Ongoing advances in information technologies provide researchers with opportunities to share and access scientific articles, research data, and methodology. Transitioning to more open science should enable increased transparency and reliability, facilitate more effective collaboration, accelerate the pace of discovery, and foster broader and more equitable access to scientific knowledge and the research process. The committee produced a consensus report with findings and recommendations that address these issues, with a focus on solutions that move the research enterprise toward open science.

Remote participation is available via Zoom https://nasem.zoom.us/j/691897124

Register here

View additional details about this event

FOSTER’s draft Open Science training courses are now open for public consultation

FOSTER Plus (Fostering the practical implementation of Open Science in Horizon 2020 and beyond) is a 2-year, EU-funded project, carried out by 11 partners across 6 countries. The primary aim is to contribute to a real and lasting shift in the behaviour of European researchers to ensure that Open Science (OS) becomes the norm. To this end, a key objective for the FOSTER project has been to develop a set of ten training courses targeted towards early career researchers.

The draft courses are now available for public consultation at https://www.fosteropenscience.eu/toolkit. We are seeking feedback from the community on how they could be improved. Please fill in the evaluation form by July 31st and provide as much information as possible to help us build courses that fit your needs. Please note that the course quiz functionality is still in development, so you won’t see any feedback or results yet, but please feel free to suggest other quiz questions. The form can be accessed at: https://docs.google.com/forms/d/e/1FAIpQLSfjmfA0lqN09Lt8in4o7bW_IVBPEXnR6fzCvpeC0o3Hyvt72g/viewform

We aimed to reuse existing training content wherever possible and have worked closely with our discipline-specific partners to ensure that pointers to discipline-specific tools and resources have been included. Ten draft courses covering a range of Open Science topics such as open access, research data, open source software, and open peer review have been developed over the first year of the project. To find out more about the course development, you can read a FOSTER blog post: https://www.fosteropenscience.eu/node/1953.

For more on the FOSTER Plus project or to access the FOSTER portal, please see https://www.fosteropenscience.eu/.

Now accepting applications for OWSD fellowship for Early Career Women Scientists

OWSD is now officially accepting applications for the new fellowship for Early Career Women Scientists (ECWS). Simply click on ‘Apply now’ to begin your application. You may also save your application at any time to continue working on it later; click ‘Resume’ to continue. Please note that application materials will be accepted only through the online system.

As a reminder, this fellowship is a prestigious award of up to USD 50,000, generously provided by Canada’s IDRC/CRDI, and is offered to women scientists from Science and Technology Lagging Countries (STLCs) who have completed their PhDs in Science, Technology, Engineering and Mathematics (STEM) subjects and are employed at an academic or scientific research institute in one of the eligible countries. ECWS fellows will be supported for two years to continue their research at an international level while based at their home institutes, to build up research groups that will attract international visitors, and to link with industry.
Though applications must be submitted in English, all information about the programme is also available in French on the website.

The full Call for Applications is available in English and French.

The deadline for completed online applications is 31 August 2018.

Questions may be sent to earlycareer [at] owsd.net

Outcome of the Urban Data Science Summer School – CEPT University, Ahmedabad, India

This post was written by Shaily Gandhi, who is currently pursuing a PhD in Geomatics from CEPT University, India. Shaily attended the CODATA-RDA School of Research Data Science, hosted at ICTP, near Trieste, Italy.

Urban Data Science is a course that grew out of a collaboration begun at the CODATA Research Data workshop in Trieste in 2017. The course was hosted by CEPT University, Ahmedabad, India, from May 14 to May 19, 2018, to address the poor use of available open data in decision making, with a focus on urban issues. It was designed to get students started with the basic components of data science in a short span of 6 days. The aim was to give insight into open urban data sets and how they can be combined with other freely available sources of data. These data sets allow a deeper understanding of urban areas and their problems, giving students firmer control over possible bias so that they can analyse these situations and propose solutions for overcoming them.

The course was carefully designed for bachelor's and master's students from different backgrounds, such as planning, architecture, civil engineering, geomatics and other disciplines, with both IT and non-IT experience. The lessons on the basics of R were prepared using material from the Software Carpentry lessons Programming with R and R for Reproducible Scientific Analysis. The concepts were retained and the lessons were redesigned to focus on urban problems and analysis. The school began with setting a study objective, using techniques to develop a research concept and plan the area of study, bearing in mind the type of data available from open data sets for urban research (continuous, discrete, ordinal or nominal) and the different stages of statistical analysis that can be conducted in order to produce results. Building knowledge of research methodologies and of the use of statistical software to support data analysis was one of the vital goals of the course. The statistical software package “R” was used, as it has become a very powerful and useful tool for data cleaning, management, statistical analysis and graphical data visualization. Once mastered, it is user friendly and can reduce the time and effort required of researchers, students and professionals.

Innovative teaching techniques, such as mixing theory and practicals with group work, were followed in this course, as the students attending were diverse and special attention was required to keep the whole class at the same pace. Although the course was intense, running from 9 am to 7 pm, it was very motivating to see the students following the topics and keeping up with the pace of the instructor. Daily feedback was taken from the students to inform the tutors' decisions on class activities. The course was modified daily, with more group activities and practicals based on the feedback received. Continuous constructive comments from the students made the course more effective, as the tutors were able to achieve the desired output by adapting the teaching method to the students' requirements. This process of understanding the capability of the students was well appreciated.

The second aim of this course is to transform traditional teaching techniques into a newer form in which the students have an energetic and innovative involvement in improving the way the course is taught and, in the process, enhance their proficiency in solving data-driven case studies in practice. By the end of the course it was a great pleasure to receive the outputs of the case studies, which applied data science to urban studies. One of the outstanding studies examined traffic violations in Montgomery County, using data from the Public Safety department published on the government portal data.montgomerycountymd.gov. Another student used open data to study crime in the city of Chicago. Another study monitored the trend in border-crossing vehicles in the USA and showed an interesting pattern. A study of air quality in major urban states of the US suggested that the majority of the US is affected by medium concentrations of PM (particulate matter). Many more interesting topics were studied, which gave very good insight into the students' understanding of data science. Students analyzed and interpreted the spatial behavior of the urban data with geospatial as well as graph analysis.

Harnessing the power of the digital revolution: Data- and computation-driven research for Environmental Hazards (Session 191)

Please consider submitting an abstract to Harnessing the power of the digital revolution: Data- and computation-driven research for Environmental Hazards (Session 191), offered under the auspices of SciDataCon 2018, which will take place on 5-8 November 2018 in Gaborone, Botswana as part of International Data Week, convened by CODATA, the ICSU World Data System and the Research Data Alliance. The deadline to submit an abstract is Monday, 25 June: https://www.scidatacon.org/IDW2018/submit/

Important Dates:
Abstract submission:           25 June 2018
Conference:                         5-8 November 2018 in Gaborone, Botswana

Conveners:
Carolynne Hultquist, Guido Cervone, Jenni Evans
The Pennsylvania State University, University Park, PA

Submissions may be made at https://www.scidatacon.org/IDW2018/submit/ – the deadline is 23:59 UTC on Monday 25 June.  Please contact the organisers if you intend to submit a paper to this or any other session and it may be possible to allow a slight extension.

Session 191 – Call for Abstracts:
The study and understanding of environmental hazards is of crucial importance for the survival of our society and future generations. A single event can claim thousands of lives, cause billions of dollars of damage to infrastructure, and destroy the environment.  Nowadays, the risks are the greatest they have been in human history due to the development of mega-cities, dams, nuclear power plants, and other high-risk facilities. We can face these risks by harnessing environmental data to better understand physical properties of hazards, predict the impacts, assess risk to human interests, respond to hazards, and evacuate effectively.
The digital revolution has changed the face of science to take advantage of information technologies and computation that are now part of everyday life. Environmental hazard data are more accessible than ever for research using technologies such as remote sensing, smart sensors, and models. At the same time, an explosion in the use of social media has generated new data streams that can provide actionable information during emergency situations. These changes have led to the collection of unprecedented amounts of data about people and their daily interaction with the world. Data are being generated faster than we are able to analyze them, and this is quickly leading towards a data-rich but knowledge-poor environment.
A major challenge is harnessing relevant environmental data for use in computing applications that increase the societal value of the data and can provide assessment of the direct impact of decisions. Methods for increasing the value of data may involve such topics as large-scale data analysis, data mining, data integration, visualization, data exploration or representation. A particular area of interest is handling the challenges of dealing with real-time data and computation to direct changes in such applications as disasters, smart cities, and smart grids.
This session welcomes interdisciplinary environmental research papers at the frontier of the digital revolution in data science and technologies. In addition to the environmental science and natural hazards fields, research relevant to this session is likely to come from the fields of geography, meteorology, computing, engineering, health, economics, urban studies, management, policy, etc.

SciDataCon 2018: The Digital Frontiers of Global Science
SciDataCon 2018 will address the theme of ‘The Digital Frontiers of Global Science’. In a hyperconnected world where the internet is pervasive and web technologies are driving major changes in our lives, research has become more than ever before digital and international. Furthermore, the major societal and scientific challenges facing humanity in this digital age are profoundly global in character, requiring the participation of researchers from all countries and disciplines. The data revolution is also a major source of the scientific opportunities to address these issues, but to realize this potential the frontiers of science, data analysis and stewardship must be advanced. Likewise, the data revolution must be inclusive, benefitting all, and harnessing all energies: no parts of the world and no disciplines should be left behind.
SciDataCon 2018 seeks to explore the digital frontiers of global science by bringing together research and practice papers from a wide range of perspectives. The scope is explicitly broad and inclusive, addressing all aspects of the role of data in research.

The high-level themes of the 2018 edition are:
  • the digital frontiers of global science;
  • a global and inclusive data revolution;
  • applications, progress and challenges of data intensive research;
  • data infrastructure and enabling practices for international and collaborative research.

Call for Abstracts – ‘Measuring the Impact of Data Citation Practices in Research’ – SciDataCon, part of International Data Week

The organisers of a session on ‘Measuring the Impact of Data Citation Practices in Research’ at SciDataCon, part of International Data Week, invite the submission of abstracts.

We invite researchers and organisations that are looking at the impact of data citation to consider contributing to this session.

Session Title: Measuring the Impact of Data Citation Practices in Research

Data citation has been advocated across and within many research enterprises globally. Individual researchers have adopted data citation as part of their work, and an increasing number of publishers and funders are now encouraging or requiring some level of data citation. The benefits of data citation are clear: besides increasing the visibility of data resources and improving the integrity of research and publications, there is a general trend of acknowledgment and accreditation being associated with data citation. Researchers are beginning to see the citation of their data as being as important as citation of their other outputs. Although the benefits extend beyond reuse and accreditation, there is little insight into the real impact of data citation. A number of questions have to be addressed; for example, what metrics can be used to measure the impact of data citation, and how should impact be measured?

Information about submissions for SciDataCon can be found at Submit Abstracts for Papers and Posters: https://www.scidatacon.org/IDW2018/submit/

For further information contact Anwar Vahed, CSIR <avahed@csir.co.za>

Submit Abstracts for Papers and Posters: https://www.scidatacon.org/IDW2018/submit/

Call for Papers and Posters: https://www.scidatacon.org/conference/IDW2018/call_for_papers/

Provisionally Accepted Sessions: https://www.scidatacon.org/IDW2018/sessions/

Themes and Scope of SciDataCon: https://www.scidatacon.org/conference/IDW2018/conference_themes_and_scope/

International Data Week comprises the next Plenary Meeting of the Research Data Alliance and the SciDataCon conference on all aspects of the role of data in research. It is taking place in Gaborone, Botswana, 5-8 November 2018.

The deadline for abstract submissions is 25 June.