Monthly Archives: June 2018

Outcome of the Urban Data Science Summer School – CEPT University, Ahmedabad, India

This post was written by Shaily Gandhi, who is currently pursuing a PhD in Geomatics at CEPT University, India. Shaily attended the CODATA-RDA School of Research Data Science, hosted at ICTP, near Trieste, Italy.

Urban Data Science is a course that grew out of a collaboration begun at the CODATA Research Data workshop in Trieste in 2017. CEPT University, Ahmedabad, India hosted the course from 14 to 19 May 2018 to address the poor use of available open data in decision making, with a focus on urban issues. The course was designed to get students started with the basic components of data science in a short span of six days. Its aim was to give an insight into open urban data sets and into how they can be combined with other freely available sources of data. These data sets allow a deeper understanding of urban areas and their problems, giving students firmer control over possible bias and so helping them to analyse such situations and propose solutions for overcoming them.

The course was carefully designed for students from different backgrounds, such as planning, architecture, civil engineering, geomatics and other disciplines, at both bachelor's and master's level and from both IT and non-IT backgrounds. The lessons on the basics of R were prepared using material from the Software Carpentry lessons Programming with R and R for Reproducible Scientific Analysis. The concepts were retained and the lessons were redesigned to focus on urban problems and analysis. The school began with setting a study objective, using techniques to develop a research concept and plan the area of study, while bearing in mind the type of data available from open data sets for urban research (continuous, discrete, ordinal or nominal) and the different stages of statistical analysis that can be conducted in order to produce results. Knowledge of research methodologies and of the use of statistical software to support data analysis was one of the vital goals of the course. The statistical software package R was used as it has become a very powerful and useful tool for data cleaning, management, statistical analysis and graphical visualization. Once mastered, it is user friendly and can reduce the time and effort required of researchers, students and professionals.

Innovative teaching techniques, such as mixing theory and practicals with group work, were followed in this course, as the diverse group of students required special attention to keep the whole class at the same pace. Although the course was intense, running from 9 am to 7 pm, it was very motivating to see the students following the topics and keeping up with the pace of the instructor. Daily feedback was taken from the students to inform the tutors' decisions about class activities. The course was modified daily, with more group activities and practicals, based on the feedback received. Continuous constructive comments from the students made the course more effective, as the tutors were able to achieve the desired output by changing the teaching method according to the students' requirements. This process of understanding the capability of the students was well appreciated.

The second aim of this course was to transform traditional teaching techniques into a newer form in which the students are energetically and innovatively involved in improving the way the course is taught and, in the process, enhance their proficiency in solving data-driven case studies in practice. By the end of the course it was a great pleasure to receive the outputs of case studies applying data science to urban studies. One of the outstanding studies looked at traffic violations in Montgomery County, using data from the Public Safety department on the government portal. Another student studied crime in the city of Chicago using open data. Another study monitored the trend in border-crossing vehicles in the USA, which showed an interesting pattern. A study of air quality for major urban states of the US indicated that the majority of the US is affected by medium concentrations of PM (particulate matter). Many more interesting topics were studied, which gave a very good insight into the students' understanding of data science. Students analyzed and interpreted the spatial behavior of the urban data with geospatial as well as graph analysis.

Harnessing the power of the digital revolution: Data- and computation-driven research for Environmental Hazards (Session 191)

Please consider submitting an abstract to Harnessing the power of the digital revolution: Data- and computation-driven research for Environmental Hazards (Session 191), offered under the auspices of SciDataCon 2018, which will take place on 5-8 November 2018 in Gaborone, Botswana as part of International Data Week, convened by CODATA, the ICSU World Data System and the Research Data Alliance. The deadline to submit an abstract is Monday, 25 June.

Important Dates:
Abstract submission:           25 June 2018
Conference:                         5-8 November 2018 in Gaborone, Botswana

Carolynne Hultquist, Guido Cervone, Jenni Evans
The Pennsylvania State University, University Park, PA

Submissions may be made at – the deadline is 23:59 UTC on Monday 25 June.  Please contact the organisers if you intend to submit a paper to this or any other session and it may be possible to allow a slight extension.

Session 191 – Call for Abstracts:
The study and understanding of environmental hazards is of crucial importance for the survival of our society and future generations. A single event can claim thousands of lives, cause billions of dollars of damage to infrastructure, and destroy the environment.  Nowadays, the risks are the greatest they have been in human history due to the development of mega-cities, dams, nuclear power plants, and other high-risk facilities. We can face these risks by harnessing environmental data to better understand physical properties of hazards, predict the impacts, assess risk to human interests, respond to hazards, and evacuate effectively.
The digital revolution has changed the face of science to take advantage of information technologies and computation that are now part of everyday life. Environmental hazard data are more accessible than ever for research using technologies such as remote sensing, smart sensors, and models. At the same time, an explosion in the use of social media has generated new data streams of information that can provide actionable data during emergency situations. These changes have led to the collection of unprecedented, massive amounts of data about people and their daily interaction with the world. The generation of data is faster than our ability to analyze them, and this is quickly leading towards a data-rich but knowledge-poor environment.
A major challenge is harnessing relevant environmental data for use in computing applications that increase the societal value of the data and can provide assessment of the direct impact of decisions. Methods for increasing the value of data may involve such topics as large-scale data analysis, data mining, data integration, visualization, data exploration or representation. A particular area of interest is handling the challenges of dealing with real-time data and computation to direct changes in such applications as disasters, smart cities, and smart grids.
This session welcomes interdisciplinary environmental research papers at the frontier of the digital revolution in data science and technologies. In addition to environmental science and natural hazards fields, research relevant to this session is likely to come from the fields of geography, meteorology, computing, engineering, health, economics, urban studies, management, policy, etc.

SciDataCon 2018: The Digital Frontiers of Global Science
SciDataCon 2018 will address the theme of ‘The Digital Frontiers of Global Science’.  In a hyperconnected world where the internet is pervasive and web technologies are driving major changes in our lives, research has become more than ever before digital and international.  Furthermore, the major societal and scientific challenges facing humanity in this digital age are profoundly global in character, requiring the participation of researchers from all countries and disciplines. The data revolution is also a major source of the scientific opportunities to address these issues but to realize these potentials the frontiers of science, data analysis and stewardship must be advanced.  Likewise, the data revolution must be inclusive, benefitting all, and harnessing all energies: no parts of the world and no disciplines should be left behind.
SciDataCon 2018 seeks to explore the digital frontiers of global science by bringing together research and practice papers from a wide range of perspectives. The scope is explicitly broad and inclusive, addressing all aspects of the role of data in research.

The high-level themes of the 2018 edition are:
  • the digital frontiers of global science;
  • a global and inclusive data revolution;
  • applications, progress and challenges of data intensive research;
  • data infrastructure and enabling practices for international and collaborative research.

Call for Abstracts – ‘Measuring the Impact of Data Citation Practices in Research’ – SciDataCon, part of International Data Week

The organisers of a session on ‘Measuring the Impact of Data Citation Practices in Research’ at SciDataCon, part of International Data Week, invite the submission of abstracts.

We invite researchers and organisations that are looking at the impact of data citation to consider contributing to this session.

Session Title: Measuring the Impact of Data Citation Practices in Research

Data citation has been advocated across and within many research enterprises globally. Individual researchers have adopted data citation as part of their work and an increasing number of publishers and funders are now encouraging or requiring some level of data citation. The benefits of data citation are clear: besides increasing the visibility of data resources and improving the integrity of research and publications, there is a general trend of acknowledgment and accreditation being associated with data citation. Researchers are beginning to see the citation of their data as being as important as the citation of their other outputs. While the benefits extend beyond reuse and accreditation, there is little insight into the real impact of data citation. A number of questions have to be addressed; for example, what metrics can be used to measure the impact of data citation and how should impact be measured?

Information about submissions for SciDataCon can be found at Submit Abstracts for Papers and Posters:

For further information contact Anwar Vahed, CSIR.

Submit Abstracts for Papers and Posters:

Call for Papers and Posters:

Provisionally Accepted Sessions:

Themes and Scope of SciDataCon:

International Data Week comprises the next Plenary Meeting of the Research Data Alliance and the SciDataCon conference on all aspects of the role of data in research. It is taking place in Gaborone, Botswana, 5-8 November 2018.

The deadline for abstract submissions is 25 June.

Assessment of Data Management Practices of the Citizen Science and Crowdsourcing Communities: deadline for SciDataCon abstracts on June 25

This is a reminder that the deadline for submitting abstracts for this session on the validation, curation and management of citizen science data at SciDataCon is Monday, June 25. The organisers would welcome help in recruiting papers, especially from African citizen science groups. Given costs, it may be most pragmatic to focus on southern African (or South African) groups, but citizen science groups or researchers from all over the world are welcome to participate. To submit, go to:

Assessment of Data Management Practices of the Citizen Science and Crowdsourcing Communities

Alex de Sherbinin and Anne Bowser

The objectives of the CODATA–WDS Task Group on citizen science data are to better understand the ecosystem of data-generating citizen science, crowdsourcing, and volunteered geographic information (VGI) projects so as to characterize the potential and challenges of these developments for science as a whole, and data science in particular. Through interviews with principals involved in 50 projects, the task group has assessed the methods and approaches for validating various streams of citizen science data, the mechanisms for cleaning and curating the data, and systems in place for the long-term management, documentation and dissemination of those data. This presentation reports on results of this assessment, and provides recommendations to the citizen science / crowdsourcing community on data quality and management practices.

Turning FAIR Data into Reality – Report and Action Plan Consultation until 5 August

The European Commission’s Expert Group on FAIR Data, chaired by Simon Hodson, CODATA Executive Director, published the interim report ‘Turning FAIR Data into Reality’ and the interim ‘FAIR Data Action Plan’ on 11 June 2018 at the Second EOSC Summit in Brussels.

Interim Report and Action Plan

The interim report and Action Plan are available from the Zenodo repository with the DOI-URLs below:

Consultation until 5 August

Consultation is being conducted on the interim report and Action Plan until 5 August 2018.  A commentable version of the report is available on Google Drive.  Structured comments on the Action Plan and specific recommendations and actions may be made via a dedicated GitHub repository.

The Expert Group will conduct webinars to support and facilitate the consultation and these will be announced in due course.

About the Expert Group and the Report

Rec. 3: A model for FAIR Data Objects
Implementing FAIR requires a model for FAIR Data Objects which by definition have a PID linked to different types of essential metadata, including provenance and licencing. The use of community standards and sharing of code is also fundamental for interoperability and reuse.

It is recognised that FAIR data (data that are Findable, Accessible, Interoperable and Reusable) play an essential role in the objectives of Open Science to improve and accelerate scientific research, to increase the engagement of society, and to contribute significantly to economic growth. Accordingly, ‘the Open Science agenda contains the ambition to make FAIR data sharing the default for scientific research by 2020.’ The overall objective of the European Commission Expert Group on Turning FAIR data into reality is to help operationalise and facilitate the achievement of this goal.

Rec. 4: Components of a FAIR data ecosystem
The realisation of FAIR data relies on, at minimum, the following essential components: policies, DMPs, identifiers, standards and repositories. There need to be registries cataloguing each component of the ecosystem and automated workflows between them.

To this end, this report examines the FAIR data principles, considers other supporting concepts and discusses the changes necessary, as well as existing activities and stakeholders to make these interventions. Recommendations and actions are presented as an Action Plan for consideration by the Commission, Member States and leading stakeholders in the research and data communities.

It might have been possible to take a data centric point of view and to work through the FAIR principles slavishly or systematically (depending on your point of view) asking what needs to be done to achieve each one. The Expert Group decided at an early point that this would not be the most effective approach to our task. Rather we felt it was important to take a holistic and systemic approach and to describe the broader range of changes required to achieve FAIR data. It is hoped that what has emerged will be at one and the same time an Action Plan that will be immediately useful and a longer standing survey and discussion, providing a discursive framework for ongoing considerations of how to make FAIR data a reality.

Consultation is open on the interim report and Action Plan and we actively invite constructive feedback. Does the Action Plan highlight the correct priorities? Are the recommendations sound and the actions tangible and achievable? Are they presented in a way that will helpfully guide the stakeholders mentioned? Is the Action Plan sufficiently grounded in the discussions and arguments of the broader report? Given the way this particular piece of marble has already been cut and carved, what still needs to be done to make a polished statue emerge?

Consultation on the interim report was launched at the EOSC summit on 11 June 2018 and initiated by means of a workshop at that meeting. It will be pursued by online means and by webinars until 5 August. A final version of the Report and Action Plan will be published at the Austrian Presidency event on 23 November.

The group has conducted its work by means of face-to-face and virtual meetings and a lot of asynchronous, collaborative work with the text. All members of the group have contributed substantively and substantially to the text. We hope that we have harnessed the strength and collective wisdom of the Expert Group, while minimising the flaws of group authorship. Our approach has been discursive and we have endeavoured to explore the arguments relating to FAIR in detail to identify the key steps needed for implementation. This is an iterative process and the final version of the report will present a more condensed argument.

The group has been chaired by Simon Hodson, CODATA Executive Director, with Sarah Jones, Associate Director of the Digital Curation Centre, as Rapporteur; but in effect the two have acted as co-chairs.

Membership of the Expert Group

  • Sandra Collins, National Library of Ireland
  • Françoise Genova, Observatoire Astronomique de Strasbourg
  • Natalie Harrower, Digital Repository of Ireland
  • Simon Hodson, CODATA, Chair of the Group
  • Sarah Jones, Digital Curation Centre, Rapporteur
  • Leif Laaksonen, CSC-IT Center for Science
  • Daniel Mietchen, Data Science Institute, University of Virginia
  • Rūta Petrauskaitė, Vytautas Magnus University
  • Peter Wittenburg, Max Planck Computing and Data Facility

Reliable ICT Infrastructure a condition for research data sharing – African NRENs to play an important role

Whether it is called a guideline, a roadmap or a framework, all participants at the AOSP ICT Infrastructure meeting held on 14 May 2018 in Pretoria, South Africa agreed that a document guiding African countries in preparing ICT infrastructures in support of research data sharing will be of benefit to all. The one-day meeting brought together key stakeholders. Attendees from African regional NRENs (National Research and Education Networks) included Dr Pascal Hoba (Chief Executive Officer, UbuntuNet Alliance), Dr Ousmane Moussa Tessa (Chief Executive Officer, NigerREN and member of the WACREN Board, on behalf of Dr Boubakar Barry, Executive Director, WACREN), Dr Yousef Torman (Managing Director, ASREN) and Dr Leon Staphorst (Executive Director, SANRen).

The objective of this meeting was to help NRENs better understand the needs of collaborative, data-intensive research projects, and for NRENs to consider future service delivery in support of research data. The three projects represented were H3ABioNet (Prof Nicky Mulder, Head: Computational Biology, UCT & Lead: H3ABioNet), GBIF (Dr Mélianie Raymond, Senior Programme Officer for Node Development, GBIF Secretariat) and the Square Kilometre Array Organisation, SA (Dr Jasper Horrell).

The regional NRENs represented indicated that they fully support working with AOSP on developing and populating a framework as part of service delivery to their research communities, and inviting national NRENs in their respective regions to explore opportunities. Important elements to be included in such a document have been identified, and the group will continue as a working group, building on what is already in place through the SADC Cyberinfrastructure Framework, of which Prof Colin Wright provided an overview. This framework was approved by SADC ministers in June 2016. The next step would be to revisit it and adapt it, where needed, for the whole of Africa, with input from key stakeholders across the continent. It was also clear that, through possible partnerships and lessons learned from KENET, Ilifu, DIRISA, Sci-GaIA and more, the design, development and implementation of ICT infrastructures in support of data sharing and curation can become a reality sooner rather than later.

The AOSP ICT Infrastructure Framework will be tested at various stages and across different domains before it is finalized and shared with African countries interested in advancing the sharing and responsible management of data.

Research Data Management: Opportunity for continuing professional development in LIS at UCT

Occasional course in Research Data Management (24 credits)

The Library and Information Studies Centre at the University of Cape Town offers a master’s level course in Research Data Management that is ideal for persons and/or organisations seeking continuing professional development in this new skills area.
Lifecycle Models | Data Management Planning | Policy Analysis & Development | Challenges to Data Curation
6 weeks, starting 21 September 2018
Closing date for application: 20 July 2018
Entry requirements: NQF level 8 (Honours or equivalent)
Blended online/contact format ideal for students based outside of Cape Town
To apply, visit (On application, apply for Occasional Postgraduate Studies: Level of Qualification: Postgrad Non- Degree; Faculty: Humanities)
Library and Information Studies Centre, University of Cape Town
tel.: 021 650 4546

Register now! AfriGEOSS Week 2018

The AfriGEOSS Week 2018 will take place from 22 to 29 June 2018. The 3rd AfriGEOSS Symposium will be held during AfriGEOSS Week, from 26 to 28 June 2018 with some training sessions taking place beforehand.

The Symposium is hosted by the Agence Gabonaise d’Etudes et d’Observations Spatiales (AGEOS) and the theme is “Building smarter Earth observations to support sustainable development policies”.

The objectives include:

  • Engage with end users, particularly policy and decision makers, to understand information needs for evidence-based policy-making and raise awareness on the value of Earth observations in meeting those needs;
  • Showcase the use of Earth observations in implementing the United Nations Sustainable Development Goals (SDGs) and development policies at national and regional levels;
  • Reinforce dialogue on Earth observations priorities in Africa and promote or build synergies with ongoing and planned Earth observations initiatives at the national, regional and international levels – to draw linkages with the implementation of the African Space Policy according to development policies;
  • Strengthen regional and national thematic Earth observations coordination mechanisms to broaden African participation in the Group on Earth Observations and AfriGEOSS activities; and
  • Review the implementation of the 2017 AfriGEOSS Symposium outcomes and contributions, and establish a mechanism of Monitoring and Evaluation for the future.

For more information and registration visit the AfriGEOSS Week website.

Register Now: June 14 Symposium on Statistics and Data Science for a Cyber Secure Internet of Things

Statistics and Data Science for a Cyber Secure Internet of Things 

Rapid growth in the number of devices connected through the internet of things (IoT) poses major challenges to maintaining connectivity, functionality, and security, as demonstrated by prominent cyber attacks launched through IoT devices. Traditional approaches in cyber security such as firewalls and encryption aim to prevent malicious intrusion; however, additional countermeasures and approaches are necessary to detect and respond to malicious behavior and to identify when devices or data are compromised.

The National Academies invite you to attend a symposium and webcast on Statistics and Data Science for a Cyber Secure Internet of Things on June 14, 2018 in Washington, DC. During the event, speakers will discuss the role of statistical models and theory for IoT and for detecting, overcoming, and neutralizing cyber attacks.

Date/Time: June 14, 2018 from 1-5 p.m. EDT
Location: Keck Center, Room 100
500 Fifth St. NW, Washington, DC 20001
Or via webcast