Urban Data Science School from May 13 – May 23, 2019

This article was first published by instructors Dr. Shaily R. Gandhi and Felix Emeka Anyiam at https://shailygandhi.github.io/UrbanDataScience2019/. Shaily and Felix are both alumni of the CODATA-RDA School of Research Data Science.

The second summer school on Urban Data Science followed the successful completion of the first in 2018, which was itself an outcome of a collaboration that began at the CODATA-RDA Research Data Science Summer School in Trieste, Italy, in 2017. This year the Urban Data Science course was hosted by the Summer Winter School at CEPT University, Ahmedabad, India, from May 13 – May 23, 2019.

With the growing trend of data-driven solutions being used at the central level to make city operations more efficient and effective, the next generation of city planners will need to be as comfortable using advanced simulation algorithms as they are with design. This course helped to address the poor use of available open data in decision making, while keeping an urban focus. The summer school course was modified to get students started with the basic data science components in a short span of 10 days. This year the course had an additional 4 days, which helped in drawing more insightful results from the open data sets than last year's 6-day course. Open data sets allow for a deeper understanding of urban dynamics and their associated challenges, giving students firmer control over possible bias and so helping them analyse these challenges and propose solutions to overcome them.

The course this year was carefully modified based on feedback from students of the 2018 summer school, keeping in mind that the 24 new participants came from different backgrounds such as planning, architecture, civil engineering, geomatics and other disciplines, at both bachelor's and master's level, with IT and non-IT backgrounds. The curriculum covered the basics of Git and GitHub, where students got intense hands-on practical experience in using the software and learning how to open up their projects on GitHub. Moreover, OpenRefine, R and Excel were covered for data cleaning. The lessons on the basics of R were prepared using material from the Software Carpentry lessons Programming with R and R for Reproducible Scientific Analysis, and from the Geospatial Data workshop. The concepts were taken from various sources, and the lessons were redesigned to focus on urban problems and analysis.

The school began with students learning how to set up their study objectives, refine their research titles, develop a research theory and plan their study area. In doing so they kept in mind the type of data available from open data sets (continuous, discrete, ordinal or nominal) and the different stages of statistical analysis that can be conducted in order to produce the expected outcomes. Knowledge of research methodologies and the use of statistical software to support data analysis was one of the vital goals of the course. The statistical software package R was used, as it has become a very powerful and useful tool for data cleaning, management, statistical analysis and graphical data visualization. When mastered, it is user friendly and can reduce the time and effort required of researchers, students and professionals. The word cloud below shows the range of technologies the students explored during the summer school.

Urban Data Science Summer School 2019

Innovative teaching techniques, such as mixing theory and practicals with real-life examples, were used in this course, since the diverse group of students required special attention to keep the whole class at the same pace. Although the course was intense, running from 9:30am to 5:30pm, it was very motivating to see the students following the topics and keeping up with the pace of the instructors. To better understand the various levels of the 24 students, we conducted pre- and post-summer-school surveys, which gave us an idea of how far the school had changed the students' perspective, from programming in R to being confident in using Git and OpenRefine. As in last year's practice, daily feedback was taken from the students to inform the tutors' decisions on class activities. Continuous constructive comments from the students made this more effective, as the tutors were able to achieve the desired output by adapting the teaching method to the students' needs. This process of understanding the capability of the students was well appreciated and implemented.

Urban Data Science Summer School 2019 was well appreciated by the students, and the outcomes of the course were insightful and backed by statistical evidence. The topics selected by the students and their frequency are shown in the word cloud below. Urban planning and decision making depend on insights, and these insights are collected and analysed using open data sets in order to know how things are in our environment today, an approach this course promoted strongly. The role of Urban Data Science is to enhance urban planning and policy-making with more data-driven decisions, which are needed at this time. The students of this summer school came up with wonderful insights and results. It was a great pleasure to receive case study outputs on various topics such as crime, the economy, education, governance and planning, environment, public health, road accidents and sports.

Urban Data Science Summer School 2019

The project of Linda Reeba Koshy, a student of the summer school, was a case study on the Prevalence of Obesity among Socially Vulnerable Groups in the United States. Its results indicated that obesity is prevalent among ethnic and racial minorities, and that socioeconomic and racial factors influence obesity in children and the elderly. Persons from low-income households and with lower educational levels were also more likely to be obese, owing to poor dietary choices. A second study, by Kavina Mehta, analysed the performance of Indian states and union territories in terms of the Sustainable Development Goals (SDG) for the year 2018. It recommended that law enforcement and policy interventions, along with political willingness, should be the first steps towards advancing India's sustainable development targets. The study on Understanding the Pattern of Terrorist Attacks in India, by Pooja Toshniwal, concluded that a higher number of attacks occur in Jammu and Kashmir, using various types of weapons. This analysis helps in understanding the pattern of attacks, which could be used by defence forces to prevent future attacks. The study on the Contribution of Education in Development of countries across the world, by Surabhi Samant, shed light on some unexpected factors behind the literacy rate, which is significantly affected by child marriage, child labour and poverty. Government expenditure showed no significant impact, which suggests that it is not only about spending money but also about implementing the right mechanism. This would be contextual to every country and its economic status. The study also concluded that the literacy rate has a significant impact on the Human Development and Happiness Index of a country, and a moderate impact on Gross Domestic Product (GDP). In principle, education not only encourages economic growth but also assures quality of life and the overall development of a country. Many more interesting studies were carried out under this course.

In conclusion, the inclusion of Urban Data Science in the SWS curriculum has been invaluable, as it brought a marked improvement in the participants' learning, enhancing their data and spatial analytics skills through visualization and performance.