This post was written by Shaily Gandhi, who is currently pursuing a PhD in Geomatics at CEPT University, India. Shaily recently attended the CODATA-RDA School of Research Data Science, hosted at ICTP, near Trieste, Italy. Her participation was kindly supported by ICTP and Nature Publishing, via CODATA.
The CODATA-RDA School of Research Data Science was a great opportunity for me to work with around 45 students from 29 countries (mostly low- and middle-income countries) and from varied educational backgrounds. Such summer schools or short courses can be the best platforms for learning innovative ways of teaching, as well as for understanding the work done by different people in the same area. The summer school introduced me to various aspects of data science through intensive hands-on training: it has given me the confidence to start working with concepts which I had only read about in books. Now I will be able to implement machine learning and artificial neural networks in my PhD study in Geomatics for developing predictive models.
The school uses the Software Carpentry / Data Carpentry approach of having the students provide daily feedback on pink or green stickers (which signify XXXX). This made each of us feel that our opinions count. I am very thankful to the organizers, who were on their toes and worked long hours to make the summer school run smoothly. Working closely with leading academics in the field of data science was one of the most wonderful experiences for me: it not only taught me a great deal but also helped improve my teaching skills. I observed many small things in their teaching which I would like to implement in my own teaching in the coming semester.
One of the things which caught my eye on the very first day was the use of the pink and green stickers to indicate whether you are comfortable with the practical or need help. I will definitely use this in my teaching, because practicals become very difficult to handle with a large class, and if everyone is waving or calling out, the environment becomes very noisy.
Apart from the technical learning, there was a wonderful experience of cultural exchange. One of the most interesting topics which I discussed with Gail Clement from the California Institute of Technology (who introduced us to Author Carpentry) was the loss of academic identity that can be experienced by women who change their name after getting married (and in some countries this change of name is obligatory). She explained that, according to the research, men’s research works are cited more than women’s: there are many reasons for this, and the loss of identity can contribute, as computer search mechanisms and bibliographic tools do not necessarily link the works of women prior to and subsequent to a name change. This is one of the important reasons for a recognised and standardised researcher ID system: for women who have changed their names, having an ORCID account will help keep all their academic work associated with one single researcher ID number. Gail also suggested that it would be better if female researchers could retain both last names, which could “help you built your identity and reputation in the professional world”. Many more interesting discussions regarding the lack of credit for work were also brought up. Few institutions include the people doing data analysis as co-authors of the publication: Gail suggested that standard criteria should be developed and implemented, such that all contributors (including data analysts and data stewards) are credited and the credit for your contribution stays with you.
I had a great learning experience working in groups with people from different countries. Throughout the school, we worked in different groups with different people, which gave us a lot of exposure to the varied situation of data science in different countries. In one project we all worked on the same file using Git, and in the second project we coded a neural network model in Python.
The Bring Your Own Data session offered good suggestions for the problems I had with my data, and other students at the summer school working in the same area helped clear up my confusion. I learned a lot about statistical analysis from other students, including Felix Anyiam (Data Analyst, University of Port Harcourt Teaching Hospital (UPTH)) and Ola Karra (Lecturer, Department of Statistics, University of Khartoum).
This summer school gave us first-hand experience of many languages and command line interfaces: topics included DOS, R, Shell, GitHub, visualisation of data in the most beautiful ways, machine learning, artificial neural networks, other machine learning systems and recommender systems.
Working with GitHub was an excellent experience. I had been using Google Drive to work on shared presentations, but Git looks pretty cool and I would like to use it in my future work to share data and work in a shared environment.
It was great working on the research computation infrastructure, with all the participants working on different systems and learning how to submit a job and have it run on external resources. We were taught how to get access to supercomputers in different geographical locations: this enables researchers to keep going, as it allows you to work from any part of the world. Resources to run the processes can be allotted from different locations.
Finally, we also got a good insight into research data management, referencing systems and wonderful tips for publishing and licensing work.
Map of student participants:
I am very thankful to ICTP for accepting my application and supporting my stay in Trieste. I am very grateful to Nature Publishing, via CODATA, for funding my travel, which gave me the opportunity to attend this summer school on big data science.