My Vision for CODATA
I very much support the three major strategic programs put forward in CODATA’s Strategic Plan 2013 – 2018, namely:
- Data Principles and Practice
- Frontiers of Data Science
- Capacity Building
However, given the promising developments of the last five years, it is now time to develop a third strategic plan covering the next five years of the CODATA organization. Development of this new strategic plan must be a major priority for CODATA, and it will be important to reach out to all the relevant national and international stakeholder organizations for their input. In addition to CODATA’s traditional stakeholders, I would also like to learn from the experience of other major efforts in this space. For example, from the US, this could include input from the NIH’s National Library of Medicine, the DOE’s OSTI organization and the NSF’s DataONE project. From Europe, it could draw on the considerable activity around implementing the European Open Science Cloud (EOSC). I would also look for input from other major data science initiatives in Asia and Australia.
In addition to developing detailed plans and deliverables for the three broad CODATA priority areas for the next five years, I would like to give my support to two other areas. The first is Africa. During my career in data-intensive science – in the UK with e-Science and in my work with Microsoft Research in the US – I have worked closely with universities and funding agencies in Europe, North and South America, Asia and Australia. I now think it is important to dedicate more attention to Africa, where I think CODATA can play a significant role. I am therefore personally very supportive of the existing CODATA initiative to develop an African Open Science Platform and would look for ways to extend this initiative and increase its impact. One way to do this is to harness CODATA’s global reach and influence, which can successfully bring together countries at many different levels of economic development. The international SKA project will also generate many interesting computing, data science and networking challenges in Africa.
The second focus I would like to develop is related to my present role as leader of the Scientific Machine Learning research group at RAL. There is now much activity worldwide in the application of the latest advances in AI and Machine Learning technologies to scientific data. This is one of the few areas where the academic research community has large and complex data sets that can compete with the ‘Big Data’ available to industry. Extracting new scientific insights from these data sets will require the use of advanced statistical techniques, including Bayesian methods and ‘deep learning’ technologies. In addition, an extensive education program to train researchers in the application of these data analytic technologies will be necessary and can build upon practical experience in applying such methods to ‘Big Scientific Data.’ In this way CODATA can help train a new generation of data analysts who are able not only to generate new insights from scientific data but also to spur innovation with industry and aid economic development.
While at Microsoft Research, I was a founding Board member of the RDA organization. As an RDA Board member, I liaised extensively with both the NSF in the USA and the Commission in Europe, and assisted in facilitating the constructive cooperation of RDA with CODATA. I will therefore bring extensive management experience to the leadership of CODATA – from my experience in the university sector as a research group leader, department chair and dean of engineering, in UK research funding councils as a program director and chief data scientist, and in industry as manager of a globally distributed outreach team. I am disappointed to see the absence of many European countries from the CODATA membership and, drawing on my experience in European research projects, I would seek to encourage these missing nations to become members of the organization. In addition, in my role at Microsoft Research, I spent considerable time visiting universities and funding agencies in Central and South America, and in Asia. I believe there is considerable potential to interest non-member countries in these regions in the relevance of the data science agenda of CODATA. Finally, although I will certainly bring my vision, enthusiasm and energy to the role of CODATA President, I believe that we must harness the energy and enthusiasm of the entire CODATA community to take the organization forward to a new level of influence and effectiveness.