IUSSP (the International Union for the Scientific Study of Population) and CODATA have established a joint working group on FAIR Vocabularies for Population Research. The joint Working Group is co-chaired by George Alter (University of Michigan) on behalf of IUSSP, and by Arofan Gregory (DDI Alliance and CODATA) and Steven McEachern (Australian National University and DDI Alliance) on behalf of CODATA.
The joint Working Group responds to the growing movement to make data “Findable, Accessible, Interoperable, and Reusable” (FAIR). Population research is an empirically focussed field with a long tradition of widely shared, easily accessible data collections. The FAIR Principles point to ways that this tradition can be enhanced by taking advantage of emerging standards and technologies. The joint Working Group will focus on the development of FAIR Vocabularies for population data, which is an essential step in making data reusable and interoperable.
FAIR vocabularies yield benefits when data from different sources must be combined. Consider the most basic variable in demographic analysis: age. The OECD has a list of 643 age categories, while the UN Population Division copes with more than 1100 age groups. If the meanings of variables in a dataset are available only through human-readable documentation, such as a PDF, harmonizing data from two providers will remain a tedious manual process. However, if the age categories are linked to persistent identifiers in machine-actionable metadata, software can be written to harmonize age groupings automatically. If these operations are performed across dozens of variables in hundreds of data sources, enormous amounts of human time will be saved.
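To make the idea concrete, the following is a minimal Python sketch of how software might harmonize age groups once each provider's local codes are linked to shared persistent identifiers. Everything in it is an illustrative assumption: the `example.org` identifiers, the provider code lists, the counts, and the `harmonize` function are invented for this example and do not describe any real OECD or UN vocabulary.

```python
from collections import defaultdict

# Hypothetical persistent identifiers for age groups, each with the age range
# it denotes. The example.org URIs are placeholders, not real vocabulary entries.
BASE = "https://example.org/age/"
AGE_CONCEPTS = {
    BASE + "Y0T4": (0, 4),
    BASE + "Y5T9": (5, 9),
    BASE + "Y0T9": (0, 9),
    BASE + "Y10T14": (10, 14),
}

# Invented local codes used by two data providers, linked to the shared identifiers.
PROVIDER_A = {"0-4": BASE + "Y0T4", "5-9": BASE + "Y5T9", "10-14": BASE + "Y10T14"}
PROVIDER_B = {"AGE_00_09": BASE + "Y0T9", "AGE_10_14": BASE + "Y10T14"}


def harmonize(rows, code_to_uri, target_uris):
    """Sum (local_code, count) rows into the target age groups.

    A source group is assigned to the first target group that fully contains it.
    """
    totals = defaultdict(int)
    for code, count in rows:
        lo, hi = AGE_CONCEPTS[code_to_uri[code]]
        for target in target_uris:
            t_lo, t_hi = AGE_CONCEPTS[target]
            if t_lo <= lo and hi <= t_hi:
                totals[target] += count
                break
        else:
            # No target group contains this source group; refuse to guess.
            raise ValueError(f"{code!r} does not fit inside any target age group")
    return dict(totals)


# Both providers are brought to the same two target groups (counts are made up).
TARGETS = [BASE + "Y0T9", BASE + "Y10T14"]
from_a = harmonize([("0-4", 100), ("5-9", 120), ("10-14", 90)], PROVIDER_A, TARGETS)
from_b = harmonize([("AGE_00_09", 230), ("AGE_10_14", 95)], PROVIDER_B, TARGETS)
```

After this step both results are keyed by the same identifiers, so they can be compared or merged directly. The sketch only aggregates finer groups into coarser ones; splitting a coarse group into finer ones would require additional assumptions about the age distribution.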
The joint IUSSP-CODATA Working Group will build upon the work of the FAIR Vocabularies Group, which recently released “Ten Simple Rules for making a vocabulary FAIR”. Most of its guidance is straightforward, like “Determine the governance arrangements and custodian responsible for the legacy vocabulary.” But some steps require specialized expertise in standards like the Simple Knowledge Organization System (SKOS) or the Web Ontology Language (OWL). FAIR vocabularies will also need to be maintained, which requires sustainable institutions with the capacity to maintain the necessary technologies. The joint Working Group will be advised by members of the FAIR Vocabularies Group, which is chaired by Simon Cox (CSIRO Australia, and CODATA Executive Committee). Experts from other scientific domains will be invited to evaluate alternative strategies (e.g. centralized versus federated) and software.
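For readers unfamiliar with SKOS, the short Python sketch below shows roughly what a single age category could look like when published as a SKOS concept in Turtle syntax. The SKOS namespace and the properties `skos:Concept`, `skos:ConceptScheme`, `skos:inScheme`, `skos:notation`, and `skos:prefLabel` come from the W3C standard; the `example.org` identifiers, the code `Y0T4`, the label, and the helper `concept_to_turtle` are hypothetical and chosen only for illustration.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def concept_to_turtle(uri, scheme_uri, notation, label, lang="en"):
    """Serialize one concept as a SKOS Turtle fragment.

    Assumes notation and label contain no quotes or backslashes;
    a real serializer would escape them.
    """
    return "\n".join([
        f"<{uri}> a skos:Concept ;",
        f"    skos:inScheme <{scheme_uri}> ;",
        f'    skos:notation "{notation}" ;',
        f'    skos:prefLabel "{label}"@{lang} .',
    ])


# Hypothetical concept scheme for age groups; not a real published vocabulary.
SCHEME = "https://example.org/age"
doc = "\n\n".join([
    SKOS_PREFIX,
    f"<{SCHEME}> a skos:ConceptScheme .",
    concept_to_turtle(f"{SCHEME}/Y0T4", SCHEME, "Y0T4", "0 to 4 years"),
])
print(doc)
```

In practice a vocabulary would more likely be produced with an RDF library or a dedicated vocabulary editor than with string formatting; the point here is only that each category gets its own identifier, a machine-readable code, and a human-readable label.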
The operational goal will be to work with three to five partners in international organizations and academia to bring their existing vocabularies into line with the FAIR principles. The joint Working Group will give special attention to coordinating with existing initiatives, like the terminology repository supported by Statistical Data and Metadata eXchange (SDMX).
The ultimate goal of this initiative is to make demographic data more interoperable by publishing controlled vocabularies that can be found and acted upon by software. Doing so has the potential to vastly reduce the cost of merging data from multiple sources for researchers who use population data. Along the way, the joint Working Group will learn where additional technical development is needed and where community involvement through IUSSP and other organizations is beneficial. A two-year work plan is envisioned.
Colleagues interested in learning more about this new initiative or participating in the work of this joint Working Group should contact George Alter (FAIRvocab@iussp.org).