Natural language processing and network analysis provide novel insights on policy and scientific discourse around Sustainable Development Goals

The Sustainable Development Goals (SDGs) and their 169 constituent targets are intended to be overlapping and interdependent. To achieve policy coherence, that is, to establish efficient, harmonious strategies for addressing groups of SDGs rather than single goals, we must understand the complex network of relationships beneath the surface of the goals.

In our recent interdisciplinary Scientific Reports publication, we performed an analysis at the intersection of linguistics, computer science, network science, and sustainability science. We trained document-to-vector (doc2vec), a natural language processing (NLP) ‘embedding’ algorithm powered by a shallow neural network, to generate numeric representations of the United Nations’ (UN) policy discourse surrounding each of the goals, as described in the UN’s Progress Towards the Sustainable Development Goals reports. The correlation (more specifically, the cosine similarity) between these numeric representations provided an approximate measure of semantic similarity between the SDGs.
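To make the similarity step concrete, here is a minimal sketch of cosine similarity computed over a set of document vectors. It is not the code from the publication: the three-dimensional vectors below are made up for illustration (real doc2vec embeddings have many more dimensions and come from a trained model), and the function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_matrix(embeddings):
    """Pairwise cosine similarity for a dict mapping label -> vector."""
    labels = sorted(embeddings)
    return {
        (a, b): cosine_similarity(embeddings[a], embeddings[b])
        for a in labels
        for b in labels
    }

# Hypothetical toy vectors, NOT values from the study.
toy_embeddings = {
    "SDG 13": [0.9, 0.1, 0.2],
    "SDG 14": [0.8, 0.2, 0.1],
    "SDG 1": [0.1, 0.9, 0.3],
}

sims = similarity_matrix(toy_embeddings)
for pair in [("SDG 13", "SDG 14"), ("SDG 13", "SDG 1")]:
    print(pair, round(sims[pair], 3))
```

In this toy data the two vectors that point in nearly the same direction score close to 1, while the third scores much lower, which is the sense in which cosine similarity approximates semantic similarity between documents.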

Figure 1. Overlap in policy discourse between the sustainable development goals. The network in 1b showcases the discursive divide which exists between socioeconomic and environmental goals.

We observed high correlations between all SDGs, indicating an above-average level of integration. Consistent with its targets, many of which are broadly relevant to some or all of the other goals, SDG 17 Partnerships for the Goals was at the core of the UN’s policy discourse. Problematically, we discovered a clear discursive divide between the environmental goals (SDGs 12 Responsible Consumption and Production, 13 Climate Action, 14 Life Below Water, and 15 Life on Land), which focus predominantly on earth preconditions, the natural environment, planet, and biosphere, and the remaining socioeconomic goals. Finally, among the socioeconomic goals, we found three broad areas of focus: (1) macroeconomic and infrastructural development (SDGs 7 Affordable and Clean Energy, 8 Decent Work and Economic Growth, 9 Industry, Innovation, and Infrastructure, 10 Reduced Inequalities, and 17 Partnerships for the Goals); (2) infrastructural development of human settlements (SDGs 6 Clean Water and Sanitation and 11 Sustainable Cities and Communities); and (3) social and economic development (SDGs 1 No Poverty, 2 Zero Hunger, 3 Good Health and Well-Being, 4 Quality Education, 5 Gender Equality, and 16 Peace, Justice, and Strong Institutions).

We further analyzed collaboration among researchers studying a subset of the SDGs. We found misalignment between policy discourse (the goals as discussed by the UN in policy documents) and scientific collaboration (measured by co-authorship of peer-reviewed publications among authors studying one or more of the SDGs in the subset). This suggests that researchers are not necessarily interested in the same areas of SDG integration as the UN.

Ultimately, NLP methods can help identify (or construct) ‘zipper concepts’, conceptual bridges between the goals and their targets. Improving our understanding of the SDG interdependencies buried in scientific and policy discourse will help enhance policy coherence and may prove instrumental in getting the sustainable development goals back on track as we approach 2030.

By Thomas Smith, Circular Health Fellow

Smith, Thomas Bryan, Raffaele Vacca, Luca Mantegazza, & Ilaria Capua (2021) Natural language processing and network analysis provide novel insights on policy and scientific discourse around Sustainable Development Goals. Scientific Reports, 11:22427. DOI: 10.1038/s41598-021-01801-6
