Education Projections


Women’s education is associated with family planning preferences and behaviours and with maternal health outcomes. To capture this heterogeneity, we model each woman’s level of education and urban/rural location. We model three levels of education: low (less than primary), middle (completed primary but less than secondary), and high (completed secondary or higher).


Country-level UNESCO data on educational attainment for females aged 25 years or older were available for 178 countries from 1950 to 2018, for a total of 1,214 estimates.[1]

To supplement these estimates, and to estimate educational attainment by urban/rural residence, we analyzed DHS data (see DHS datasets).[2] Taking the complex survey design of each DHS survey into account, we estimated the proportion of women aged 25 years and older (n=2,871,974) with each level of educational attainment; the age restriction keeps the estimates consistent with the UNESCO data. We estimated educational attainment overall and by urban/rural location.

Combining the UNESCO and DHS data gave 1,508 observations for 186 countries.

| Education Level | DHS Levels (v149) | UNESCO Levels |
|---|---|---|
| Low (less than primary) | No education | No schooling |
| | Incomplete primary | Incomplete primary |
| Middle (less than secondary) | Complete primary | Primary (ISCED 1) |
| | Incomplete secondary | Lower secondary (ISCED 2) |
| High (completed secondary or higher) | Complete secondary | Upper secondary (ISCED 3) |
| | Higher | Post-secondary non-tertiary (ISCED 4) |
| | | Short-cycle tertiary (ISCED 5) |
| | | Bachelor’s or equivalent (ISCED 6) |
| | | Master’s or equivalent (ISCED 7) |
| | | Doctoral or equivalent (ISCED 8) |
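
As an illustration, the category mapping in the table above could be expressed as two lookup tables. This is a minimal sketch, not the authors' code: the dictionary names and the `harmonize` helper are hypothetical, and the keys are simply the category labels from the table (with the UNESCO ISCED categories abbreviated to their level number).

```python
# Hypothetical lookup tables built from the category labels in the table above.
DHS_TO_LEVEL = {
    "No education": "low",
    "Incomplete primary": "low",
    "Complete primary": "middle",
    "Incomplete secondary": "middle",
    "Complete secondary": "high",
    "Higher": "high",
}

UNESCO_TO_LEVEL = {
    "No schooling": "low",
    "Incomplete primary": "low",
    "ISCED 1": "middle",  # Primary
    "ISCED 2": "middle",  # Lower secondary
    "ISCED 3": "high",    # Upper secondary
    "ISCED 4": "high",    # Post-secondary non-tertiary
    "ISCED 5": "high",    # Short-cycle tertiary
    "ISCED 6": "high",    # Bachelor's or equivalent
    "ISCED 7": "high",    # Master's or equivalent
    "ISCED 8": "high",    # Doctoral or equivalent
}


def harmonize(source, category):
    """Return the model education level ('low', 'middle' or 'high')
    for a source-specific category label (hypothetical helper)."""
    table = DHS_TO_LEVEL if source == "DHS" else UNESCO_TO_LEVEL
    return table[category]
```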


To estimate trends in educational attainment over time we fit Bayesian hierarchical logistic regression models.

Separate models were fit for each level of educational attainment (i.e. low/middle/high) and the predictions from each model were re-normalized:

\[P(Ed^i)=\frac{\sigma(\beta_0^i+\beta_1^i\cdot year)}{\sum_{k=1}^3{\sigma(\beta_0^k+\beta_1^k\cdot year)}}\]

where \(i\) indexes the education level (low, middle, high) and \(\sigma\) is the logistic function: \[\sigma(x)=\frac{e^x}{1+e^x}\]
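
The re-normalization step can be sketched in a few lines of Python. This is an illustration only: the coefficient values are invented, the function names are hypothetical, and the scaling of the year covariate (for example, whether it is centered) is an assumption, since it is not specified here.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def education_probabilities(betas, year):
    """Re-normalize the per-level logistic predictions so they sum to 1.

    betas maps each level to its (beta0, beta1) pair; year is the
    covariate value on whatever scale the models were fit.
    """
    raw = {level: sigmoid(b0 + b1 * year) for level, (b0, b1) in betas.items()}
    total = sum(raw.values())
    return {level: p / total for level, p in raw.items()}


# Invented coefficients, with year expressed as an offset (assumption).
example_betas = {
    "low": (0.5, -0.04),
    "middle": (-0.8, 0.01),
    "high": (-1.5, 0.05),
}
probs = education_probabilities(example_betas, year=20)
```

Because each model is fit separately, the three raw predictions need not sum to one; dividing by their sum yields a valid probability distribution over the three levels.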

We obtained 1,000 posterior samples from each model. Sampling was performed using Stan (version 2.19.2) with 4 chains, 1,000 warmup iterations, step size of 0.8, and maximum tree depth of 20.

We evaluated model fit using posterior predictive checks (i.e. comparing the model's posterior predictions to the observed estimates) for estimates from 1990 onwards. These checks yielded a mean absolute error of 0.044 and a coverage probability (i.e. the proportion of observed values that fell within the model's 95% uncertainty intervals [UIs]) of 80.65%. See Country Profiles for country-specific education projections.
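
For reference, the two summary statistics used in these checks can be computed as in the sketch below. The function names and the toy numbers are hypothetical and are not the data behind the reported values.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted proportions."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def coverage_probability(observed, lower, upper):
    """Share of observed values lying inside their [lower, upper] interval."""
    hits = sum(lo <= o <= hi for o, lo, hi in zip(observed, lower, upper))
    return hits / len(observed)


# Toy values for illustration only.
observed = [0.20, 0.50, 0.70, 0.90]
predicted = [0.25, 0.45, 0.70, 0.80]
lower_95 = [0.10, 0.40, 0.75, 0.70]
upper_95 = [0.30, 0.60, 0.90, 0.95]

mae = mean_absolute_error(observed, predicted)
cov = coverage_probability(observed, lower_95, upper_95)
```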

In each iteration of the simulation model we sample a parameter set from the education posteriors.
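
A minimal sketch of this sampling step is shown below. It assumes that each level's posterior is stored as a list of (beta0, beta1) draws and that one draw is picked independently for each level; both the storage format and the function name are hypothetical.

```python
import random


def sample_parameter_set(posterior_draws, rng):
    """Pick one posterior draw (beta0, beta1) for each education level."""
    return {level: rng.choice(draws) for level, draws in posterior_draws.items()}


# Toy stand-in for the posterior samples of each model (values are invented).
toy_draws = {
    "low": [(0.50, -0.040), (0.55, -0.042), (0.45, -0.038)],
    "middle": [(-0.80, 0.010), (-0.75, 0.012), (-0.85, 0.009)],
    "high": [(-1.50, 0.050), (-1.45, 0.048), (-1.55, 0.052)],
}

rng = random.Random(2022)  # seeded so the illustration is reproducible
params = sample_parameter_set(toy_draws, rng)
```

Drawing a fresh parameter set in every iteration propagates the uncertainty in the education projections through to the simulation outputs.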

Due to a lack of data by urban/rural residence, we used the overall country estimates for both urban and rural women in high-income countries, and in the following countries with sparse data: Algeria, Belarus, Belize, Bosnia and Herzegovina, Botswana, Bulgaria, China, Costa Rica, Cuba, North Korea, Dominica, Fiji, Grenada, Iran, Iraq, Jamaica, Kiribati, Lebanon, Libya, Malaysia, Marshall Islands, Mexico, Micronesia, Mongolia, Montenegro, Nauru, North Macedonia, Romania, Russia, Saint Lucia, Saint Vincent and the Grenadines, Samoa, Serbia, Solomon Islands, Thailand, Tonga, Turkmenistan, Tuvalu, Vanuatu, Venezuela.

For the Dominican Republic, Ecuador, Equatorial Guinea, and Mauritius, the urban projections were unstable, so we used the overall country trends for urban women in these countries.
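
The fallback rules described above could be encoded roughly as follows. This is a sketch under stated assumptions: the function name and the boolean flags are hypothetical, and the caller is assumed to know whether a country is high-income or on the sparse-data list.

```python
# Countries whose urban projections were unstable (from the text above).
URBAN_UNSTABLE = {"Dominican Republic", "Ecuador", "Equatorial Guinea", "Mauritius"}


def projection_source(country, residence, high_income=False, sparse_data=False):
    """Return which projection series to use: 'overall', 'urban' or 'rural'.

    residence is 'urban' or 'rural'; the flags are supplied by the caller.
    """
    if high_income or sparse_data:
        return "overall"  # overall estimates used for both urban and rural women
    if residence == "urban" and country in URBAN_UNSTABLE:
        return "overall"  # unstable urban projection replaced by the overall trend
    return residence
```

For example, `projection_source("Ecuador", "urban")` falls back to the overall series, while `projection_source("Ecuador", "rural")` keeps the rural one.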


  1. UNESCO Institute for Statistics (UIS). Share of population by educational attainment, population 25 years and older. February 2019 Release. http://uis.unesco.org/sites/default/files/datacentre/Educational_attainment_-_Niveau_d'%C3%A9ducation_atteint.xlsx
  2. ICF. Demographic and Health Surveys (various) [Datasets]. Funded by USAID. Rockville, Maryland: ICF [Distributor]. https://dhsprogram.com/

GMatH (Global Maternal Health) Model - Last updated: 28 November 2022

© Copyright 2020-2022 Zachary J. Ward