This section provides a detailed account of the data, modelling framework, and analytical procedures used in our study. We begin by describing the source studies and inclusion criteria for the individual-level data. We then outline the inferential framework, including how we adjusted for anaemia-related exclusion bias (full code for which is available at https://patrickgtwalker.github.io/Anaemia_malaria_fitting_open/vignette/malaria_hb_vignette.html). This is followed by descriptions of the Hb dynamics model, the model of malaria infection in pregnancy (full code for which is available at https://github.com/patrickgtwalker/malaria_in_pregnancy_istp_model_open/releases/tag/v1.0-istp_paper) and their integration. We conclude with the estimation of IPTp-SP effectiveness and the procedure for generating continent-wide estimates of malaria-attributable anaemia burden and intervention impact.
Malaria and anaemia data
For our analysis, we included studies that fulfilled the following criteria: (1) women received no recorded intermittent preventive treatment during pregnancy at the time of sampling; (2) P. falciparum infection status according to PCR and Hb concentration were recorded at the same point during gestation; (3) women were recruited throughout the first and second trimester.
We analysed data from four trials that compared intermittent preventive treatment in pregnancy with sulfadoxine–pyrimethamine (IPTp-SP) with alternative strategies, including IPTp with dihydroartemisinin–piperaquine, with or without azithromycin, or intermittent screening and treatment (IST) with dihydroartemisinin–piperaquine or artemether–lumefantrine. These were a trial conducted in southern Malawi between 2011 and 2013 (Malawi IST)24, a trial in western Kenya between 2012 and 2014 (Kenya IST)25, a multi-country trial conducted in Burkina Faso, Mali, The Gambia and Ghana between 2010 and 2011 (West Africa trial)27, and the IMPROVE-1 trial conducted between 2018 and 2019 across Kenya, Malawi and Tanzania (IMPROVE-1)26. In all four trials, women had not yet received IPTp at the time of enrolment.
Most trials recruited women regardless of Hb concentration and gravidity, with Hb measured using a point-of-care haemoglobinometer on capillary blood samples. However, two trials (Malawi IST and Kenya IST) excluded women with severe anaemia (Hb below the trial exclusion threshold), while the West Africa trial recruited only primigravidae and secundigravidae (G1 or G2). Across the four studies, individual-level data were available for 1,803 women from the Malawi IST, 1,396 from the Kenya IST, 5,208 from the West Africa trial (including 1,413 from Burkina Faso, 1,193 from The Gambia, 1,298 from Ghana and 1,304 from Mali) and 4,201 from IMPROVE-1 (1,297 from Kenya, 1,283 from Malawi and 1,621 from Tanzania).
In total, we analysed data from 12,608 women across 7 countries—Burkina Faso, The Gambia, Ghana, Kenya, Malawi, Mali and Tanzania. These included individual-level records on Hb concentration, PCR-based malaria status, gravidity and gestational age at enrolment. We used gestational age at enrolment as a proxy for the length of time a malaria infection remained untreated, allowing us to characterize Hb dynamics across the first and second trimesters of pregnancy.
Disentangling age and gravidity
Given the close correlation between maternal age and gravidity, we first fitted mixed-effects regressions to assess whether each independently modified the effect of malaria on Hb. We adjusted for gestational age using a natural cubic spline and included a random intercept for study site to account for between-site heterogeneity. Three nested specifications were evaluated: (1) gravidity-specific malaria effects; (2) gravidity-specific effects plus a global interaction between malaria and age; and (3) a three-way interaction between malaria, age and gravidity. Model selection was guided by the Akaike information criterion, and the strength of evidence for improved fit attributable to age was assessed using likelihood ratio tests. To visualize patterns transparently, we also ran stratum-specific contrasts using estimated marginal means. Finally, we compared gravidity-specific coefficients from models with and without adjustment for age, confirming that the simpler specification adopted in our inferential framework (malaria-by-gravidity interaction only) did not materially alter gravidity-specific results.
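As an illustration of the comparison step only, the following Python sketch computes the Akaike information criterion and a likelihood ratio test for nested models. It is not the code used in the study: the log-likelihoods and parameter counts are hypothetical placeholders, and the helper names (`aic`, `chi2_sf`, `likelihood_ratio_test`) are introduced here for the example.

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion; lower values indicate better trade-off."""
    return 2 * n_params - 2 * log_lik

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Closed form for even df: exp(-y) * sum_{i < df/2} y^i / i!
        return math.exp(-y) * sum(y ** i / math.factorial(i) for i in range(df // 2))
    # Odd df: start from df = 1 (erfc) and add the recurrence terms.
    tail = math.erfc(math.sqrt(y))
    for i in range(1, (df - 1) // 2 + 1):
        tail += math.exp(-y) * y ** (i - 0.5) / math.gamma(i + 0.5)
    return tail

def likelihood_ratio_test(log_lik_simple, log_lik_complex, extra_params):
    """Return the test statistic and P value for two nested models."""
    stat = 2.0 * (log_lik_complex - log_lik_simple)
    return stat, chi2_sf(stat, extra_params)

# Hypothetical (log-likelihood, number of parameters) for the three specifications.
fits = {
    "gravidity-specific malaria effects": (-15210.4, 9),
    "plus malaria x age interaction": (-15209.8, 10),
    "three-way interaction": (-15208.9, 12),
}
for name, (log_lik, k) in fits.items():
    print(name, round(aic(log_lik, k), 1))

stat, p_value = likelihood_ratio_test(-15210.4, -15209.8, extra_params=1)
print("LRT statistic and P value for adding age:", round(stat, 2), round(p_value, 3))
```

The chi-square reference distribution assumes the usual regularity conditions for nested models fitted by maximum likelihood.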
Inferential framework
We developed a hierarchical Bayesian model to characterize Hb concentrations across pregnancy, accounting for malaria infection, gestational age, gravidity and exposure history. Our inferential framework was structured to handle two key scenarios: studies with unrestricted enrolment and studies where women with severe anaemia (Hb below the exclusion threshold) were excluded at baseline.
Studies where women were recruited at enrolment without anaemia-specific exclusion criteria
For studies without anaemia-based exclusion, we modelled the joint likelihood of observed Hb concentrations given malaria status (as measured using PCR), gestational age and gravidity. Let \({H}_{s}=\left\{{h}_{{si}};\,i\in 1,\ldots ,{n}_{s}\right\}\) denote the set of Hb values for study \(s\), with \({M}_{s}\), \({T}_{s}\) and \({G}_{s}\) representing malaria infection status, gestational age and gravidity, respectively. The parameters \(\theta\) define the Hb trajectory model and \(\varTheta\) represents parameters from our existing malaria-in-pregnancy transmission model.
We specified the posterior as:
$$P\left(\theta |{H}_{s},{M}_{s},{T}_{s},{G}_{s},\varTheta \right)\,\propto \,P({H}_{s}|{M}_{s},{T}_{s},{G}_{s},\varTheta ,\theta )\,P(\theta )$$
(1)
To account for prior exposure, we incorporated an integrated likelihood:
$$P\left({H}_{s}|{M}_{s},{T}_{s},{G}_{s},{y}_{{sg}},{n}_{{sg}},\Theta ,\theta ,{\pi }_{s1}\right)=\mathop{\prod }\limits_{i=1}^{{n}_{s}}f({h}_{{si}},{m}_{{si}},{t}_{{si}},{g}_{{si}},\Theta ,\theta ,{\pi }_{s1})$$
(2)
where
$$\begin{array}{l}f({h}_{{si}},{m}_{{si}},{t}_{{si}},{g}_{{si}},\Theta ,\theta ,{\pi }_{s1})\\ =\displaystyle \int P({h}_{{si}}|{m}_{{si}},{t}_{{si}},{g}_{{si}},{e}_{{si}},\theta )\,P({e}_{{si}}|{\pi }_{s1},{g}_{{si}},\Theta )\,d{e}_{{si}}\,P({\pi }_{s1}|{y}_{s1},{n}_{s1}).\end{array}$$
(3)
\(P\left({h}_{{si}}|{m}_{{si}},{t}_{{si}},{g}_{{si}},{e}_{{si}},\theta \right)\) represents the model linking malaria exposure in previous pregnancies, together with the other parameters to be estimated, to Hb concentration at enrolment (see ‘Hb dynamics model’). \(P\left({e}_{{si}}|{\pi }_{s1},{g}_{{si}},\Theta \right)\) represents the exposure to malaria in previous pregnancies given primigravid prevalence within a setting and individual-level gravidity, uncertainty in which is integrated over (see ‘Malaria in pregnancy model’). \(P\left({\pi }_{s1}|{y}_{s1},{n}_{s1}\right)\) represents a binomially distributed likelihood of primigravid prevalence given the observed infection status of primigravids within the study. This accounts for uncertainty in prior malaria exposure \({e}_{{si}}\) as a function of observed primigravid PCR prevalence \({\pi }_{s1}\).
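To make the marginalization in equation (3) concrete, the Python sketch below sums a Normal Hb likelihood over a discrete distribution of prior exposure, conditioning on a fixed primigravid prevalence. It is a minimal sketch under stated assumptions: the exposure probabilities, the mean function `example_mean` and all numerical values are hypothetical, and in the study the exposure distribution comes from the malaria in pregnancy model rather than from a hand-written table.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def integrated_likelihood(h, mean_given_exposure, exposure_probs, sd):
    """Sum over e of P(h | e) * P(e): the discrete analogue of the integral."""
    if abs(sum(exposure_probs.values()) - 1.0) > 1e-9:
        raise ValueError("exposure probabilities must sum to 1")
    return sum(
        prob * normal_pdf(h, mean_given_exposure(e), sd)
        for e, prob in exposure_probs.items()
    )

def example_mean(e):
    """Hypothetical expected Hb for an infected woman with e prior exposed pregnancies."""
    return 100.0 + 3.0 * e

# Hypothetical distribution of prior exposure for a woman in her third pregnancy.
exposure_probs = {0: 0.2, 1: 0.5, 2: 0.3}
print(integrated_likelihood(105.0, example_mean, exposure_probs, sd=12.0))
```

Because the result is a mixture of the exposure-specific densities, it always lies between the smallest and largest of them.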
Studies with anaemia-based exclusion
To adjust for the exclusion of women with Hb below the threshold, we extended the model to include latent censoring and marginalized over the unobserved distribution of excluded women. Let \(Z\) be the Hb threshold and \({c}_{s}\) the number of excluded women. We introduced the following additional parameters: \({\varPi }_{s}=\{{\pi }_{{sg}};g=1\ldots {n}_{g}\}\), the set of prevalences according to gravidity category before censoring (modelled as log-odds \({o}_{{sg}}\)); \({\varOmega }_{s}=\{{\sigma }_{{sg}};g=1\ldots {n}_{g}\}\), the proportion of women in each gravidity category before censoring (modelled as multipliers \({{RR}}_{{sg}}\) relative to the proportion of primigravid women); and \({\varLambda }_{s}\), the mean and s.d. of gestational age at enrolment before censoring (modelled according to a Beta distribution scaled between an assumed minimum and maximum enrolment time of 80 and 200 days and discretized into integer days).
The adjusted posterior becomes:
$$\begin{array}{l}P\left(\theta ,{\varPi }_{s},{\varOmega }_{s},{\varLambda }_{s}|{H}_{s},{M}_{s},{T}_{s},{G}_{s},{c}_{s},Z,\varTheta \right)\\ \propto {P}^{* }\left({H}_{s}|{M}_{s},{T}_{s},{G}_{s},Z,\varTheta ,\theta \right)P\left({c}_{s}|{p}_{c},{n}_{s}\right)\\ \times P\left({p}_{c}|Z,\theta ,\varTheta ,{\varPi }_{s},{\varOmega }_{s},{\varLambda }_{s}\right)P\left({M}_{s},{T}_{s},{G}_{s}|Z,\theta ,\varTheta ,{\varPi }_{s},{\varOmega }_{s},{\varLambda }_{s}\right)P\left(\theta ,{\varPi }_{s},{\varOmega }_{s},{\varLambda }_{s}\right)\end{array}$$
(4)
where \({P}^{* }(.)\) denotes likelihoods conditioned on non-censoring:
$${P}^{* }\left({h}_{{si}}|\ldots \right)\propto \frac{f\left({h}_{{si}},{m}_{{si}},{t}_{{si}},{g}_{{si}},\varTheta ,\theta ,{\pi }_{s1}\right)}{{\int }_{h=Z}^{\infty }\,f\left(h,{m}_{{si}},{t}_{{si}},{g}_{{si}},\varTheta ,\theta ,{\pi }_{s1}\right){dh}}$$
(5)
with \(f(h,{m}_{{si}},{t}_{{si}},{g}_{{si}},\varTheta ,\theta ,{\pi }_{s1})\) representing the uncensored likelihood from equation (3).
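The effect of conditioning on non-censoring can be sketched in Python for the simplest case, in which the uncensored likelihood is a single Normal density rather than the integrated form of equation (3). This is an assumption made for illustration, as are the numerical values; the sketch rescales the density of enrolled women by the probability of having Hb at or above \(Z\).

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a Normal(mean, sd) distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def truncated_density(h, mean, sd, z_threshold):
    """Density of Hb among enrolled women, i.e. conditional on h >= z_threshold."""
    if h < z_threshold:
        return 0.0  # women below the threshold are never observed
    p_enrolled = 1.0 - normal_cdf(z_threshold, mean, sd)
    return normal_pdf(h, mean, sd) / p_enrolled

# Hypothetical values: expected Hb 105, s.d. 15, exclusion threshold 70.
print(truncated_density(90.0, mean=105.0, sd=15.0, z_threshold=70.0))
```

Ignoring the truncation would treat the shortage of low Hb values as evidence of a higher mean, which is the bias the adjusted likelihood is designed to remove.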
The probability of exclusion is:
$$P\left({c}_{s}|{p}_{c},{n}_{s}\right)\propto {p}_{c}^{{c}_{s}}{(1-{p}_{c})}^{{n}_{s}}$$
(6)
with \({p}_{c}\) marginalized over malaria status, gravidity and gestational age.
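A minimal Python sketch of this step is given below. It assumes, for illustration, that each stratum (a combination of malaria status, gravidity and gestational age) has a Normal Hb distribution with a common s.d.; the stratum weights, means and counts are hypothetical.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a Normal(mean, sd) distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def exclusion_probability(strata, sd, z_threshold):
    """Marginal probability of Hb below the threshold.

    strata is a list of (weight, mean_hb) pairs whose weights sum to 1.
    """
    return sum(weight * normal_cdf(z_threshold, mean, sd) for weight, mean in strata)

def exclusion_log_kernel(c_excluded, n_enrolled, p_c):
    """Logarithm of the binomial kernel p_c^c * (1 - p_c)^n from equation (6)."""
    return c_excluded * math.log(p_c) + n_enrolled * math.log(1.0 - p_c)

# Hypothetical strata: (share of screened women, expected Hb).
strata = [(0.3, 98.0), (0.7, 108.0)]
p_c = exclusion_probability(strata, sd=15.0, z_threshold=70.0)
print(p_c, exclusion_log_kernel(c_excluded=25, n_enrolled=1800, p_c=p_c))
```

The kernel is largest when \({p}_{c}\) equals the observed fraction excluded, so the count of excluded women constrains the lower tail of the fitted Hb distribution.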
This framework allows us to recover the likely distribution of Hb in the underlying population, appropriately corrected for bias introduced by the exclusion criteria.
Hb dynamics model
We evaluated a range of models to describe Hb concentration across pregnancy, incorporating different assumptions regarding the effect of malaria infection, gravidity and gestational timing. Models were compared using the deviance information criterion, and the final model was selected based on goodness of fit and biological plausibility.
We modelled individual-level Hb values as following a normal distribution:
$${h}_{{si}} \sim N(\,{\mu }_{{si}},\sigma )$$
(7)
where \({\mu }_{{si}}\) is the expected Hb for individual \(i\) in study \(s\), and \(\sigma\) is the s.d., assumed constant across individuals. The mean \({\mu }_{{si}}\) is defined as:
$${\mu }_{{si}}=\alpha \left({t}_{{si}}\right)+{m}_{{si}}\beta \left({t}_{{si}}\right)\phi \left({e}_{{si}}\right)+{\gamma }_{{g}_{{si}}}+{\delta }_{s}$$
(8)
In this formulation, \(\alpha \left(t\right)\) is a cubic spline function representing the typical trajectory of Hb across gestation in uninfected primigravidae. The function \(\beta \left(t\right)\) captures the time-varying effect of malaria infection on Hb. The binary variable \({m}_{{si}}\) indicates whether individual \(i\) was infected with malaria at enrolment. The effect of prior malaria exposure is incorporated via the function \(\phi (e)\), which modifies the effect of malaria infection as a function of \({e}_{{si}}\), the number of prior pregnancies in which the woman was likely exposed to malaria. The term \({\gamma }_{{g}_{{si}}}\) represents a gravidity-specific adjustment to the Hb intercept, with \({\gamma }_{1}=0\) for primigravidae. A study-specific intercept \({\delta }_{s}\) accounts for site-level differences in Hb measurements.
The exposure adjustment function \(\phi (e)\) was defined as a power-law function:
$$\phi \left(e\right)=\left(1+{\left(e/\nu \right)}^{\kappa }\right)$$
(9)
where ν and κ are the shape parameters estimated from the data.
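Equations (8) and (9) can be written as a short Python function. The sketch below is illustrative only: `example_alpha` and `example_beta` are hypothetical stand-ins for the fitted spline and time-varying malaria effect, and the values of `GAMMA`, `nu` and `kappa` are placeholders rather than estimates from the study.

```python
def phi(e, nu, kappa):
    """Exposure adjustment, following equation (9) as written."""
    return 1.0 + (e / nu) ** kappa

def expected_hb(t, infected, exposures, gravidity, alpha, beta, gamma, delta_s, nu, kappa):
    """Expected Hb from equation (8).

    alpha and beta are functions of gestational age t; gamma maps gravidity
    category to an intercept shift; delta_s is the study-specific intercept.
    """
    return (
        alpha(t)
        + infected * beta(t) * phi(exposures, nu, kappa)
        + gamma[gravidity]
        + delta_s
    )

def example_alpha(t):
    """Hypothetical baseline trajectory (the study uses a cubic spline)."""
    return 115.0 - 0.05 * t

def example_beta(t):
    """Hypothetical constant malaria effect on Hb."""
    return -6.0

GAMMA = {1: 0.0, 2: 1.5, 3: 2.0}  # hypothetical; gamma_1 = 0 for primigravidae

mu = expected_hb(
    t=120, infected=1, exposures=1, gravidity=2,
    alpha=example_alpha, beta=example_beta, gamma=GAMMA,
    delta_s=0.5, nu=2.0, kappa=1.5,
)
print(mu)
```

For an uninfected primigravida in a study with \({\delta }_{s}=0\), the function reduces to \(\alpha (t)\), matching the interpretation of \(\alpha\) given above.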
We also considered alternative models, including those with no effect of malaria (\(\beta (t)=0\)), a constant malaria effect (\(\beta (t)=\beta\)), gravidity-specific malaria susceptibility independent of exposure history (\(\phi (g)={v}_{g}\)) and models with or without gravidity-dependent baseline Hb (that is, \({\gamma }_{g}=0\)) (see Supplementary Table 1 for a full list of models tested and Supplementary Table 2 for a list of parameter definitions, prior distributions and posterior summaries for the final selected model).
Malaria in pregnancy model
Full details of our model of malaria in pregnancy are available elsewhere23 and our code has been placed in an open-source repository (https://github.com/patrickgtwalker/malaria_in_pregnancy_istp_model_open). In this section, we detail key features and our approach to capturing patterns of previous exposure according to site and gravidity within our fitting.
The model follows a cohort of women of child-bearing age from 15 to 49 years old. Age at each conception throughout a woman’s lifetime, denoted by \(C=\{{C}_{g}\}\), where \(g=1,\ldots ,F\) and \(F\) represents the total number of pregnancies, is generated according to gravidity-specific fertility rates stratified according to 5-year age groups (typically calculated from Demographic and Health Surveys (DHS) and Malaria Indicator Surveys (MIS)50).
The extent to which women are exposed to malaria in pregnancy is generated using a customized version of an established malaria transmission model (Supplementary Fig. 1). At conception, women are assigned an infection state: uninfected (S), untreated clinical malaria (D), treated clinical malaria (T), prophylaxis after treatment (P), asymptomatic patent (A) or sub-patent (U). The probabilities of initial states depend on the entomological inoculation rate, age and individual heterogeneity in mosquito exposure (modelled via a Gauss–Hermite approximation to a log-normal distribution).
Each woman is assigned a force of infection reflecting her expected rate of exposure throughout gestation. Times of blood-stage infections during pregnancy, \({B}_{g}=\left\{{B}_{g,i}\right\}\), are generated as follows:
- If the woman is infected at conception, \({B}_{g,1}=0\), else \({B}_{g,1}=X\left(\lambda \right)\), where \(X(.) \sim {\rm{Exp}}(.)\)
- Subsequent infections occur as \({B}_{g,j+1}={B}_{g,j}+X\left(\lambda \right)\)
Clearance times for each infection (assuming no pregnancy-specific effects), \({K}_{g}=\left\{{K}_{g,i},i=1..{n}_{B}\right\}\), are similarly drawn according to the underlying transmission model parameters.
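The generation of blood-stage infection times described above can be sketched in Python as follows. The force of infection, the pregnancy duration of 280 days and the function name are assumptions for illustration; clearance times are not simulated here because they depend on the underlying transmission model.

```python
import random

def simulate_infection_times(rate, pregnancy_days, infected_at_conception, rng):
    """Draw blood-stage infection times B_{g,1}, B_{g,2}, ... within one pregnancy.

    rate is the force of infection per day, so waiting times between
    infections are exponential with mean 1 / rate.
    """
    times = []
    t = 0.0 if infected_at_conception else rng.expovariate(rate)
    while t < pregnancy_days:
        times.append(t)
        t += rng.expovariate(rate)  # B_{g,j+1} = B_{g,j} + X(lambda)
    return times

rng = random.Random(2023)  # fixed seed so the example is reproducible
print(simulate_infection_times(rate=0.01, pregnancy_days=280.0,
                               infected_at_conception=True, rng=rng))
```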
From week 13 of gestation onwards, peripheral infections may sequester in the placenta. Sequestered infections \({P}_{g}=\{{P}_{g,i},i=1\ldots {n}_{P}\}\) resolve at times \({R}_{g}=\{{R}_{g,i},i=1\ldots {n}_{P}\}\) with durations depending on gravidity-specific pregnancy immunity.
PCR positivity at time \(t\) in pregnancy, \({x}_{g}^{P}(t)\), is defined as
$${x}_{g}^{P}(t)=1\left(\left[\mathop{\sum }\limits_{j=1}^{{n}_{B}}1\left({B}_{g,j}\le t < {K}_{g,j}\right)+\mathop{\sum }\limits_{j=1}^{{n}_{P}}1\left({P}_{g,j}\le t < {R}_{g,j}\right)\right]\ge 1\right)$$
(10)
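The indicator in equation (10) amounts to checking whether any infection is active at time \(t\). The Python sketch below assumes that a blood-stage infection counts as active from its start until its clearance time and a placental infection from sequestration until resolution; the interval values are hypothetical.

```python
def pcr_positive(t, blood_stage, placental):
    """Return 1 if any infection is active at time t, otherwise 0.

    blood_stage and placental are lists of (start, end) pairs, e.g.
    (B_{g,j}, K_{g,j}) and (P_{g,j}, R_{g,j}), in days of gestation.
    """
    active = sum(start <= t < end for start, end in blood_stage)
    active += sum(start <= t < end for start, end in placental)
    return int(active >= 1)

blood_stage = [(10.0, 40.0)]   # hypothetical peripheral infection
placental = [(95.0, 150.0)]    # hypothetical sequestered infection
print([pcr_positive(t, blood_stage, placental) for t in (20.0, 50.0, 100.0)])
```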
We then link this model to our Hb model via:
$${h}_{g}(t) \sim N\left({\mu }_{g}(t),\sigma \right),\,{\mu }_{g}(t)=\alpha \left(t\right)+{x}_{g}^{P}(t)\beta \left(t\right)\phi \left({e}_{g}\right)+{\gamma }_{g}$$
(11)
where \({\mu }_{g}(t)\) is the expected Hb level given gestational age \(t\), gravidity \(g\), infection status \({x}_{g}^{P}(t)\) and \({e}_{g}\) the number of prior pregnancies exposed to malaria.
Simulation outputs from this model are used both to predict Hb distributions under different exposure scenarios (see ‘Generating estimates of the burden of malaria-associated anaemia and impact of IPTp across Africa’) and to estimate the expected distribution of prior exposure given infection, which is used within the likelihood function during model fitting (see ‘Inferential framework’).
Fitting the model to the data
We fitted the model using the drjacoby R package (v.1.5.4), which provides a flexible platform for Bayesian inference via Markov Chain Monte Carlo. We used weakly informative priors (Supplementary Table 1) and ran two chains for each model to monitor convergence.
Chains were run with sufficient burn-in and thinning to achieve an effective sample size of at least 1,000 per parameter. Convergence was assessed via trace plots, Gelman–Rubin diagnostics and posterior density comparisons between chains.
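For readers unfamiliar with the Gelman–Rubin diagnostic, the Python sketch below computes the potential scale reduction factor for a single parameter from several chains of equal length. It is a generic textbook calculation on simulated draws, not output from or a call into the drjacoby package.

```python
import math
import random
import statistics

def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    chains is a list of equal-length lists of post-burn-in samples.
    Values close to 1 suggest the chains sample the same distribution.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    within = statistics.fmean(statistics.variance(chain) for chain in chains)
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

rng = random.Random(1)
# Two simulated chains drawn from the same distribution (hypothetical values).
chain_a = [rng.gauss(0.0, 1.0) for _ in range(2000)]
chain_b = [rng.gauss(0.0, 1.0) for _ in range(2000)]
print(gelman_rubin([chain_a, chain_b]))
```

With only two chains the between-chain variance is estimated from two means, so the statistic is best read alongside trace plots and posterior density comparisons, as described above.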
All posterior summaries and parameter estimates are based on the retained post-burn-in samples. Details of the priors, hyperparameters and model variants tested are summarized in Supplementary Tables 1 and 2.
Estimating the effectiveness of IPTp-SP upon Hb levels
To estimate the effect of IPTp-SP on maternal Hb, we analysed studies from a Cochrane review of the effect of the intervention versus placebo. Data from three studies were included in our primary analysis on the basis that they reported the mean and variance of Hb levels in both trial arms and did not exclude women based on Hb concentration, with additional studies that had such exclusion criteria included for qualitative comparison (Supplementary Table 4).
We defined the proportional effectiveness of IPTp-SP as the fraction by which the intervention mitigates the Hb reduction attributable to malaria. Let \(Y=\{{y}_{r}\}\) and \(S=\{{s}_{r}\}\) denote the mean and s.d. of Hb increases observed in each trial and gravidity stratum pairing. Let \(M=\{{m}_{r}\}\) be the modelled malaria-attributable reduction in Hb in the absence of IPTp-SP. We defined the likelihood of the observed data under this model as:
$$l\left(Y,S|\varepsilon ,M\right)=\mathop{\prod }\limits_{r=1}^{{n}_{r}}l\left({y}_{r},{s}_{r}|\varepsilon ,{m}_{r}\right),\,\text{where}\,{y}_{r} \sim N(\varepsilon \times {m}_{r},{s}_{r})$$
(12)
To generate the predicted \({m}_{r}\) values, we used our transmission and Hb models, conditioning on primigravid PCR prevalence, gravidity distribution and timing of malaria exposure. As the included studies reported only slide-positive prevalence, we used previously estimated relationships between microscopy and PCR detection to infer the PCR prevalence in each trial23. Specifically, we assumed that the microscopy prevalence in primigravidae followed a Beta distribution, reflecting the posterior uncertainty under a binomial model with a flat prior:
$${\rho }_{r}^{sm} \sim \mathrm{Beta}\left({{sm}}_{1r}+1,{n}_{1r}-{{sm}}_{1r}+1\right)$$
(13)
where \({{sm}}_{1r}\) is the number of positive primigravidae according to slide microscopy and \({n}_{1r}\) the total sampled.
We then sampled 1,000 realizations of PCR prevalence using a logistic transformation with uncertainty propagated from previously estimated parameters:
$${\rho }_{{ri}}^{p}=\mathrm{expit}\left(\frac{\mathrm{logit}({\rho }_{ri}^{sm})-{\zeta }_{{\rm{i}}}}{1+{\eta }_{{\rm{i}}}}\right)$$
(14)
where \({\rm{expit}}(x)=1/(1+{e}^{-x})\). For studies lacking gravidity-specific microscopy prevalence (for example, ref. 37), we used study-level information and simulated the odds ratio between primigravidae and secundigravidae using a triangular distribution, incorporating empirical bounds from the ISTp trials.
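Equations (13) and (14) can be sketched in Python as below. The counts and the values of `zeta` and `eta` are hypothetical, and for simplicity the sketch holds `zeta` and `eta` fixed, whereas the analysis described above draws them for each realization to propagate their uncertainty.

```python
import math
import random

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def expit(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))

def pcr_prevalence_draws(slide_pos, n_sampled, zeta, eta, n_draws, rng):
    """Sample PCR prevalence implied by slide-microscopy counts."""
    draws = []
    for _ in range(n_draws):
        # Equation (13): Beta posterior under a binomial model with a flat prior.
        rho_sm = rng.betavariate(slide_pos + 1, n_sampled - slide_pos + 1)
        # Equation (14): map microscopy prevalence to PCR prevalence.
        draws.append(expit((logit(rho_sm) - zeta) / (1.0 + eta)))
    return draws

rng = random.Random(14)
draws = pcr_prevalence_draws(slide_pos=20, n_sampled=100,
                             zeta=-1.0, eta=0.2, n_draws=1000, rng=rng)
print(sum(draws) / len(draws))
```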
\({\rho }_{{ri}}^{p}\) for the ith draw was then used to simulate the expected malaria-attributable reduction associated with that draw \({m}_{{ri}}\) and the likelihood was integrated over the 1,000 simulations. We maximized the marginal likelihood with respect to \(\varepsilon\) and used importance sampling to generate a posterior distribution.
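The integration over draws and the maximization over \(\varepsilon\) can be illustrated with a simple grid search in Python. This is a sketch under stated assumptions: the observed values, the simulated \({m}_{{ri}}\) draws and the grid are hypothetical, and the importance sampling step used to obtain the posterior is not shown.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def marginal_log_likelihood(eps, y, s, m_draws):
    """Log-likelihood of eps with each stratum averaged over its simulated draws.

    y and s hold the observed mean Hb increase and its s.d. per stratum;
    m_draws holds, per stratum, the simulated malaria-attributable reductions.
    """
    total = 0.0
    for y_r, s_r, draws in zip(y, s, m_draws):
        averaged = sum(normal_pdf(y_r, eps * m, s_r) for m in draws) / len(draws)
        total += math.log(averaged)
    return total

def maximise_on_grid(y, s, m_draws, grid):
    """Return the grid value of eps with the highest marginal log-likelihood."""
    return max(grid, key=lambda eps: marginal_log_likelihood(eps, y, s, m_draws))

# Hypothetical data for two strata.
y = [2.1, 3.4]
s = [1.0, 1.2]
m_draws = [[3.8, 4.0, 4.3], [5.9, 6.2, 6.6]]
grid = [i / 100 for i in range(101)]
print(maximise_on_grid(y, s, m_draws, grid))
```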
Extended Data Fig. 1 shows the model fitted to the Hb data from each study. Results were consistent with those from other trial arms not included in the primary fitting (for example, monthly sulfadoxine–pyrimethamine regimens and high-gravidity strata in high-transmission settings), supporting the robustness of the estimate.
Generating estimates of the burden of malaria-associated anaemia and impact of IPTp across Africa
To estimate the continent-wide burden of malaria-attributable anaemia in pregnancy, we simulated Hb trajectories under observed transmission conditions in 2023 using our combined model of malaria in pregnancy and Hb dynamics. The model was implemented across a 0.2-degree (approximately 5 km²) spatial grid, aligned with the MAP 2023 posterior distribution of slide prevalence in children aged 2–10 years. For each location, 100 realizations of slide prevalence were drawn to reflect spatial and epidemiological uncertainty.
Fertility inputs were derived from the most recent available DHS or MIS surveys for each country. Where survey data were not available, we used data from a demographically similar neighbouring country matched by total fertility rate (TFR). Fertility rates were stratified according to 5-year maternal age groups and gravidity and disaggregated according to urban or rural setting. Urban and rural classification was determined at 1-km² resolution using WorldPop 2020 population density maps, applying a threshold of 386 persons per km² to define urban areas. This classification was then aggregated to the 5-km² model resolution.
We used these inputs to simulate a synthetic cohort of pregnancies according to country, gravidity and location, calculating the expected distribution of Hb concentration under observed 2023 malaria transmission conditions. Estimates of moderate-to-severe anaemia prevalence, as well as the burden specifically attributable to malaria infection, were aggregated across all grid cells, weighted by the local fertility rate and population size, and then scaled to match WHO-estimated country-level pregnancy counts. These counts were based on United Nations-estimated births, adjusted for pregnancy losses (miscarriage, stillbirth and abortion) using standard multipliers.
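The aggregation and scaling step can be sketched in Python as a weighted average followed by multiplication by a country-level pregnancy count. The grid-cell values, field names and pregnancy count below are hypothetical, and the sketch takes the pregnancy count as given rather than deriving it from births and loss multipliers.

```python
def country_anaemia_cases(cells, country_pregnancies):
    """Scale grid-cell anaemia prevalence to a country-level case count.

    cells is a list of dicts with keys 'prevalence' (modelled prevalence of
    moderate-to-severe anaemia), 'fertility_rate' and 'population'.
    """
    weights = [cell["fertility_rate"] * cell["population"] for cell in cells]
    total_weight = sum(weights)
    prevalence = sum(
        weight * cell["prevalence"] for weight, cell in zip(weights, cells)
    ) / total_weight
    return prevalence * country_pregnancies

# Hypothetical grid cells for one country.
cells = [
    {"prevalence": 0.20, "fertility_rate": 0.10, "population": 1000},
    {"prevalence": 0.40, "fertility_rate": 0.20, "population": 500},
]
print(country_anaemia_cases(cells, country_pregnancies=10000))
```

Running the same function on prevalence attributable to malaria, or on prevalence simulated under a different transmission scenario, gives the quantities needed for the comparisons described in the following paragraphs.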
To quantify the effect of declining malaria transmission on anaemia burden, we re-ran these simulations using the 2000 transmission levels (that is, pre-decline), drawing from the MAP 2000 posterior distribution of slide prevalence in children. This enabled a counterfactual comparison in which the same underlying population structure was retained but malaria exposure was elevated to historical levels. Comparing model outputs under the 2000 versus 2023 transmission scenarios provided estimates of the number of moderate-to-severe anaemia cases averted because of transmission reductions over the past two decades.
Finally, we incorporated the posterior distribution of IPTp-SP effectiveness derived from trial data (see above), applying the estimated mitigation fraction (\(\varepsilon\)) to malaria-attributable reductions in Hb. This allowed us to estimate the number of moderate-to-severe anaemia cases that would be averted under full IPTp-SP coverage. We then simulated the real-life impact of IPTp using WHO estimates of IPTp coverage (except for Kenya and Zimbabwe, which implement IPTp sub-nationally and for which we obtained coverage for the relevant regions from the latest population-based surveys51,52), assuming a multiplicative per-dose effect.
Ethics and inclusion statement
This work builds on long-standing collaborations between modellers, clinical trialists, epidemiologists and malaria surveillance experts spanning more than a decade and involving institutions based in both malaria-endemic countries and high-income research settings. The research questions addressed in this study, including the need for robust estimates of malaria-attributable maternal anaemia and the impact of preventive interventions, were defined through these collaborations well before the present analysis and reflect priorities identified by local investigators and malaria control stakeholders. The individual-level data used in this study were derived from multi-country clinical trials and surveillance studies conducted in malaria-endemic settings, led or co-led by investigators based in those settings. Local researchers were involved in study design, data collection, data interpretation and authorship of the original studies, with data governance arrangements established at the time of data collection. Authorship in this manuscript reflects substantive intellectual contributions consistent with Nature Portfolio criteria. All primary studies contributing data received approval from relevant local and national ethics review committees and institutional review boards, as reported in the original publications. The present work is a secondary analysis of de-identified data and did not require additional ethical approval. Results are presented as aggregated and modelled estimates and do not pose risks of stigmatization or harm to participants.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.