Marked heterogeneity in malaria infection rate in a Malian longitudinal cohort

Categories: Disease & Virus

July 16, 2025

molFOI is not associated with age

We define the molFOI as the number of genetically distinct strains that infected a given participant over a period of time¹², where a newly incident infection by a genetically distinct strain was defined as observation of a novel amplicon haplotype (not seen in the previous two samples from the participant) at one or more of the four genotyped loci (Supplemental Fig. 1)¹². We calculated the molFOI for each participant over the entirety of the 2011 season (Fig. 2a). The molFOI range was 0–55 (mean: 11, median: 10, standard deviation: 9). We found no overall correlation between participant age and molFOI (quasi-Poisson regression, age coded as a continuous variable, P = 0.077), though infection rate may be lower in the youngest and oldest participants We also examined the number of clinical disease episodes per participant in the same 2011 cohort, using the number of malaria treatments as a proxy for clinical disease diagnoses (Fig. 2b). The range in number of treatments per participant is smaller than molFOI, ranging from 0 to 6 (mean: 2, median: 2, standard deviation: 1). We found a negative correlation between participant age and number of treatments (quasi-Poisson regression, P = 0.00073, β = −0.02). We directly compared molFOI and number of treatments per participant (Supplemental Fig. 3a). The number of treatments was significantly associated with molFOI (quasi-Poisson model, t(462) = 3.07, P = 0.0023), with inclusion of treatments accounting for 2.02% of deviance in 2011 molFOI values when compared to the base model, implying that disease rate explains only a small fraction of the observed variation in infection rate (deviance explained calculated as the difference between the deviance observed by this model and the intercept-only model, divided by the intercept-only model deviance). These observations suggest that the NAI that accumulates with age in this population⁹ reduces the risk of clinical disease but does not provide measurable protection against infection.

**Fig. 2: Molecular force of infection and number of malaria treatments in participants of all ages in 2011.**

Intra-individual molFOI is consistent across seasons

We next compared individual molFOI across transmission seasons to determine whether molFOI variation in 2011 was driven by stochastic factors, or whether differences in infection rate among participants are consistent over time. We hypothesized that while heterogeneity in infection rate might obscure age-associated differences when comparing age bins cross-sectionally, these differences might be evident when profiling individuals longitudinally. We estimated the cumulative molFOI over the subsequent five seasons (2012–2016) and compared that value to the value from 2011 for individual participants (Fig. 3a). We rejected the null hypothesis that variation in 2011 molFOI is stochastic; participants with high molFOI in 2011 generally had high molFOI in 2012-2016 as well (Spearman’s rank correlation, ρ(111) = 0.53 (95% CI = [0.38, 0.66]), P = 1.51e⁻⁹; linear regression, P = 3.4e⁻¹³, R² = 0.38). We performed the same analysis using symptomatic disease incidence (as estimated by number of malaria treatments) in place of molFOI in Fig. 3b. In contrast to molFOI, while the linear regression (adjusted R² = 0.03, P = 0.041) between number of treatments in 2011 and 2012–2016 was significant, Spearman’s rank correlation (ρ(111) = 0.17 (95% CI = [−0.02, 0.35]), P = 0.073) did not indicate the presence a significant correlation in symptomatic disease incidence across time periods. This contrast between infection and treatment rates suggests that molFOI may be a more durable estimator of individual infection risk, as it does not appear to be modified by age-dependent NAI.

**Fig. 3: molFOI varies by participant, not by age.**

We also stratified molFOI and treatment data for each participant by season, resulting in five molFOI values per participant (Supplemental Fig. 2). The individual molFOI-per-season values ranged from 0 to 106 (mean: 10, median: 6, standard deviation: 12). We found no significant differences in the overall distribution of molFOI values stratified by season, suggesting no major temporal or weather factors impacted any one season (Kruskal–Wallis test, χ²(4) = 4.36, P = 0.36). We visualized the patterns of average molFOI per age in each season (Fig. 3c). We found that a specific sub-cohort of participants that were age nine in 2012 consistently exhibited high average molFOI values over time, despite their increasing age and cumulative parasite exposure. We similarly visualized average treatment counts per age (Fig. 3d), but did not observe the same sub-cohort trend as with molFOI.

We further explored consistency in individual infection rate over time by fitting a linear mixed-effects model (Table 1) to these data, with participant age as a fixed effect and participant identity as a random effect, aiming to capture intra-individual changes in molFOI. In this model, fixed effects (participant age) had a pseudo-R² value of 0.00, while the pseudo-R² value for all effects was 0.50, further demonstrating the lack of relationship between age and infection rate. The mixed-effects model fit the data better than an age-only linear model (linear regression, adjusted R² = 0.076, P = 4.4e⁻¹²; models compared via likelihood rank test, χ²(1) = 170.69, P = 5.24e⁻³⁹), suggesting that participant identity is a more significant driver of variation in infection rate than age. We also fit linear mixed-effects models to subsets of this cohort by age (Table 1). In all of these models, age is not significantly associated with molFOI, and the pseudo-R² value for all effects is notably larger than the pseudo-R² value for age alone (models for participants of all ages, 0–4 years, 4–7 years, and 8+ years, respectively; pseudo-R² for age as a fixed effect: 0.00, 0.00, 0.01, 0.00 in each model, respectively, and pseudo-R² for all effects, including participant identity as a random effect: 0.50, 0.71, 0.37, 0.52, respectively).

Table 1 Mixed-effects models for molFOI show the consistency of intra-individual molFOI

We also examined infection status in May 2011 (at study enrollment) as a predictor of infection rate; 44% of participants (n = 202) were asymptomatically infected at enrollment. We found that molFOI from the 2011 season was significantly greater for participants who were infected at enrollment (Fig. 3e, mean molFOI for infected = 15.9; uninfected = 7.6; Kruskal–Wallis test, χ²(1) = 78.42, P = 8.33e⁻¹⁹). Again, we replicated this analysis with symptomatic disease incidence (as estimated by treatment counts) in place of molFOI (Fig. 3f). We found a significant difference for disease incidence as well (mean number of treatments for infected = 2.2, for uninfected = 1.9; Kruskal–Wallis test, χ²(1) = 7.85, P = 0.0051), though the range of this metric (0–6) is narrower than that of molFOI (0–55).

We compared infection rate to rates of symptomatic disease by evaluating molFOI with respect to the number of treatments per participant within a transmission season or the parasitemia of individual samples (Supplemental Fig. 3). We also hypothesized that lower density infections would correlate with lower molFOI, if participants had parasitemia near the level of detection of the genotyping assay. However, participants grouped by low, middle, or high molFOI in 2011 exhibited no significant differences in parasite densities (Kruskal–Wallis test, χ²(2) = 4.68, P = 0.096). In fact, participants with high molFOI in 2012-2016 had slightly lower densities (mean densities: 2.5e⁴ (low molFOI), 2.1e⁴ (mid molFOI), 2.4e⁴ (high molFOI); Kruskal–Wallis test, χ²(2) = 10.52, P = 0.0052; Dunn test with Holm correction, P = 0.0037 (low vs. high)), rejecting our null hypothesis. We speculate that individuals with more NAI may have lower parasite densities, and may be more likely to exhibit high molFOI due to reduced incidence of symptomatic disease and antimalarial treatment.

Host factors explain some heterogeneity in molFOI

We next hypothesized that host-specific factors could affect infection rate. We found that participant sex was not associated with molFOI in 2011 (Supplemental Fig. 4a, mean molFOI for females = 11.6, males = 11.1; Kruskal–Wallis test, χ²(1) = 2.47, P = 0.12) or 2012–2016 (Supplemental Fig. 4c, mean cumulative molFOI for females = 47.7, males = 54.0; Kruskal–Wallis test, χ²(1) = 0.77, P = 0.38). Similarly, we stratified symptomatic disease incidence by participant sex, finding that participant sex was also not associated with symptomatic disease counts in either 2011 (Supplemental Fig. 4b, mean cumulative number of treatments for females = 2.1, males = 1.9; Kruskal–Wallis test, χ²(1) = 1.17, P = 0.28) or 2012–2016 (Supplemental Fig. 4d, mean cumulative number of treatments for females = 4.4, males = 4.7; Kruskal–Wallis test, χ²(1) = 0.15, P = 0.70).

We also considered participant ethnicity, bednet usage, and socioeconomic factors¹⁹ as potential explanations for molFOI heterogeneity and temporal consistency²⁰. However, neither ethnicity nor bednet usage varied significantly among participants. The majority of participants in both the 2011 and 2012–2016 cohorts are of the Bambara ethnic group (418/464 participants or 90% in 2011 and 105/120 participants in 2012–2016, or 87.5%). Participants nearly universally reported daily bednet usage when surveyed in 2013 (561/563 households owned bednets and 559/563 households reported daily usage). We used roof type as a proxy for socioeconomic status, as in prior studies of this cohort¹⁷, where metal roofs are a proxy for relative wealth. In the 2011 cohort, we found a significant difference in molFOI between participants living in homes with flat roofs (n = 143, mean molFOI = 14.1) vs. those from homes with metal roofs (n = 269, mean molFOI = 10.5; Kruskal–Wallis test, χ²(1) = 8.12, P = 0.004). We observed a similar trend in the 2012-2016 data between participants from homes with flat roofs (n = 37, mean molFOI = 58.6) vs. participants from homes with metal roofs (n = 76, mean molFOI = 49.8), but the difference was not statistically significant (Kruskal–Wallis test, χ²(1) = 1.64, P = 0.20).

Given that many individuals in this population carry malaria-protective variant alleles at the hemoglobin subunit beta (HBB) locus¹⁷, we examined molFOI stratified by HBB genotype (Fig. 4, stratifications of cohorts by HBB genotype and age or sex in Supplemental Table 1). Of the 464 participants in our 2011 cohort, 372 were homozygous for the ancestral allele, A (HbAA genotype). A total of 43 participants were heterozygous for the S allele (HbAS genotype), which confers the sickle cell trait and has been associated with lower risk of both uncomplicated and severe malaria disease^21,22,23. A total of 47 participants were heterozygous for the C allele (HbAC), a genotype that has not been as clearly associated with protection from uncomplicated malaria as the HbAS genotype²¹ Due to the low sample size, we excluded two participants with HbCC and HbSC genotypes from the analysis in Fig. 4a only. We found significant differences in the molFOI in 2011 for participants with HbAS compared to both HbAA and HbAC genotypes (Fig. 4a, mean molFOI for HbAA (11.8), for HbAC (13.1), for HbAS (5.3). Kruskal–Wallis test, χ²(2) = 23.53, P = 7.8e⁻⁶, ε² = 0.05. Dunn test with Holm adjustment, P = 7.6e⁻⁶ (HbAA vs. HbAS), 9.1e⁻⁵ (HbAC vs. HbAS), 0.51 (HbAA vs. HbAC)). In the 2012–2016 cohort, which had fewer participants (n = 120) and thus lower statistical power compared to the 2011 cohort, fewer participants had variant genotypes (6 HbAC and 17 HbAS); molFOI values were lower in HbAC (mean molFOI: 45.5) and HbAS (mean molFOI: 34.3) participants than in HbAA participants (mean molFOI: 54.3), but they did not reach the threshold of significant difference (Fig. 4c; Kruskal–Wallis test, χ²(2) = 0.96, P = 0.62). We also examined symptomatic disease stratified by HBB genotype (Fig. 4). We found significant differences in the number of treatments for individuals with HbAS genotype compared to HbAA in both the 2011 cohort (Fig. 4b, Kruskal-Wallis test, χ²(2) = 6.55, P = 0.038, ε² = 0.014. Dunn test with Holm adjustment, P = 0.032 (HbAA vs. HbAS), 0.12 (HbAC vs. HbAS), 0.93 (HbAA vs. HbAC) and the 2012-2016 cohort (Fig. 4d, (Kruskal–Wallis test, χ²(2) = 12.58, P = 0.0019, ε² = 0.11. Dunn test with Holm adjustment, P = 0.0016 (HbAA vs. HbAS), 0.36 (HbAC vs. HbAS), 0.53 (HbAA vs. HbAC)).

**Fig. 4: Host genotype explains some variability in molFOI and malaria treatment incidence.**

Additionally, we considered other potential effects that could be associated with HBB genotype (Supplemental Fig. 5). Heterozygous HBB genotypes did not cluster into a particular region of Kalifabougou (Supplemental Fig. 5a), nor did they have a different range in participant ages than HbAA (Supplemental Fig. 5b, Kruskal–Wallis test, χ²(2) = 1.84, P = 0.40). We also did not find a difference in parasite density for HbAS individuals, which could have potentially biased our ability to detect molFOI (Supplemental Fig. 5c, Kruskal–Wallis test, χ²(2) = 2.45, P = 0.29), though the sample size of available parasitemia data after stratification by HBB genotype is small.

We fit an additional linear mixed-effects model to the 2012–2016 molFOI data incorporating these additional host factors in addition to age, with participant identity as a fixed effect (Table 1). In this model, none of the fixed effects have a significant effect on molFOI on their own (P = 0.26, 0.47, 0.60, 0.09, 0.41 for age, roof type, HbAC genotype, HbAS genotype, and sex, respectively). As a whole, the fixed effects contributed a small amount to the model’s total explanatory power; pseudo-R² for fixed effects alone was 0.03, and pseudo-R² for all effects was 0.52.

Geographic features may explain some heterogeneity in molFOI

We next considered whether geographic factors could affect infection rate. First, we analyzed the spatial autocorrelation of the 2011 molFOI values (Fig. 5a), to look for general associations between household location and molFOI. We found a very small negative dispersion of molFOI (Moran’s I = −0.023, pseudo-P = 0.002, two-sided hypothesis, compared to a Monte-Carlo simulation of permutations of the data), suggesting that micro-geographic features do not drive locally similar molFOI values. We found a positive correlation between the distance from participants’ homes to the Kalifabougou study clinic and their molFOI in 2011 (Fig. 5b, linear regression, adjusted R² = 0.089, P = 4.3e⁻⁸; quasi-Poisson regression, t(312) = 5.71, P = 2.68e⁻⁸). We also found a positive correlation between participants’ distance to the closest part of the river system and their 2011 molFOI (Fig. 5c, linear regression, adjusted R² = 0.025, P = 0.0031; quasi-Poisson regression, t(312) = 3.00, P = 0.003). Taken together, these spatial analyses suggest that geographic differences may have a slight effect, but they do not explain the majority of the observed heterogeneity in infection rate in this cohort (2.45% and 8.30% of deviance explained by distance to clinic and rivers, respectively).

**Fig. 5: molFOI does not concentrate in geographic hotspots.**

Several serological features distinguish between low and high infection rate participants

Systems serological profiling was previously performed on serum from 201 participants included in our sequencing study¹⁸. These data were generated using samples from May 2011, at enrollment into the study and before the transmission season began that year, with the objective of defining potential correlates of protection. Antigen-specific IgG1, IgG2, IgG3, IgG4, IgA1, and IgM to AMA1, full-length CSP, MSP1, RH5 and the N- and C-terminal domains of CSP were quantified. The functional potential of these antigen-specific antibodies was also captured, including the ability of the antigen-specific antibodies to bind to Fc receptors (FcRn, FcγRIIAH, FcγRIIAR, FcγRIIB, FcγRIIIAF, FcγRIIIAV, and FcγRIIIB) and recruit antibody-dependent complement deposition, antibody-dependent cellular phagocytosis, and antibody-dependent neutrophil phagocytosis. While the evolution of antigen-specific neutrophil activation was previously found to be correlated with rates of disease severity in this cohort¹⁸, here we aimed to define whether malaria-specific serological markers were positively or negatively associated with infection rate, as estimated by molFOI.

We set thresholds for molFOI in 2011 to create “low molFOI” (<4) and “high molFOI” (>13) groups, which correspond to the lowest and highest 12% of the molFOI distribution (Fig. 6a). To explore the sensitivity of the analysis to threshold used, we defined two additional different sets of thresholds, to consider the upper and lower 33% and 25% of the molFOI distribution (Supplemental Table 2). We primarily discuss results from the most stringent set of thresholds here; the others are highly concordant and are described in Supplemental Figs. 6 and 7. For each set of molFOI cutoffs, we compared each serological feature between low and high molFOI groups using Mann–Whitney tests with Benjamini–Hochberg corrections. In all three approaches to molFOI stratification, Fcγ receptor binding to the highly polymorphic c-terminal region of CSP (specifically, Fcγ receptor IIIAV, or FcγRIIIAV), as well as Fcγ receptor binding to RH5 (FcγRIIIAV), were enriched in participants with low molFOI (Mann–Whitney test with Benjamini–Hochberg correction, P = 0.015 and 0.030, respectively; Fig. 6c, d), suggesting that these may be protective against infection. While only these two features were significant in all three molFOI stratifications, 13 additional features were enriched in individuals with high molFOI in two of the three stratifications, including IgG1 specific to AMA1, similar to previous studies suggesting that it may be a marker of infection history^24,25,26.