Continental-scale genomic surveillance of Plasmodium falciparum malaria across sub-Saharan Africa with rapid nanopore sequencing

Categories: Disease & Virus

May 11, 2026

Sample collection and ethics

All blood samples were collected from patients with P. falciparum malaria, with informed consent from the patient or from a parent or guardian. In all cases, consent allowed for the samples to be used for purposes such as this study. This study was conducted using the collected samples only; there was no human subject contact.

All studies received ethical approval from an appropriate research ethics committee: in Senegal, ethical approval for the SEN19/49 study was granted by the Comité National d’Ethique pour la Recherche en Santé (000317/MSAS/CNERS/SP); in Torodo, Mali, ethical approval for a cohort study in 2022 was granted by the Ethics Committee of Charité (EA2/264/21) and the Faculty of Medicine, Pharmacy and Odontostomatology (FMPOS) at the University of Bamako (N^∘2022/20/CE/USTTB/24.01.22); in Burkina Faso, ethical approval for the AMTIP study was granted by the Comité d’Ethique Institutionnel/Institut National de Santé Publique (2023-10/MSHP/SG/INSP/CEI); in Côte d’Ivoire, ethical approval for the study “Surveillance génomique de Plasmodium falciparum aux CTA à Bouaké, Côte d’ivoire” was granted by the Direction Médicale et Scientifique du aux Centre Hospitalier et Universitaire (192MSHPCMU/CHU-B/DG/DMS/ONAR/24) and by the Ethikkomission der Charité Universitätsmedizin (EA2/171/24); in Nigeria, ethical approval for samples collected in 2023–2024 was granted by the Ethical Committee of Ladoke Akintola University of Technology Teaching Hospital, Ogbomoso (LTH/OGB/EC/2022/304); in Zambia, ethical approval for the 2024 Malaria Indicator Survey (MIS2024) and the 2024–2025 hrp2/3 Surveillance Study was granted by the Research Ethics Committee at the University of Zambia (5055-20241) and the Tropical Diseases Research Centre Ethics Review Committee (TRC/C4//03/2024), respectively; in Kenya, ethical approval for samples collected in the ATSB study was granted by the KEMRI Scientific and Ethics Review Unit (KEMRI/SERU/CGHR/368/4189; CDC Project ID 0900f3eb81d7ec3c and 0900f3eb82546323); in Ethiopia, ethical approval for the ARSUNA study was granted by the Arsi University institutional review board (AU/HSC/ST-129/5494) and the Federal Democratic Republic of Ethiopia, Ministry of Education (17/256/476/24).

Mock sample creation

Making DBS from culture

For validation of our protocol, we created mock P. falciparum positive dried-blood spots (DBS) as follows. We cultured the P. falciparum laboratory strains 3D7, Dd2 and HB3 at 5% haematocrit in commercially available red blood cells (RBCs) obtained from DRK-Blutspendedienst Nord-Ost gGmbH⁴⁴. We brought each culture to ~5% parasitemia and then synchronised to ring-stage parasites using 5% Sorbitol (PanReac AppliChem, #A2222). After synchronisation, three technicians independently measured the parasitemia of each culture by microscopy and used a Neubauer Counting Chamber to determine the number of RBCs per microlitre. We used the average of the parasitemia and RBCs/μL measurements to standardise each culture to 100,000 parasites/μL at 50% haematocrit. We performed 10-fold serial dilutions of the 100,000 parasites/μL stocks in 80 μL of whole human blood to produce clonal mock samples down to 1 parasites/μL. To explore sequencing performance on polyclonal infections with low-frequency minor clones, we created mock samples containing two laboratory strains at different proportions. In particular, we combined clonal 10,000 parasites/μL dilutions of two strains to create 80:20 mixtures in 120 μL, and then performed two-fold serial dilutions into the major strain to produce 90:10, 95:5, and 97.5:2.5 mixtures. These were further serially diluted into whole human blood to produce 1000 parasites/μL and 100 parasites/μL parasitemia mixtures. To produce the DBS, we created five 20 μL spots for each mock sample on an individual filter paper (Whatman, #10531018) and dried them overnight at room temperature (Supplementary Fig. 4a). The DBS were stored at −20 °C in individual plastic bags with a desiccant until use.

DNA extraction from DBS

For each sample, we used a single (6 mm) punch from the DBS for DNA extraction. DNA was extracted by using QIAamp DNA Mini Kit (Qiagen, #51306) according to the manufacturer’s instructions with the following modifications: elution was performed by using 2 × 40 μL of nuclease-free water heated to 50 °C. Each elution was incubated for 3 min at room temperature.

Mock samples containing kelch13 mutations

To interrogate artemisinin-resistance mutations, we ordered P. falciparum genomic DNA for Cambodian field-derived strains IPC 5202 (kelch13 R539T), IPC 4912 (kelch13 I543T), IPC 3445 (kelch13 C580Y)⁴⁵ from BEI resources (www.beiresources.org). We created 10,000 parasites/μL in vitro DNA mixtures by diluting these stocks to 0.25 ng/μL in 25 ng/μL human genomic DNA (Roche #11691112001).

NOMADS-MVP multiplex PCR

Primer design

We designed primers for a novel amplicon panel using our open-source multiplex PCR primer design software, Multiply³² (https://github.com/JasonAHendry/multiply). We targeted nine genes that would jointly provide information on antimalarial drug resistance, hrp2/3 deletions, vaccine target evolution and complexity of infection (COI) (Table 1). Each gene was targeted with a single amplicon, except mdr1 for which separate N- and C-terminal amplicons were designed. For the ten targets, Multiply produced a total of 453 candidate primer pairs (median 47 per target); across which it identified 4096 potential offtarget-binding sites, 77 primers overlapping common variants (>5% minor allele frequency (MAF) in any population defined in the Pf6 Project⁴⁶ or drug resistance-associated codons), and 2028 high-affinity primer dimers. The greedy search algorithm minimised these factors to select a set of 3 candidate multiplexes (from a theoretical 22 thousand trillion possible combinations) for the ten targets. After pilot sequencing experiments, we selected the multiplex PCR with the highest coverage over all ten targets as our final panel (Supplementary Fig. 1d). The included amplicons vary in size from 604 to 1405 bp and amplify a combined 10,860 bp in a single reaction. A total of 45 validated and candidate World Health Organization (WHO) antimalarial drug resistance-associated single-nucleotide polymorphisms (SNPs) are genotyped (Table 1). The amplicons targeting hrp2 and hrp3 both have reverse primers annealing in exon 1, and forward primers upstream of exon 2, which enables detection of partial or complete hrp gene deletions⁴⁷. The csp amplicon spans from codon 19 to the end of the gene, which includes the entire central repeat region and C-terminal domain used in the RTS,S/AS01 and R21/Matrix-M vaccines^48,49. A high diversity, 826 bp window of apical membrane antigen 1 (ama1) is covered to support COI estimation.

PCR optimisation

We optimised our multiplex PCR using KAPA HiFi HotStart ReadyMix (Roche Diagnostics, #KK2602) formulation, which performs well for (A+T)-rich genomes like P. falciparum. Using mock samples of varying parasitemia, we searched for a reaction optimum in several full factorial experimental designs. For simplicity, we maintained a 2-step PCR programme and 25 μL reaction volume throughout, but explored varying 5 other factors (template DNA amount, number of cycles, primer concentration, polymerase concentration, and annealing temperature). By agarose gel, we observed that increasing the template amount increased sensitivity up to 8 μL after which there was no clear benefit; and that a programme with 35 cycles was more sensitive than 30 cycles (Supplementary Fig. 1a). A gradient PCR indicated an ideal annealing temperature of between 58–60 °C (Supplementary Fig. 1b). Increasing the amount of primer increased sensitivity but also increased visible background amplification, whereas increasing the amount of polymerase seemed to increase sensitivity without visibly affecting background (Supplementary Fig. 1c). Using this information, we selected four candidate optimal conditions, and sequenced seven mock samples under each (Supplementary Fig. 1d). From these, we selected the multiplex PCR conditions to maximise the percentage of reads mapping to P. falciparum malaria and mean coverage across amplicons. The final conditions were: 95 °C, 3 min; followed by 35 cycles of 98 °C 20 s, 60 °C 3 min; and a final at 60 °C for 10 min.

NOMADS-MVP sequencing protocol

The complete laboratory protocol, including materials and primer sequences, is available online at protocols.io (English: NOMADS-MVP: Rapid Genomic Surveillance of Malaria; French: Surveillance Génomique du Paludisme par la méthode Nanopore: Protocole Rapide NOMADS-MVP). In brief, 8 μL of extracted DNA is used as template in a 25 μL multiplex PCR using 15.5 μL of Kapa HiFi HotStart ReadyMix (Roche Diagnostics, #KK2602) and 1.5 μL of the NOMADS-MVP primer pool. Multiplex PCR products are cleaned using a 0.5X ratio of AMPure XP Beads (Beckman Coulter, A63881) and eluted in 15 μL of nuclease-free water. DNA elute is quantified using the Qubit dsDNA HS Assay Kit (ThermoFisher Scientific, #Q33231) and between 200 and 800 ng of DNA is taken forward for barcoding and sequencing. We use Rapid Barcoding Kit 96 (SQK-RBK114.96) from Oxford Nanopore Technologies (ONT) following the associated ONT protocol with the following exceptions: we use 1 μL (rather than 1.5 μL) of each barcode during rapid barcoding; after barcoding we purify the pooled samples at a 0.5× ratio and elute in 15 μL; we use 800 ng of DNA for adaptor ligation, with a master mix of 1.0 μL of RA and 2.3 μL ABD (rather than 1.5 μL RA, 3.5 μL ADB). All sequencing was done using R10.4.1 flow cells on MinION Mk1B or Mk1D devices.

Exclusion of selective whole genome amplification (sWGA)

We initially explored two versions of the protocol: one that included a selective whole genome amplification (sWGA)⁵⁰ step to preamplify bulk P. falciparum DNA, and one where the multiplex PCR was performed directly on extracted DNA. To this end, we developed a modified version of the selective whole-genome amplification (sWGA) protocol⁵⁰ that substituted the phi29 DNA polymerase (New England Biolabs, #M0269S) with EquiPhi29 (ThermoFisher Scientific, #A39390) as follows: 2 μLEquiPhi29 10x buffer, 0.2 μL 0.1 M DTT, 2 μL 500 μM sWGA primer pool, 1 EquiPhi29 DNA polymerase, 12.8 μL DNA; 45 °C 1 h, 60 °C 10 min; dilute before PCR with 160 μLnuclease-free water. We were able to obtain robust amplification with these conditions despite the substantial reduction in incubation time relative to phi29 DNA polymerase. We hypothesised that including sWGA might increase sensitivity, although at additional protocol cost, complexity and time. While this was confirmed with mock DBS samples created from laboratory strains, we saw no benefit in field DBS samples (Supplementary Fig. 10) and discontinued use of sWGA in August 2024.

Real-time bioinformatics pipeline and dashboard

We developed a real-time bioinformatics pipeline and dashboard for nanopore sequencing, called Nomadic. The source code is publicly available on GitHub (https://github.com/JasonAHendry/nomadic) and user documentation is hosted on GitHub Pages (https://jasonahendry.github.io/nomadic/). Nomadic was designed for P. falciparum genomic surveillance with the NOMADS-MVP protocol, but was coded flexibly and supports other amplicon panels or organisms. Both the bioinformatics pipeline and dashboard are implemented in Python.

Running a real-time analysis

Briefly, to run Nomadic (v0.5.0), a user first starts a nanopore sequencing run using MinKNOW, ensuring to: (1) assign a unique experiment name, hereafter referred to as ; (2) enable real-time basecalling using either the High Accuracy (HAC) or Super Accurate (SUP) model; (3) enable sample demultiplexing with the appropriate barcoding kit (for NOMADS-MVP, SQK-RBK114.96). The user must prepare a comma-separated values (CSV) metadata table, which maps the barcodes to sample identifiers, and save it in a Nomadic workspace folder with the name .csv. The user can then launch Nomadic from a terminal window with the command nomadic realtime .

Implementation and bioinformatics details

As sequencing proceeds, MinKNOW writes batches of basecalled reads to disk every ten minutes as FASTQ files. These are demultiplexed by MinKNOW such that each barcode possesses its own folder containing associated FASTQ files. Nomadic keeps track of the FASTQ files in these folders and, when a new FASTQ file has been generated for a given barcode, launches a per-sample bioinformatics pipeline to process it and update results for the sample. In brief, the pipeline maps reads to the P. falciparum 3D7 reference genome (PlasmoDB release 67) using Minimap2⁵¹ (v2.28), summarises the output of mapping (i.e., fraction of reads mapped) and coverage of the target amplicons using samtools (v1.17), performs variant calling with bcftools call⁵² (v1.17) or Delve (v0.2.0), and annotates variants using bcftools csq (v1.17). Once a per-sample bioinformatics pipeline has been completed, a set of summary CSV files describing the results across all samples in the experiment is updated to reflect the incorporation of the new reads.

The dashboard runs in a separate thread and creates a graphical summary of these summary CSV files, which updates every minute. The plots are interactive, and users can hover over features of interest to open tooltips with additional information, zoom into areas of interest, or export static versions of the plots as PNG files. We implemented it using the Dash library (v2.17.1) from plotly. In our experience, sequencing and basecalling are often slower than the time Nomadic takes to process batches of FASTQ files, which allows the dashboard to remain up-to-date throughout the sequencing run. Once sequencing is completed, MinKNOW and Nomadic are stopped by the user. The final summary CSV files contain key results and are sufficient to reopen the dashboard, as described below.

Viewing the dashboard for a previous experiment

The Nomadic dashboard can be reopened after a sequencing experiment is completed by running the command nomadic dashboard in a terminal window.

Sequencing coverage summary statistics

To evaluate sequencing coverage across countries and parasitemia levels, we calculated two per-sample metrics: (1) the mean coverage across all amplicons; (2) the fold-difference in coverage between the most- and least-abundant amplicon (excluding the hrp2/3 amplicons), a measure of uniformity where a value of 1 indicates perfectly uniform coverage across amplicons. We computed these values using the mean_cov column in the summary.bedcov.csv files produced by Nomadic. More formally, for the per-sample fold-difference in coverage, we define m_ij as the mean coverage in sample i of amplicon j, and a as the set of all amplicons excluding hrp2 and hrp3, then we compute:

$$\frac{{\max }_{j\in a}({m}_{ij})}{{\min }_{j\in a}({m}_{ij})}.$$

(1)

Excluding hrp2 and hrp3 makes the statistic robust to hrp2/3 deletions; in practice, for NOMADS-MVP, hrp2 or hrp3 are rarely the lowest abundance amplicon in the absence of deletion and so do not contribute to the statistic’s value.

Analysis of read lengths

As each experiment typically contains millions of reads, for computational simplicity, we included only four experiments in this analysis: from Kenya (sequenced on 2024/07/30), Mali (2024/10/25), Ethiopia (2024/11/27) and a mock sample experiment (2024/11/26). For each experiment, we randomly sampled 20 barcodes (samples) without replacement from those that: (i) had 8 or more amplicons with greater than 50× coverage; (ii) had >100 parasites/μL; (iii) did not use sWGA; (iv) were not positive or negative controls. For these barcodes, the BAM file generated by Nomadic was processed using Pysam (v0.22.1) to determine read lengths from the ‘query length’ field. Genomic regions of interest were delineated with a custom Python script that converted gene codon numbers of interest into genomic coordinates using the P. falciparum 3D7 reference genome (PlasmoDB release 67) and associated general feature format (GFF) file, and then were manually verified using the Integrative Genome Viewer (IGV) (v2.5.0). We determined the percentage of reads overlapping by first using the intersect command from bedtools⁵³ (v2.31), which enabled us to filter the BAM file to only reads overlapping the region of interest, and then comparing the number of mapped reads in this BAM file against the original BAM file using samtools (v1.17).

Inference of hrp2/3 deletions

We used a set of 219 previously published samples from Asella, Ethiopia, to evaluate concordance between NOMADS-MVP and a gold-standard approach for hrp2/3 deletion detection³⁴. Recommended⁵⁴ conventional PCRs for hrp2⁵⁵ and hrp3⁵⁵ were performed in duplicate and visualised by agarose gel electrophoresis. For discordant results, we called a sample positive if there was a clear band of the appropriate length in at least one of the replicates⁵⁴. We excluded from analysis 46 samples that failed a kelch13 PCR, suggesting low DNA quality or parasitemia, and 24 samples that contained other Plasmodium species, leaving 149 samples for analysis. Of the 149 samples, 138 (92.6%) had 8 or more amplicons exceeding 50× coverage after NOMADS-MVP nanopore sequencing and were used for the comparison.

We predicted hrp2/3 deletions from NOMADS-MVP sequencing data using a previously developed Bayesian statistical model³². In brief, it works as follows. For each sequencing run, the statistical model first estimates the rate of background contamination/barcode misclassification using the included negative controls. Next, it estimates the quality of each sample and average variation in quality across samples using the mean coverage over all amplicons in the multiplex PCR, excluding the hrp2 and hrp3 amplicons. Finally, across all samples, the posterior probability of hrp2 and hrp3 deletion is estimated separately using Markov Chain Monte Carlo (MCMC). Here, each chain was run for 50,000 iterations with a prior deletion probability of 0.5 for hrp2 and hrp3. Although a lower prior might be justified for hrp2, given the prevalence of deletions in the region, in practice our estimates were insensitive to the prior. For Fig. 4a, we directly show the posterior probabilities from the model. For Fig. 4b, we are showing the maximum a posteriori (MAP) estimate of deletion. The statistical model and Bayesian MCMC were implemented in Python.

Variant calling

We developed a novel variant caller, named Delve, to enable the detection of biallelic SNPs at low frequencies in polyclonal infections. Delve (v0.1.0) was implemented in Rust and is publicly available on GitHub (https://github.com/berndbohmeier/delve). Below, we describe the statistical model, genotype inference, bias filtering, and model tuning. The notation used throughout this section is summarised in Table 2.

Table 2 Notation for Delve

Statistical model

Delve assumes genomic sites are independent and calls SNPs at one site at a time. Reads overlapping a given site are indexed by j = 1, 2, …, D, where D is the sequencing depth at the site. Each read contributes a base, b_j, which was aligned to the reference base REF ∈ {A, T, C, G} during read mapping, and an associated Phred-scaled base quality score, q_j. Delve assumes sites are biallelic, and for each site retains only the reads carrying the reference base REF and the most common alternative base, ALT ∈ {A, T, C, G} − REF; all other bases are treated as errors and discarded. Quality scores are taken from the basecaller, and then further adjusted using the Base Alignment Quality (BAQ) algorithm implemented in samtools⁵² to account for local alignment ambiguity. Afterward, bases that have a quality score below the fixed threshold ${Q}_{\min }$ are discarded. If the depth after filtering is less than ${D}_{\min }$, no SNP call is made at the site.

A P. falciparum infection can consist of multiple clones at unknown proportions and, as a result, the fraction of clones carrying the alternative base, ν, can take on any value in the interval [0, 1]. We are interested in determining ν from the sequencing data. Assuming that reads are independent, the likelihood for ν is given by:

$${{{\mathscr{L}}}}(\nu )=\mathop{\prod }_{j=1}^{D}{[(1-\nu )(1-{\epsilon }_{j})+\nu {\epsilon }_{j}]}^{{x}_{j}}{[\nu (1-{\epsilon }_{j})+(1-\nu ){\epsilon }_{j}]}^{1-{x}_{j}},$$

(2)

where x_j = 1 if b_j = REF, and x_j = 0 if b_j = ALT; and ϵ_j are the base error probabilities computed from the Phred-scaled quality scores, ${\epsilon }_{j}=-10{\log }_{10}({q}_{j})$.

This likelihood is a more general form of the genotype likelihood described by Li⁵⁶ (implemented in bcftools) as well as by McKenna et al.⁵⁷ (implemented in GATK). In those cases, the likelihood is parameterised in terms of a fixed genotype ploidy, represented by an integer. For example, in the diploid case, an organism can carry 0, 1, or 2 copies of the alternative allele, which, in the parameterisation above, corresponds exactly to ν = 0, ν = 0.5 or ν = 1.0. The key difference is that in this formulation, we allow ν to vary continuously to reflect the unknown fraction of P. falciparum clones carrying the alternative base. For numerical stability, the likelihood is always evaluated in logarithmic space, yielding the log-likelihood ℓ(ν).

Genotype inference

The goal of inference is to learn both ν, the fraction of clones carrying the alternative base, and also to use this ν to assign a genotype for the site. First, we determine the maximum likelihood estimate of ν, ${\hat{\nu }}_{{{{\rm{MLE}}}}}$, using Brent’s method bounded inside the interval [0, 1]. We then use the likelihood ratio test (LRT) to evaluate the evidence for a SNP at the site. In particular, we test the null hypothesis that there is no variant, H₀: ν≤ν₀, versus the alternative hypothesis that there is a variant, H₁: ν > ν₀. In this way, to call a variant, we require that the ${\hat{\nu }}_{{{{\rm{MLE}}}}}$ exceeds a small cut-off frequency ν₀; in the absence of such a cut-off (i.e., if ν₀ = 0), even the most subtle sequencing biases could eventually lead to a rejection of the null hypothesis, and spurious calling of a variant, as arbitrarily more reads were sequenced. Given these hypotheses, the LRT statistic is

$${\lambda }_{{{LRT}}}=-2[\ell ({\hat{\nu} }_{0})-\ell ({\hat{\nu }}_{{{MLE}}})],$$

(3)

where ${\hat{\nu }_{0}}$ is the maximum likelihood estimate (MLE) for the alternative allele frequency ν, given the restriction ν ≤ ν₀. We reject H₀ if λ_LRT is greater than a threshold c_LRT, which is set during model tuning. If the null hypothesis is rejected, we conduct a further likelihood ratio to distinguish between a homozygous alternative SNP, H₀: v≥(1−ν₀), versus a heterozgyous alternative SNP, H₁: v ν₀).

SNP filtering

Delve filters candidate SNPs if they exhibit excessive strand bias. In particular, we implemented the StrandOddsRatio test used in GATK⁵⁷, with minor modifications. For each candidate SNP, we count the number of reference bases on the forward (N_r+) and reverse strand (N_r−), and the number of alternative bases on the forward (N_a+) and reverse (N_a−) strand. Using these counts, we calculate the odds ratio:

$$R=\frac{{N}_{r+}\cdot {N}_{a-}}{{N}_{r-}\cdot {N}_{a+}}.$$

(4)

s Following GATK, we make the ratio symmetric by calculating $ORT=R+\frac{1}{R}$.

A candidate SNP is filtered from the final SNP set if ORT exceeds a threshold c_ORT. The rapid barcoding kit (SQK-RBK114.96) introduces strand coverage bias at amplicon edges, with the majority of coverage at the 5’ and 3’ ends of the amplicons deriving from the reverse and forward strands, respectively. This is an expected consequence of the transposome inserting barcodes uniformly across the amplicon and sequencing proceeding in a 3’ to 5’ direction. To accommodate this, we only apply the ORT filter to SNPs that have a heterozygous alternative genotype call; in this way, avoiding filtering homozygous alternative SNPs with strand coverage bias due to the barcoding chemistry.

Model tuning

In total, Delve is tuned by five parameters (Table 2). We set the base-quality filter ${Q}_{\min }$ by examining the distribution of quality scores after applying the BAQ algorithm, observing a long tail of lower-quality bases beginning at Q = 20 and comprising approximately 14% of all bases. We explored varying v₀, c_LRT, c_ORT to maximise recall, under the constraint of maintaining near-perfect precision in the 1000 and 10,000 parasites/μL mock DBS samples. The minimum depth filter ${D}_{\min }$ was disabled during the downsampling analysis. Otherwise, we set ${D}_{\min }=20$; although in principle, the c_LRT threshold already implicitly sets a threshold on the amount of sequencing depth (and base qualities) required to make a variant call.

SNP calling accuracy evaluation

Creating a set of true variants

3D7, Dd2, HB3 have been sequenced using Pacific Bioscience Sequencing SMRT technology and high-quality FASTA sequences are available on PlasmoDB⁵⁸. To identify variants in these assemblies with respect to the 3D7 reference genome, we simulated high-quality (Phred 60) error-free reads in silico from the FASTA files, mapped them to the 3D7 reference genome with minimap2 (v2.28), and then identified variants using the bcftools⁵² (v1.22) mpileup and call commands. We simulated 60 error-free reads, half forward and half reverse strand, for each amplicon in NOMADS-MVP by extracting the FASTA sequence spanning ±4 kbp of the target, based on GFF files for 3D7, Dd2 and HB3. This procedure resulted in true VCF files for each of the clonal strains. We created truth VCF files for the two-strain mixtures by combining allelic depth information from the clonal strain VCF files at the correct proportions given the laboratory mixture, and updating the genotype call accordingly.

For each Cambodian strain, information about the Kelch13 amino acid changes were available from BEI resources (IPC3445, C580Y; IPC4912, I543T; IPC5202, R539T); however, we found no information about corresponding nucleotide changes. Therefore, we performed Sanger sequencing using the NOMADS-MVP kelch13 amplicon as a singleplex PCR to determine the kelch13 nucleotide sequence for each strain. We used tracy⁵⁹ to convert the chromatogram file into a VCF with the tracy decompose command.

Evaluating accuracy

For all SNP calling accuracy evaluations, we used the Super Accurate (SUP) basecalling model in MinKNOW (v24.06.16) to generate FASTQ files, and then ran Nomadic in real-time to map reads and produce BAM files. We called variants using both the bcftools (v1.22) call command and Delve (v0.5.0). We used hap.py⁶⁰ from a Docker image (jmcdani20/hap.py:v0.3.12) to compute measures of variant calling accuracy in comparison to the truth VCF files described above. In particular, we used three measures of SNP calling performance: (i) the precision, TP/(TP + FP), where TP is the number of true positive SNP calls (homozygous or heterozygous alternative), and FP is the number of false-positive SNP calls; (ii) the recall of true homozygous alternative SNPs, TP_homo/(TP_homo + FN_homo), where TP_homo and FN_homo are the number of true homozygous alternative SNPs called and missed, respectively; (iii) the recall of true heterozgyous alternative SNPs, TP_het/(TP_het + FN_het), which is defined analogously with (ii). We excluded from the analysis: (i) low-complexity regions of the Pf3D7 reference genome, identified using sdust with default parameters; (ii) the central repeat region of csp; (iii) the hrp2 and hrp3 amplicons, as their primary utility is for deletion identification and they consist largely of low-complexity repetitive sequence. This left a total of 6,681 bp for evaluation in each mock DBS sample. For amplicons where no variants existed for a particular mock sample (e.g. Kelch13 for 3D7, Dd2 and HB3), the precision and recall are considered to be undefined; unless false-positive mutations are present, in which case the precision is zero. The within-sample alternative allele frequencies (WSAFs) plotted in Fig. 3a, c were calculated directly from the allelic depths.

Downsampling analysis

For the downsampling analysis, we included all mock DBS samples that: (i) were at 10,000 or 1000 parasites/μL; (ii) had greater than 500× coverage for all amplicons. This set included a total of 72 mock DBS samples (33 with 1000 parasites/μL and 39 with 10,000 parasites/μL). We excluded the mock DBS samples with 100 parasites/μL because the majority (62.2%, 28/45) did not have greater than 500× coverage for all amplicons. For each mock DBS sample, we randomly downsampled the sequencing reads to a specific mean coverage for each amplicon, using samtools⁵⁶ (v1.22). To achieve this, we split each mock sample’s BAM file into 10 per-amplicon BAM files, each containing reads mapping to one of the NOMADS-MVP targets. We then downsampled these per-amplicon BAM files independently, with the samtools view command and -s/–subsample flag, to achieve a target mean coverage level, before concatenating them back into a single, per-sample BAM file; now with all amplicons having the target coverage. We downsampled to a mean coverage of 500, 400, 300, 200, 150, 100, 75, 50, and 25×, in triplicate for each mock DBS sample and coverage level, producing a total of 1944 BAM files for variant calling with Delve (72 samples by 9 coverage levels, in triplicate).