HIV-1 variant analysis
PCR amplification of 6.7 kb of the HIV-1 genome of the viral gag, pol, and env genes was performed on rebound viruses recovered from infected hu-mice from untreated and LA ART-, CRISPR-, and dual-treated animal groups. LA-ART consisted of nanoformulated RPV (NRPV), and nanoformulated and myristoylated DTG (NMDTG), 3TC (NM3TC), and ABC (NMABC) as prodrugs. Rebound viruses from all treatment groups were recovered from plasma samples using a sensitive membrane-based Qiagen MinElute virus kit19. Three independent amplicons from 29 samples (22 samples from study 1 and 7 samples from study 2) were sequenced by GENEWIZ Sanger assays (Fig. 1).
The study scheme shows the infection times, treatment details, plasma collection, RNA isolation, Reverse Transcription PCR amplification, and subsequent sequencing assays and data analysis. In this study, humanized mice were either infected with HIV-1NL4-3 or HIV-1ADA and were subsequently divided into four treatment subgroups: 1- no treatment; 2- LA ART treatment consisting of NMDTG and NRPV at 45 mg/kg and NMABC and NM3TC at 40 mg/kg, administered via intramuscular injection two weeks post-infection (WPI) and ceased at 6 WPI; 3- CRISPR-Cas9 treatment at 7/8 WPI; or 4- sequential LA ART and CRISPR-Cas9 treatment with viral rebound. Plasma-derived viral RNA was isolated from all groups at designated time points, and three independent PCR-amplified HIV-1-regions (gag, pol, env) were sequenced and analyzed using Sanger sequencing. Amplicons of the pol regions were further sequenced and analyzed through the Illumina MiSeq platform. Sequences of interest, including the guide RNA targeting region, protease (PR), reverse transcriptase (RT), and integrase (IN) regions, env variable loop, and CD4 binding loop region, were analyzed using the bioinformatic pipeline. The crystal structure of unliganded HIV-1 clade B strain YU2 gp120 core is downloaded from the RCSB Protein Data Bank (https://www.rcsb.org/). Fig. 1 is created using the Biorender software.
The sequence analysis yielded high-quality sequence reads with an average quality score of greater than 45. Hypermutation was investigated using Hypermut 2.0 software20 tests, detected in only one RNA sequence, and excluded from further downstream analysis. Plasma viral RNA measures from individual humanized mice from each treatment group in all studies were provided in Supplementary Tables 2,3. Resistance genotyping tests in reverse transcriptase (RT), integrase (IN), and protease (PR) viral genes were successfully performed using the more sensitive MiSeq next-generation sequencing system. Hu-mice samples from two independent studies infected with two different viral strains [HIV-1NL4-3 (n = 22) and HIV-1ADA (n = 7)] were analyzed. In the sequencing analysis, 93.15% of the reads passed the standard quality check in the first step. During the second, more stringent quality check, one sample (4348) was excluded. We employed analysis thresholds of 2, 5, and 20% in NGS to identify minor variants with high sensitivity, as validated in prior multi-laboratory comparative studies21,22.
We sequenced full-length gag, pol, and env regions of viral RNA isolated from either HIV-1NL4-3 or HIV-1ADA-infected humanized mice plasma in two independent studies from four different treatment groups (untreated LA ART, CRISPR, and dual ART-CRISPR therapies where the rebounded virus was recovered). These were used to evaluate treatment effects on the emergence of specific viral molecular signatures. Multiple alignment analysis of HIV-1NL4-3-infected endpoint gag, pol, and env sequences demonstrated low-frequency mutations across untreated virus-infected samples, validating the existence of host-selection pressure in infected hu-mice (Fig. 2A, B and Supplementary Figs. 1,2).
A Highlighter plot of all infected and treatment groups (Untreated, CRISPR-, LA ART-, and Dual-treated with viral rebound) endpoint sequences in the HIV-1NL4_3 env region. Sequences were aligned to the most conserved sample from the HIV-1-infected untreated group [master sequence(m)]. Mutations were color-coded. B Highlighter plot of all samples belonging to all four treatment groups, endpoint sequences in the HIV-1NL4_3 env region with silent (green) and non-silent (red) mutations. C Frequency of mutations detected in gag, pol, and env regions derived from study one (HIV-1NL4_3) endpoint untreated (n = 3), CRISPR-treated (n = 5), LA ART-treated (n = 7–9), and dual-treated and viral rebound samples (n = 5). Statistical comparisons between the different treatment groups were analyzed using one-way ANOVA. Data represent the mean ± standard error for 4 different groups with independent biological replicates.
While the increased frequency of mutations, including nucleotide substitutions and indels, was observed in all treated viruses in the gag, pol, and env regions (Fig. 2A, B, Supplementary Figs. 1,2), the env region possessed a greater mutation frequency. It showed higher evolutionary divergence than the other structural viral genes in the same treatment group. HIV-1ADA viruses showed higher mutation frequency in env than in the gag and pol regions (Supplementary Figs. 3,5). No statistical significance was observed using one-way ANOVA between the treatment groups from HIV-1NL4-3-infected mice (Fig. 2C). A trend suggesting an increased mutation rate was identified as non-significant in the HIV-1ADA-infected mice group using the non-parametric Kruskal-Wallis’ test (Supplementary Fig. 6), suggesting the existence of viral strain specificity. These results supported viral genetic dynamics under parallel treatment conditions23.
Correlation Analyses
We performed Spearman’s rank correlation analyses on HIV-1NL4-3 (Table 1) and HIV-1ADA (Supplementary Table 1) infected viral samples to investigate whether the endpoint variant frequency detected was associated with the endpoint plasma viral load. The HIV-1NL4-3 nucleotide endpoint mutation frequencies in the pol region had an inverse correlation with the plasma viral load (r value = - 0.5879, p-value = 0.008) (Table 1). We did not observe any significant correlation between endpoint mutation frequency and plasma viral load in the gag and env regions (Table 1). The r values for gag and env regions are −0.207 and 0.166, and the p-values for the same areas are 0.382 and 0.485, respectively. We did not observe any significant correlations in HIV-1ADA-infected samples (Supplementary Table 1) with the noted sample size.
Longitudinal HIV-1 variant analyses
With the known notable diversity reported in the clinical samples in the HIV-1env region and as observed in our endpoint env sequences, we conducted a longitudinal sequence analysis of the samples displaying high levels of nucleotide variation to eliminate pre-existing variants before LA ART treatment. We mapped the env nucleotide variants identified from five animals (3139, 3171, 3182, 4356, 4358) before ART treatment, following ART and before CRISPR, and at the study endpoint (Fig. 3A). We examined potential variant dynamic changes (Fig. 3B, C). The higher mutation frequencies observed in the env region sequences from animals treated with LA ART or CRISPR at the endpoint relative to controls indicated that some mutations were either generated during or after the treatments (Fig. 3B, C, Supplementary Fig. 7). Nonetheless, the mutations observed over time reflected the CRISPR and LA ART influence on viral evolution, and this is the first report from a multimodal treatment paradigm in an HIV-disease-specific animal model.
A Mapping of nucleotide variants detected in the env region sequences from the five selected HIV-1NL4-3 animal groups (identified by the mouse ID on the right axis) at pre-ART (PRE), post-ART/pre-CRISPR (MID), and endpoints (END) (shown on the left axis). Individual nucleotide substitutions were depicted with distinct colors per the key below the graph. B The frequency of env variants at pre-ART (n = 5), post-ART/pre-CRISPR (n = 3), and endpoint (n = 5) was compared and analyzed using one-way ANOVA. Individual animals were represented with unique color-coded dots and lines (3139, 3171, 3182, 4356, and 4358). A significant increase in mutation frequency was observed (p = 0.019). C Donut charts illustrate the proportion of env nucleotide variants from individual mice at the specified time points, with pre-ART, post-ART/pre-CRISPR, and endpoint phases denoted by blue, red, and orange, respectively.
Drug resistance-associated mutations (DRAMs) detection by SS
We performed DRAMS screenings to detect and re-align mixed-up samples for integrative studies on our data obtained using the Stanford drug-resistance database and validated using the International AIDS Society (IAS-USA) mutations list 2022, and identified more drug-associated mutations in the HIV-1NL4-3 and HIV-1ADA samples (Table 2). One accessory integrase resistance mutation, V151I, was identified in all plasma samples from HIV-1NL4-3-infected hu-mice. This affirmed the susceptibility to the drug(s) used for viral suppression. A primary NRTI-associated mutation, M41L, was identified in all the samples from HIV-1ADA-infected hu-mice samples. M41L has been reported as a significant RT resistance mutation in clinical studies of infected patients treated with either abacavir, azidothymidine, or tenofovir disoproxil fumarate24. We also identified one protease inhibitor (PI)-specific accessory mutation, A71I, in one sample belonging to the dual-treatment rebound group, which is shown to be associated with other PI-resistant mutations24.
DRAMs detected using Next Generation Sequencing (NGS)
We further performed MiSeq NGS to examine the presence of low-frequency drug-resistant sequence variants in the pol subregions, protease, reverse transcriptase, and integrase. Twenty-five plasma viral samples were tested from HIV-1NL4-3 or HIV-1ADA-infected hu-mice belonging to four treatment groups and derived from samples before ART and at the study’s conclusion (Fig. 4A). One sample failed to pass the stringent quality check and was excluded from the analysis. Using the Stanford NGS drug-resistance database algorithm, additional DRAMs were detected in RT, PR, and regions (Table 3). NNRTI primary mutation E138K and accessory mutation H211Y were observed at low frequency in our NGS analysis at a 2% threshold in sample 3169 (an LA ART-treated sample). NNRTI accessory mutation K101E was observed at NGS 5% and 2% threshold in one sample (3181) belonging to the LA ART-treated group. NRTI accessory mutation V75I was detected at NGS 5% and 2% threshold in another LA ART-treated sample (3182). One INSTI primary mutation, R263K, was detected at NGS 2% in a dual-treated and viral rebound sample (4375). Another INSTI accessory mutation, V151I, was detected at NGS 20%, 5%, and 2% thresholds in all the samples from all four groups. One PI-associated primary mutation I54T was observed at NGS 5 and a 2% threshold in one LA ART-treated sample (3170). Another PI accessory mutation, F53L, was detected only at the NGS 2% threshold in an LA ART-treated plasma sample (3182). Interestingly, all the DRAMs identified in our highly sensitive low threshold NGS analysis either belonged to the LA ART alone or LA ART and CRISPR-treated viral rebound groups (Fig. 4B). The predicted major antiretroviral drug susceptibility profiles were further analyzed for all samples from HIV-1NL4-3 and HIV-1ADA-infected mice using the ART resistance algorithm25 and presented in Fig. 4B. We didn’t detect any drug-resistant-associated mutations in the HIV-1-infected untreated or CRISPR-alone hu-mice samples, so they were not given in the significant ART-susceptibility analysis diagram. Overall, 41.6% (5/13) of HIV-1NL4-3-infected hu-mice and 100% (4/4) of HIV-1ADA-infected hu-mice had at least one mutation (Fig. 4B). A total of 7.7% (1/13), 15.4% (2/13), 15.4% (2/13), and 7.7% (1/13) HIV-1NL4-3 infected hu-mice were found resistant to nucleoside reverse transcriptase inhibitors (NRTIs) with low resistance, NNRTIs with medium to high resistance, PIs with low resistance, and integrase strand transfer inhibitors (INSTIs) with medium resistance data respectively. All HIV-1ADA-infected humanized mice exhibited low resistance to NRTIs but no resistance to any other ART category. Interestingly, we identified new mutations with unclear function in all HIV-1NL4-3-infected humanized mice and 25% (1/4) of the HIV-1ADA-infected hu-mice. These mutations have been shown to increase the replication fitness of viruses when present along with other INSTI/PI-resistance mutations24, but further investigation is needed.
A Next-generation sequencing (NGS) was used to screen the low-frequency HIV mutants (less than 15% detection threshold) in the protease (PR), reverse transcriptase (RT), and integrase (IN) regions within pol amplicons. A total of 25 humanized mouse samples belonging to either HIV-1NL4-3 or HIV-1ADA-infected differential treatment groups from pre-ART and endpoint were subjected to the NGS analysis. B A heap map shows the predicted major antiretroviral drug susceptibility profiles based on the Stanford drug resistance database from HIV-1NL4-3 and HIV-1ADA-infected and differentially treated samples. (The colors used to indicate the results are red, high-level resistance; orange, medium-level resistance; pink, low-level resistance; blue, sensitive; dark purple, unclear functionality.) The four ARVs used for this study were highlighted with a red arrow.
Comparison of DRAMs between SS and NGS detection systems
The rate of detecting HIV-1 mutations using single-genome sequencing, with that of next-generation sequencing (NGS) at 20%, 5%, and 2% thresholds, was presented in Fig. 5A. We first compared the mutation rate of single-genome sequencing with that of NGS at a 20% threshold. This reflected the variant detection limit of Sanger sequencing (SS), which is known to be ~20%26. Prior clinical studies reported discrepancies in the detection of DRAMs using NGS at a 20% threshold compared to SS results for various HIV-1 genomic subregions27,28,29. Here, in our current humanized mice study, we observed a high concordance in the detection of DRAMs between NGS and SS at a 20% threshold in the reverse transcriptase, protease, and integrase region sequences. Compared to Sanger Sequencing and NGS at 20% thresholds, NGS at 2% and 5% allowed better detection of primary and accessory DRAMs in our humanized mice plasma samples (Fig. 5B). A total of 7 DRAMs, including four nucleoside reverse transcriptase/nonnucleoside reverse transcriptase (NRTI/NNRTI)-, 2 PI-, and one integrase-associated DRAM, were detected using NGS at 5 and 2% analysis.
A Bubble chart representing the frequency of predicted DRAMs to NRTIs/NNRTIs, PIs, and INSTIs (using the Stanford drug resistance database standard platform) detected in HIV-infected humanized mouse samples. The size of the bubbles indicates the count of mice with each mutation, identified using Sanger sequencing (SS) and next-generation sequencing (NGS) at varying detection thresholds (2%, 5%, 20%). B The DRAMs and their respective proportion detected using SS and three NGS thresholds (2%, 5%, 20%) were presented using the grid graph, with color coding specific to each amino acid mutation, and additional mutations detected were highlighted with dotted squares. Significant mutations are shown with asterisks—SS, Sanger sequencing, NGS, and Next-generation sequencing.
Unique Mutations detected using NGS
By employing NGS as described in Fig. 4A, we not only detected lower-frequency DRAMs across three PR, RT, and IN HIV-1-regions of the pol gene amplicons from our samples but also identified new and unique mutations that have never been reported in the drug resistance database (Fig. 6A, B). Upon comparing the distribution of the new mutations across different treatment groups, we observed no significant clustering of the mutations within any treatment group. Notably, these mutations were present in all treated groups in HIV-1NL4-3 and HIV-1ADA-infected animals (Supplementary Figs. 8,9). However, we found a region-specific pattern in the distribution of new mutation frequencies, with the RT region exhibiting the highest, followed by integrase (IN), and then the protease (PR) region, which displayed the lowest frequency (Fig. 6A). The mutations in RT regions of the three treatment groups are shown in Fig. 6C–E with color-coded amino acid changes and their respective frequencies. The mutation frequency is the proportion of mutation positions relative to the consensus within each sample group. Overall, we found that the LA ART-treated group harbored more new mutations than the CRISPR-treated and dual-treated groups (Fig. 6C–E), and each treatment group displayed a unique mutation signature. In the LA ART-treated group, greater than 50% of substitutions were at positions I375, V381, and E523. In the CRISPR group, substitutions more significant than 50% were observed at position R35, while in the dual-treated viral rebound group, no single mutation reached a frequency exceeding 50%. Interestingly, three new common amino acid substitutions (RT codon positions 29, 151, and 316) emerged in both LA ART (Fig. 6C) and the dual-treated group (Fig. 6E). In contrast, one new amino acid substitution (RT codon position 460) was observed in both the CRISPR-treated (Fig. 6D) and dual-treated rebound groups. Notably, unique mutations E29G and Q151R were exclusively observed in the LA ART and dual-treated groups.
A, B Distribution of new and unique mutations in the protease (PR), reverse transcriptase (RT), and integrase (IN) regions of rebound viruses belonging to the different treatment groups infected with either HIV-1 strains NL4_3 or ADA. Violin plots depict the frequency of new mutations in each region relative to the total mutations identified in the individual sample. The frequency of new mutations identified in PR, RT, and IN regions from (n = 16) samples (HIV-1NL4_3 group) and (n = 4) samples (HIV-1ADA group) was analyzed using Type III Analysis of Variance Table with Satterthwaite’s method (p values shown in the figure). Comparison between the two regions was performed using Student’s t-test (two-tailed, paired) (C–E). The pattern of polymorphism and mutations associated with new mutations identified in the RT region of HIV-1NL4_3 treatment groups. Bar plots represent the frequency of new mutations in the RT region across different treatment groups: LA ART, CRISPR, and the dual-treated rebound group. Each bar represents the frequency of a specific amino acid mutation at a consensus position, with the height of the bar corresponding to the percentage frequency, and those shared between groups are indicated by consistent color coding. The alphabets represent single-letter amino acid codes.
CRISPR-associated and CD4 binding loop mutation analysis
Previous studies have reported CRISPR-mediated indels and mutations30,31. We detected no CRISPR-mediated indels or nucleotide substitutions at or near our gag gRNA targeting region cleavage site (Supplementary Fig. 10A). Interestingly, non-synonymous mutations were observed within the CD4 binding loop region in (2/5) dual-treated samples, resulting in amino acid changes from glycine to arginine (Supplementary Fig. 10B).
First, we performed Sanger sequencing to analyze viral sequences at the endpoint from HIV-1-infected and untreated and treated with either LASERART alone, CRISPR alone, or dual therapy-treated rebound animals, as presented in Figs. 1, 2. This analysis revealed a limited number of mutations. Second, to determine whether these mutations were pre-existing or generated during the treatment, we extended our analysis to already stored frozen plasma pre- and post-ART samples, highlighted in Fig. 3. Surprisingly, Sanger sequencing did not identify significant drug-resistance mutations in the rebound animals despite the administration of a combination of four long-acting antiretroviral drugs (Dolutegravir, Lamivudine, Abacavir, and Rilpivirine) in our study; this could be attributed to the sensitivity of detection of Sanger sequencing in identifying DRMs at 15-20%. Third, this unexpected result led us to employ a higher-resolution sequencing method, Illumina MiSeq NGS, to investigate if there are undetected DRMs in the single- and/or double-treated rebound animal samples. Using the NGS Illumina MiSeq platform, we detected drug resistance and new mutations in the viral sequences, as presented in Figs. 4–6. Fig. 1 integrates data from Figs. 1,2 and Table 1 (Sanger sequencing results) on the bottom left and Figs. 3–6 and Tables 2, 3 (Illumina MiSeq NGS results) on the bottom right.





