Single-cell analyses identify monocyte gene expression profiles that influence HIV-1 reservoir size in acutely treated cohorts


May 30, 2025

Study design

Demographic and clinical data (viral load, CD4+ T cell counts, HIV DNA, HIV subtype, Fiebig stage) were available from 163 acutely treated PLWH from the men who have sex with men (MSM) RV254/SEARCH010 cohort (NCT00796146) in Thailand^9,21,62. For discovery analyses, scRNA-seq, immune receptor sequencing (TCR and BCR), and flow cytometry were performed on initially cryopreserved PBMC from 14 selected participants in this cohort. Findings were validated in an additional 38 male participants from the ACTG A5354 study (NCT02859558), a single-arm, open-label study evaluating the impact of ART initiation during AHI, conducted at 30 sites in the Americas, Africa, and Asia⁶³. Samples from all participants in both prospective studies were collected at 48 weeks post-ART initiation. In a subset of participants, samples were also available from week 0 at AHI. Blood from healthy participants without HIV for in vitro experiments was obtained under the WRAIR 2567.05 institutional protocol. All participants in these human studies provided informed consent, and use of samples for research was approved by ethical review boards at the Walter Reed Army Institute of Research, USA; Chulalongkorn University Faculty of Medicine, Thailand; and Advarra, USA.

HIV Reservoir measurements

Total HIV DNA and integrated DNA were measured by quantitative PCR (qPCR)^10,25. Briefly, pellets of PBMC and CD4+ T cell populations were suspended in 15 µl of Proteinase K lysis buffer per approximately 100,000 cells and digested for 18 h at 55 °C. Total HIV DNA was quantified using primers and a probe situated in the 5’-LTR, while the primers and probe used for integrated DNA were situated in Alu and the 5’-LTR. ACH-2 cells (BEI Resources, NIAID, NIH; accession# ARP-349), which carry a single copy of the integrated HIV genome, were used to generate a standard curve for both assays. Each of the three replicates used approximately 100,000 cells (~300,000 total), and the lower limit of detection (LOD) of this assay was 3.3 copies/10⁶ cells. The LOD was calculated based on the number of cells analyzed, followed by normalization to 10⁶ cells. Participants were grouped into detectable or undetectable reservoir based on the presence of total HIV DNA measured independently in both PBMC and CD4+ T cell populations, depending on sample availability for the latter. The presence of integrated HIV DNA, where available, was used as a confirmatory criterion for verifying the categorical groupings. The reservoir phenotype was defined as undetectable when both total and integrated HIV DNA were below the LOD. In contrast, the reservoir was defined as detectable when total HIV DNA was above the LOD.
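The LOD arithmetic and the grouping rule above can be sketched as follows. This is an illustrative sketch, not the authors' code: it assumes the LOD corresponds to one detected copy in the total cell input, and the function names and the "unresolved" label (for a case the text does not describe) are hypothetical.

```python
from typing import Optional


def limit_of_detection(cells_analyzed: int, min_copies: float = 1.0) -> float:
    """LOD in copies per 10^6 cells, assuming `min_copies` must be detected
    in the analyzed cell input (assumption: one copy)."""
    return min_copies / cells_analyzed * 1_000_000


def classify_reservoir(total_dna: float,
                       integrated_dna: Optional[float],
                       lod: float) -> str:
    """Apply the grouping rule from the text.

    total_dna and integrated_dna are in copies per 10^6 cells;
    integrated_dna is None when that measurement is unavailable.
    """
    if total_dna > lod:
        return "detectable"
    # Integrated DNA is confirmatory only where available.
    if integrated_dna is None or integrated_dna < lod:
        return "undetectable"
    # Not covered by the stated definitions; flagged rather than guessed.
    return "unresolved"


lod = limit_of_detection(300_000)  # three replicates of ~100,000 cells
print(round(lod, 1))
print(classify_reservoir(25.0, None, lod))
print(classify_reservoir(0.0, 0.0, lod))
```

With roughly 300,000 cells analyzed, one copy normalizes to about 3.3 copies per 10⁶ cells, which matches the reported LOD.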

IPDA® assay

Accelevir Diagnostics performed HIV-1 intact proviral DNA assays (IPDA®) to discriminate between, and separately quantify, the frequencies of intact and defective persistent proviruses. The design and performance of this assay have been described previously^43,64. Briefly, cryopreserved PBMC were thawed and CD4+ T cells were isolated and assessed for cell count, viability, and purity by flow cytometry. RNA-free genomic DNA was then isolated from the recovered CD4+ T cells, with concentration and quality determined by fluorometry and ultraviolet-visible (UV/VIS) spectrophotometry, respectively. The IPDA® was performed, and data were reported as proviral frequencies per million input CD4+ T cells. These procedures were performed by blinded operators using standard operating procedures.

Single-cell RNA library generation and sequencing

PBMC from the 14 RV254 participants on ART for 48 weeks were washed, resuspended in PBS plus 0.5% FBS, and simultaneously processed for scRNA-seq and flow cytometry. A total of 50,000 cells (at 1000 cells/µl) from each participant was set aside for scRNA-seq library construction, and the remaining cells were used for flow staining as described later. The diluted PBMC suspensions were prepared for scRNA-seq using the Chromium Next GEM 5’ Single Cell V(D)J Reagent Kit v1.1 (cat# 1000165) and the Chromium Controller (both 10x Genomics) per manufacturer’s instructions. Briefly, targeting a recovery of 8000 cells/participant, samples were loaded into separate wells of Chromium chips. Amplified cDNA was used to make gene expression (GEX), TCR, and BCR libraries. The GEX library construction used a 14- or 16-cycle Sample Index PCR program, based on amplified cDNA concentrations. PBMC from the 38 A5354 participants were individually stained with TotalSeq-C anti-human hashtag antibodies (BioLegend), batched, and processed for GEX and hashtag oligo (HTO) libraries to improve cost-effectiveness²³. Cells from each batch were loaded into 4 different wells of Chromium chips for a targeted recovery of 16,000 cells/well.

Libraries from both studies were then assessed for quality and concentrations using the DNA High Sensitivity D5000 ScreenTape Assay (cat# 5067-5592) with the TapeStation (both Agilent), and subsequently pooled and quantitated with a MiSeq Nano Reagent Kit v2 (300 cycles) (Illumina; cat# MS-103-1001) sequencing run. Final libraries were sequenced using the NovaSeq 6000 S4 Reagent Kit (300 cycles) (cat# 20012866) on a NovaSeq 6000 instrument (both Illumina).

Multiparameter flow cytometry

PBMC from 14 participants were stained with Aqua Live/Dead stain (cat# L34957), washed, and blocked using normal mouse IgG (cat# 10400C) (both ThermoFisher). The cells from each participant were then split four ways to run four different polychromatic flow panels using fluorescently conjugated monoclonal antibodies against several surface markers to define B, T, myeloid, and NK cell subsets (Supplementary data file 2, 3). For the T cell panel, cells were pre-stained with an MR-1 tetramer⁶⁵ prior to staining for additional surface markers. Following surface marker staining, cells were washed, then fixed and permeabilized with the eBioscience FoxP3 Fixation/Permeabilization Set (ThermoFisher; cat# 00-5521-00). Cells were then washed, stained intracellularly, washed again, and analyzed using a FACSymphony A5 (BD Biosciences). Data were analyzed with FlowJo v.9.9.6 or higher (Becton Dickinson).

Virus production

HEK293T cells (ATCC; accession# CRL-3216) were transfected with 5 µg NL4-3 or CH058, or 5 µg pMorpheus plus 1 µg of designated envelope constructs, using the TransIT-LT1 (Mirus; cat# MIR2306) transfection reagent per manufacturer’s protocol. Infectious molecular clones of HIV CH058 and CH077 were kindly provided by Beatrice H. Hahn^56,66, while pMorpheus was provided by Viviana Simon⁵⁴. Media were changed 24 hrs post-transfection and virus stocks were collected 24 hrs later. PBMC were infected with freshly produced virus. The HIV YU-2 infectious molecular clone stock was obtained from the HIV Reagent Program (BEI Resources; accession# ARP-1350).

In vitro functional characterization

Effects of IL1B on cell population frequencies and HIV infection

PBMC were isolated from the blood of healthy participants by density centrifugation on a Ficoll-Paque Plus (GE Healthcare; cat# GE17144002) gradient and stimulated by anti-CD3/CD28 Dynabeads (Gibco; cat# 11132D) at a 1:1 ratio with the estimated CD4+ T cell population in PBMC (25% in total PBMC) in Complete Cell Culture Medium (RPMI 1640 Medium with GlutaMAX and HEPES (cat# 72400047), 10% fetal bovine serum (cat# A3840001), and penicillin/streptomycin (cat# 15140122)) (all Gibco) supplemented with 40 U/ml IL2 (cat# 130-097-743) and with or without recombinant IL1B (cat# 130-095-374) (both Miltenyi Biotec) at four different concentrations (0.01-10 ng/ml, at 10-fold intervals) for 4 days. Treated PBMC were either immediately analyzed using a FACSymphony A5 (BD BioSciences) to assess frequencies of T cell subpopulations, or infected with an R5 tropic molecular clone, YU-2, at a concentration of 1 µg of p24 per million cells and cultured for a further 2 days before assessing the relative frequencies of infected cells by flow cytometry (BD BioSciences).
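The dosing arithmetic in this protocol (a 1:1 bead-to-CD4+ T cell ratio with CD4+ T cells estimated at 25% of PBMC, and four IL1B concentrations at 10-fold intervals) can be written out as a small sketch. The function names are hypothetical and the code only restates the numbers given above.

```python
def beads_needed(n_pbmc: int,
                 cd4_fraction: float = 0.25,
                 beads_per_cell: float = 1.0) -> int:
    """Number of anti-CD3/CD28 beads for a 1:1 ratio with the estimated
    CD4+ T cell count (CD4+ T cells assumed to be 25% of PBMC)."""
    return round(n_pbmc * cd4_fraction * beads_per_cell)


def tenfold_series(lowest: float, steps: int) -> list[float]:
    """Concentrations starting at `lowest`, increasing 10-fold per step."""
    return [lowest * 10 ** i for i in range(steps)]


print(beads_needed(1_000_000))   # beads for one million PBMC
print(tenfold_series(0.01, 4))   # IL1B doses in ng/ml
```

Four steps starting at 0.01 ng/ml span the stated 0.01-10 ng/ml range.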

Effects of IL1B on HIV infectivity

PBMC isolated from healthy participants were treated with 10 ng/ml IL1B at different times relative to HIV infection: 2 days prior to infection, simultaneously with infection, or 2 days post-infection. Briefly, PBMC were isolated by Ficoll gradient centrifugation and cells were stimulated with PHA (ThermoFisher; cat# R30852801) in Complete Cell Culture Medium with 100 U/ml IL2 for 3 days. On day 1 post-isolation, the required cells were set aside for IL1B treatment for 2 days prior to infection. On day 3 post-isolation, PBMC were infected by spinoculation (1200 × g, 2 hrs, 26 °C) with 150 ng of freshly produced NL4-3, or 500 ng of CH058, per million cells. After spinoculation, cells were washed 5 times with 1x PBS (Gibco; cat# 14190-094) and resuspended in fresh medium containing IL2 and IL1B per the schedule. Cells were cultured for an additional 9 days, during which 400 µl of supernatant was removed every second day for determination of infectious virus yields. To determine infectious virus yield, 10,000 TZM-bl reporter cells (BEI Resources; accession# ARP-8129) per well were seeded in 96-well plates. The next day, cells were infected in triplicate with the collected supernatants. Three days post-infection, the TZM-bl cells were lysed and β-galactosidase reporter gene expression was assessed with the Gal-Screen Reporter Assay System (Invitrogen; cat# T1028) per manufacturer’s protocol using an Orion microplate luminometer (Berthold).

Flow cytometry staining of pMorpheus infection

PBMC from healthy participants were isolated using separation medium, stimulated for 3 days with PHA (2 μg/ml), and cultured in RPMI 1640 medium with 10% fetal calf serum and 10 ng/ml IL2 prior to infection. Cells were treated with IL1B either 2 days before pMorpheus infection, at the time of infection, or 2 days after infection. Five days post-infection, cells were collected and surface-stained against V5 (V5 Alexa Fluor 647, ThermoFisher; cat# 451098). After 30 min of incubation, cells were washed 3x with PBS, fixed with 4% PFA (ChemCruz; cat# sc-281692) for 30 min, and analyzed by flow cytometry (FACSCanto; BD Biosciences).

Western blotting

PBMC were isolated as described and cultured for 3 days with IL2 and PHA. On day 3, cells were treated with IL1B (10 and 50 ng/ml) and TNFα (10 ng/ml) (Miltenyi Biotec; cat# 130-094-019) for 1, 1.5, and 2 hrs. To generate cell lysates, cells were washed in PBS and subsequently lysed in Western blot lysis buffer (150 mM NaCl (Sigma-Aldrich; cat# 1.06404.1000), 50 mM HEPES (Sigma-Aldrich; cat# 7365-45-9), 5 mM EDTA (Sigma-Aldrich; cat# E9884), 0.1% NP40 (USBio; cat# 19628), 500 μM Na3VO4 (Sigma-Aldrich; cat# 450243), 500 μM NaF (Sigma-Aldrich; cat# 201154), pH 7.5)⁶⁷. After 5 min of incubation on ice, samples were centrifuged (4 °C, 20 min, 12,000 × g) to remove cell debris. Whole cell lysates were mixed with 4× Protein Sample Loading Buffer (LI-COR, at a final dilution of 1×; cat# 928-40004) supplemented with 10% β-mercaptoethanol (Sigma-Aldrich; cat# M6250), heated at 95 °C for 5 min, separated on NuPAGE 4–12% Bis-Tris Gels (Invitrogen; cat# 10472322) for 90 min at 100 V, and blotted onto Immobilon-FL PVDF membranes (Merck Millipore; cat# IPFL00010). The transfer was performed at a constant voltage of 100 V for 1 h using a wet/tank transfer system (BioRad; cat# 1703930). Proteins were stained with the following primary antibodies: phospho-IκBα (Cell Signaling; cat# 2859), IκBα (Santa Cruz; cat# sc-1643), and GAPDH (BioLegend; cat# 607902).

NF-κB reporter assay

A549-Dual™ Cells (InvivoGen; cat# s549d-nfis) were seeded at a density of 20,000 cells per well in 96-well plates. On the following day, cells were treated with IL1B (10 ng/ml), TNFα (10 ng/ml), or LPS (1000 U/ml) (Invitrogen; cat# 15536286) and infected or not with VSV-G pseudotyped HIV-1 NL4-3 or CH058 for 24 hrs, after which the QUANTI-Blue assay was performed as described by the manufacturer (InvivoGen; cat# CRL-3216).

Viability assay

Cells were harvested, washed once with PBS, and stained for 15 min at room temperature in the dark with eBioscience fixable viability dye 780 (ThermoFisher; cat# 65-0865-14). Cells were then washed twice with PBS, fixed in 2% PFA for 30 min at 4 °C, and analyzed by FACS.

Bioinformatics analyses

Sequence data processing

Single-cell gene expression data from PBMC were generated using the 10x Genomics Cell Ranger pipeline (v3.0.0 – 3.1.0) (10x Genomics) per manufacturer’s recommendations and the 10x Genomics human reference library (GRCh38 and Ensembl GTF v93). For the RV254 sequencing run without hashing, the average number of genes per cell was 1236 and the average number of unique molecular identifiers (UMI) was 3288. The mean read depth per cell was approximately 103,000-236,000 reads. The minimum fraction of reads mapped to the genome was 92.95% and sequencing saturation was, on average, greater than 94%. For the hashed A5354 sequencing runs, the average number of genes per cell was 1432 and the average number of UMIs for RNA transcripts was 4192. The mean read depth per cell was approximately 69,000-88,000 reads for the gene expression library and 9000-14,000 reads for the antibody library. The minimum fraction of gene expression reads mapped to the genome was 88.5% and RNA sequencing saturation was, on average, greater than 89%. Downstream analysis of Cell Ranger outputs, including quality filtering, normalization, multi-sample integration, visualization, and differential gene expression (DEG) analysis, was performed using the R package Seurat (v3.1.1 – 4.3.0).

RV254 gene expression processing

Cells with mitochondrial percentages greater than 10% and cells that had <200 or >6000 expressed genes were removed from analyses. 62,925 cells remained after the quality control (QC) process. After log-normalization with a scale factor of 10,000, the top 2000 variable features within each sample were selected. We found integration anchors using dimensions 1:30 and integrated cells from all 14 participants. Shared Nearest Neighbor-based (SNN) clustering was performed using the top 30 principal components (PC) with a resolution of 0.5, and cells in the clusters were visualized by UMAP projection. Cluster marker genes were determined using Seurat FindAllMarkers and cluster identities were manually annotated using differentially expressed genes between the clusters and known lineage cell markers (https://github.com/thomaslab-MHRP/scRNA-seq_annotation_resources/blob/main/GEX_markers_for_cell_annotation.md).
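The per-cell QC thresholds and the log-normalization step can be illustrated in plain Python. The analysis itself was done in Seurat; the sketch below only mirrors the stated cutoffs and the log1p(count / total × scale factor) form of log-normalization, and its function names are hypothetical.

```python
import math


def passes_qc(n_genes: int, pct_mito: float,
              min_genes: int = 200, max_genes: int = 6000,
              max_pct_mito: float = 10.0) -> bool:
    """Keep a cell unless it has <200 or >6000 expressed genes
    or a mitochondrial percentage greater than 10%."""
    return min_genes <= n_genes <= max_genes and pct_mito <= max_pct_mito


def log_normalize(counts: list[float], scale_factor: float = 10_000) -> list[float]:
    """Divide each gene count by the cell's total counts, multiply by the
    scale factor, and apply log1p (natural log of 1 + x)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cell has no counts")
    return [math.log1p(c / total * scale_factor) for c in counts]


print(passes_qc(n_genes=1236, pct_mito=4.2))   # typical cell is kept
print(passes_qc(n_genes=150, pct_mito=4.2))    # too few genes
print(log_normalize([0, 1, 3]))
```

Scaling by total counts before the log makes expression values comparable between cells sequenced to different depths.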

A5354 demultiplexing and gene expression processing

HTO expression matrices were normalized, demultiplexed, and assigned to specific participants as described previously²³. Negative cells and cells with greater than 10% mitochondrial gene expression were removed. Gene expression matrices (containing a total of 21,870 genes) for all 38 participants and for doublet cells were normalized. We performed reference-based integration using two participants from each ADT batch as references. Cells that were identified as doublets via hash demultiplexing, and cells in clusters from an initial round of QC that were enriched for doublets or had high expression of HBB, were removed, and SNN clustering at resolution 0.3 was performed on the remaining 140,172 cells. Clusters were visualized and annotated using lineage markers and differentially expressed genes, following the same process as for RV254. No γδ T cell or monocyte-platelet aggregation clusters were identified, and memory CD4+ T cells comprised one large cluster and one smaller cluster with upregulation of interferon-induced genes, instead of the subsets of CD4+ T cells observed in RV254.

Differential gene expression

Categorical differential gene expression analyses within each cell type subset between the two reservoir groups were performed within Seurat using the two-sided Mann-Whitney U test with Bonferroni correction (n = 19,581). Genes that were not expressed in at least 10% of cells in either group, or that did not have an absolute log fold change > 0.25, were excluded from consideration, as were mitochondrial and ribosomal protein genes. The MAST framework was implemented to examine the correlation of gene expression in different cell subsets with the continuous log-transformed total HIV DNA measurements as the outcome³⁰. Genes with expression frequencies <10% were removed before analyses. Results from each cell subset were corrected for multiple testing using the Bonferroni correction. Genes without an absolute beta coefficient > 0.1 and additional manually curated genes were excluded from consideration. Continuous MAST analyses for a subset of 21 participants with IPDA® data were performed to see if IL1B remained significant using different reservoir measurement parameters. Participant-specific expression values were generated using Seurat’s AverageExpression in CD14+ monocytes within participants on the log-normalized expression data.
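The gene-level filters and the Bonferroni adjustment described above can be sketched as follows. The analysis was run in Seurat; this standalone version only illustrates the logic. The prefix test for mitochondrial and ribosomal protein genes (`MT-`, `RPS`, `RPL`) is an assumed, rough heuristic rather than the authors' stated rule, and the function names are hypothetical.

```python
def bonferroni(p_value: float, n_tests: int = 19_581) -> float:
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


def keep_gene(symbol: str, pct_group1: float, pct_group2: float,
              log_fc: float, min_pct: float = 0.10,
              min_abs_log_fc: float = 0.25) -> bool:
    """Apply the exclusion rules from the text to one gene.

    pct_group1 / pct_group2: fraction of cells expressing the gene in
    each reservoir group; log_fc: log fold change between groups.
    """
    # Assumed naming heuristic for mitochondrial / ribosomal protein genes.
    if symbol.startswith(("MT-", "RPS", "RPL")):
        return False
    # Must be expressed in at least 10% of cells in one of the groups.
    if max(pct_group1, pct_group2) < min_pct:
        return False
    # Must exceed the absolute log fold change threshold.
    return abs(log_fc) > min_abs_log_fc


print(keep_gene("IL1B", 0.30, 0.05, -0.40))
print(keep_gene("MT-CO1", 0.90, 0.90, 1.20))
print(bonferroni(1e-6))
```

With n = 19,581 tests, a raw P value must be below roughly 2.6 × 10⁻⁶ for the adjusted value to fall under 0.05.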

TCR/BCR sequence analyses

TCR/BCR clonotype identification, alignment, and annotation were performed using the 10x Genomics Cell Ranger pipeline (v6.1.2; 10x Genomics) per manufacturer’s recommendations. Clonotype alignment was performed against the Cell Ranger human V(D)J reference library 7.1.0 (GRCh38 and Ensembl GTF v94). The Cell Ranger clonotype assignments were used for both BCR and TCR clonotype visualization and diversity assessments, and analyses were performed using R for IG chains within annotated B cell types (memory B cells, naïve B cells) or TRA/TRB chains within annotated T cell types (CD4+ or CD8+ T_CM, T_EM, and naïve T cells).

Pathway analyses

Further DEG lists characterizing the detectable and undetectable reservoir groups within cell subsets from RV254 were used to perform a multiple gene list analysis in Metascape to acquire the top 20 representative terms of the most significantly enriched pathways⁶⁸. The genes comprising each of these 20 pathways were used as input gene sets for Gene Set Enrichment Analysis (GSEA)⁶⁹ comparing the detectable and undetectable groups, together with an average expression matrix of all genes within each cell subset for each participant, generated from the single-cell data. The GSEA results were filtered by absolute normalized enrichment score (|NES|) ≥ 1.4 and P < 0.001. For WGCNA-based pathway analyses, the CD14+ monocyte cell subset of the RV254 cohort Seurat object was used as input for coexpression analyses implemented in the single-cell R package hdWGCNA^70,71. Metacells and a signed network were constructed within participants using non-default parameters (k = 25, max_shared = 10, and soft power = 9). The top 25 hub genes for each of the resulting modules were used as a feature set for Seurat’s AddModuleScore to generate a score for each module within each cell. The two-sided Mann-Whitney U test was used to compare module scores between cells in the detectable and undetectable reservoir categories. This module scoring and testing method, using the same sets of genes from RV254, was applied to the CD14+ monocyte cell subset in the independent A5354 cohort. The average scaled expression of the 25 hub genes from the M3 module containing IL1B within both cohorts was used as input for the ComplexHeatmap tool⁷². Similarly, gene modules were identified in total memory CD4+ T cells and gene ontology analyses were performed using Enrichr⁷³. The 25 hub genes for the M3 CD14+ monocyte module were used as input in a protein STRING DB pathway analysis⁷⁴. The disconnected nodes were removed, and the resulting network was investigated for degree of connectedness and visualized in Cytoscape⁷⁵.
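The GSEA filtering criterion (absolute NES of at least 1.4 and P < 0.001) can be expressed as a short sketch. The record type, pathway names, and values below are hypothetical placeholders used only to show how the two thresholds combine; they are not results from the study.

```python
from dataclasses import dataclass


@dataclass
class GseaResult:
    pathway: str
    nes: float      # normalized enrichment score (sign gives direction)
    p_value: float


def filter_gsea(results: list[GseaResult],
                min_abs_nes: float = 1.4,
                max_p: float = 0.001) -> list[GseaResult]:
    """Keep pathways enriched in either direction that pass both cutoffs."""
    return [r for r in results
            if abs(r.nes) >= min_abs_nes and r.p_value < max_p]


# Hypothetical example records.
example = [
    GseaResult("pathway_A", 1.8, 0.0002),
    GseaResult("pathway_B", -1.6, 0.0005),
    GseaResult("pathway_C", 1.2, 0.0001),
    GseaResult("pathway_D", -2.0, 0.02),
]
print([r.pathway for r in filter_gsea(example)])
```

Taking the absolute value of the NES retains pathways enriched in either the detectable or the undetectable group.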

Statistical analyses

The associations between 117 phenotypic flow cytometry population frequencies and reservoir size were assessed by univariate linear regression models and corrected for multiple testing using false discovery rate (FDR). Exploratory analyses, including multiple regressions without adjusting for significance, were also performed to evaluate the relationship between the reservoir size as the response variable and two explanatory variables, THBS1/IL1B and each flow cytometry cell population, using machine learning methods in R⁷⁶. Finally, multiple regression models were fitted with two-way interaction terms between THBS1/IL1B and each phenotypic population marker, to test whether the effect of THBS1/IL1B on decreased reservoir size differed depending on the frequencies of individual cell subsets. Interaction plots for THBS1/IL1B were made to illustrate how the relationship between THBS1/IL1B expression and reservoir size changes with different frequencies of combined CD4+ T_CM. The overall fit of the simple regression models of the combined CD4+ T_CM population was evaluated using the coefficient of determination (R²) and the Root Mean Squared Error (RMSE). For multiple linear regression of the CD4+ T_CM cells, the goodness-of-fit was measured using both R² and adjusted R² along with RMSE. The prediction error of the combined CD4+ T_CM cell models was estimated using Leave One Out Cross Validation (LOOCV), and the test RMSE value was reported. Model diagnostics, assessed using both the gvlma() function in R and diagnostic plots (Q-Q plot for normality, residuals vs. fitted values for homoscedasticity, leverage plots for influential observations, variance inflation factors for multicollinearity; not shown), showed that the assumptions of the linear models were reasonable after removing one outlier identified using Cook’s distance. All explanatory variables for all regression analyses were mean-centered, and plots show predicted measurements.
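The fit metrics named above (R², RMSE, and the LOOCV test RMSE) can be illustrated for a one-predictor linear model. The analyses were performed in R; this pure-Python sketch uses made-up numbers and hypothetical function names solely to show how each quantity is computed.

```python
import math


def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(errors: list[float]) -> float:
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def train_metrics(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (R squared, in-sample RMSE) for the fitted line."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mean_y = sum(y) / len(y)
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot, rmse(residuals)


def loocv_rmse(x: list[float], y: list[float]) -> float:
    """Refit with each observation held out and score its prediction."""
    errors = []
    for i in range(len(x)):
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        errors.append(y[i] - (intercept + slope * x[i]))
    return rmse(errors)


# Made-up illustrative data (not study measurements).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.9, 5.2, 6.8, 9.1]
r2, train_rmse = train_metrics(x, y)
print(r2, train_rmse, loocv_rmse(x, y))
```

The LOOCV RMSE is never smaller than the in-sample RMSE for a least-squares fit, which is why it is the more conservative estimate of prediction error.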

All comparisons were two-sided, using appropriate statistical tests for paired or unpaired analysis. Correlations were performed by Spearman’s rank correlation coefficient and monotonic lines showed directionality. A two-sided P value of <0.05 was considered statistically significant for all statistical analyses. Bonferroni or FDR corrections were applied for multiple testing when appropriate. All descriptive and inferential statistical analyses were performed using R 3.4.1 GUI 1.70 build (7375) v3.0 and higher, and GraphPad Prism 8.0 statistical software packages (GraphPad Software, La Jolla CA).
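Spearman’s rank correlation, used for the correlations above, is the Pearson correlation of the ranked data, with tied values given their average rank. The sketch below is a minimal standalone illustration with hypothetical function names and made-up inputs; it does not compute a P value.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(a: list[float], b: list[float]) -> float:
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)  # undefined if either input is constant


def spearman_rho(x: list[float], y: list[float]) -> float:
    return pearson(average_ranks(x), average_ranks(y))


print(average_ranks([10, 20, 20, 30]))
print(spearman_rho([1, 2, 3, 4], [2, 4, 8, 16]))
```

Because only ranks enter the calculation, any strictly increasing relationship gives a coefficient of 1, whether or not it is linear.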