Basic characteristics and subtype distribution
A total of 1253 middle-aged and older adults over 50 years of age who had not started antiretroviral therapy were identified, comprising people newly reported in the city in 2017–2020 and people reported before 2017; of these, 1249 were finally included. HIV-1 pol region sequences were successfully amplified from 898 cases (71.90%). Sequences of the same genotype clustered together in the phylogenetic tree (Fig. 1), and seven subtypes were identified.
The study population was mainly aged 60–69 years (359, 40.00%), male (673, 74.90%), Han Chinese (895, 99.70%), married (474, 52.80%), and educated to elementary school level (506, 56.30%); heterosexual transmission was the main route of infection (880, 98.00%), and the most common domiciles were FS (343, 38.10%) and RX (178, 19.80%) counties. The predominant subtypes were CRF01_AE (331, 36.86%), CRF07_BC (368, 40.98%), and CRF08_BC (173, 19.27%), followed by CRF85_BC (21, 2.34%), CRF57_BC (3, 0.33%), CRF64_BC (1, 0.11%), and CRF55_01B (1, 0.11%). The distribution of HIV-1 subtypes differed significantly (P < 0.05) by age and domicile but not by the other characteristics. Among CRF01_AE cases, the 60–69 age group accounted for the largest share (48.90%), whereas the 50–59 age group predominated for the other subtypes. Several subtypes were found in all districts and counties, with the largest proportion in FS County. For details, see Table 1 and Fig. 2.
Maximum likelihood phylogenetic tree of 898 HIV-1 pol sequences from a city in southern Sichuan. A total of 331 sequences clustered with the CRF01_AE reference sequence (green), 368 with the CRF07_BC reference sequence (pink), and 173 with the CRF08_BC reference sequence (purple); 21 CRF85_BC (yellow), 3 CRF57_BC (light blue), 1 CRF64_BC (dark blue), and 1 CRF55_01B (dark brown) sequences were also identified. Reference sequences were obtained from the Los Alamos HIV sequence database. Several seemingly ambiguous sequences at the upper edge of the CRF01_AE group were typed and verified with the HIV BLAST and RIP functions of the HIV database; all were confirmed as CRF01_AE.
Distribution of HIV-1 genotypes in different areas of a city in southern Sichuan. The six maps (A–F) show the overall distribution and the number of each genetic subtype in each year from 2016 to 2020, respectively. Bar colors represent subtypes, and bar heights represent the number of cases of each subtype (maximum 76); the darker the background color of a region, the higher the number of infected people in that region.
Characteristics of HIV-1 molecular transmission network
To construct the molecular transmission network, the effect of the genetic distance threshold on both the resolution of molecular clusters and the network entry rate was considered, and a genetic distance threshold of 1.1% was selected as optimal (Fig. 3A-B). A total of 601 sequences entered the molecular transmission network, an overall entry rate of 66.90% (601/898). The network consisted of 95 transmission clusters (39 CRF01_AE, 40 CRF07_BC, 14 CRF08_BC, and two others), 601 nodes, and 3144 edges. There were 48 large molecular clusters (≥ 3 nodes) (16 CRF01_AE, 22 CRF07_BC, 9 CRF08_BC), each containing 3 to 63 individuals; node degree ranged from 1 to 57, with a median of 6. The largest molecular cluster was CRF01_AE cluster 1, which comprised 63 sequences (50 males and 13 females), all from individuals infected through heterosexual contact and mainly concentrated in RX (80.90%), with a degree range of 2–57 and a median of 43. Individuals entering the molecular network were mainly 60–69 years old, male, Han Chinese, married, infected through heterosexual transmission, educated to elementary school level, and domiciled in FS County. Large molecular clusters were concentrated primarily in RX and FS, followed by GJ. Entry rates differed significantly between genotypes, with CRF01_AE having the highest entry rate (Table 1).
HIV-1 molecular transmission networks. (A) Network by HIV-1 subtype, gender, and route of infection: node colors represent subtypes, node shapes represent genders, border colors represent routes of infection, and larger nodes represent higher degree values (range 1–57). (B) Network by domicile, marital status, and age: node colors represent regions, node shapes represent marital status, and larger nodes represent older age (range 50–91). Divorced: divorced or widowed.
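The network construction described above can be illustrated with a minimal sketch: sequence pairs whose genetic distance falls within the 1.1% threshold are linked, connected components are treated as clusters, and the entry rate is the share of sequences with at least one link. The sketch assumes pairwise distances have already been computed; the distance model and software used in the study are not stated here. The function names, the toy distances, and the use of "less than or equal to" at the threshold are all illustrative assumptions, not the study's method.

```python
from collections import defaultdict


def build_network(distances, threshold=0.011):
    """Link every sequence pair whose distance is <= threshold (assumed rule)."""
    adjacency = defaultdict(set)
    for (a, b), d in distances.items():
        if d <= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def find_clusters(adjacency):
    """Return the connected components of the network as a list of sets."""
    seen, clusters = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Toy pairwise distances (hypothetical values, not study data).
toy_distances = {
    ("s1", "s2"): 0.004,
    ("s2", "s3"): 0.010,
    ("s1", "s3"): 0.015,  # above threshold, so no direct edge
    ("s4", "s5"): 0.009,
    ("s5", "s6"): 0.020,  # above threshold, so s6 stays outside the network
}
all_sequences = {"s1", "s2", "s3", "s4", "s5", "s6"}

network = build_network(toy_distances)
clusters = find_clusters(network)
entry_rate = len(network) / len(all_sequences)  # sequences with >= 1 link
large_clusters = [c for c in clusters if len(c) >= 3]
```

In this toy input, s1 and s3 end up in the same cluster through s2 even though their direct distance exceeds the threshold, which is why cluster size and node degree are reported separately.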
Analysis of factors influencing the molecular transmission network of HIV-1
Univariate and multivariate analyses were conducted with entry into the network (yes or no) as the dependent variable and age, gender, ethnicity, marital status, and route of infection as the independent variables. Based on the univariate results and professional judgment, factors with P < 0.10 were included as independent variables in the multivariate analysis. Multivariate logistic regression showed that older individuals were more likely to be in the network (60–69 years, OR = 1.552, 95% CI: 1.103–2.184; ≥70 years, OR = 2.259, 95% CI: 1.466–3.481), and that CRF01_AE was more likely to be in the network than CRF07_BC and other subtypes. For details, see Table 2.
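As a reading aid for the odds ratios above, the following sketch shows how an OR and its 95% CI relate to a logistic regression coefficient and its standard error. It assumes a Wald-type interval (coefficient ± 1.96 standard errors on the log scale), which is a common default but is not confirmed by the text; both function names are hypothetical.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic coefficient and its SE into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coefficient_from_reported(or_value, ci_low, ci_high, z=1.96):
    """Back out beta and SE from a reported OR and CI (assumes a Wald interval)."""
    beta = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Reported estimate for the 60-69 age group: OR = 1.552, 95% CI 1.103-2.184.
beta, se = coefficient_from_reported(1.552, 1.103, 2.184)
or_value, low, high = odds_ratio_with_ci(beta, se)
```

Under the Wald assumption the interval is symmetric around the coefficient on the log scale, so the round trip should return bounds close to the reported ones; a large mismatch would suggest a different interval method or a transcription error.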
Analysis of factors affecting individuals with high connectivity
Of the 601 subjects who entered the molecular network, 302 had a degree ≥ 6 and were defined as highly connected individuals. All highly connected individuals were Han Chinese and had been infected through heterosexual transmission. Univariate analysis showed statistically significant differences in the distribution of highly connected individuals by age, sex, marital status, education, domicile, and genotype. In further multivariate analysis, older age (compared with 50–59 years: 60–69 years, OR = 1.595, 95% CI: 1.026–2.479; ≥70 years, OR = 2.189, 95% CI: 1.295–3.699), household registration in RX or GJ county (compared with FS County: OR = 4.654, 95% CI: 2.776–7.803 and OR = 6.847, 95% CI: 3.464–13.533, respectively), and the CRF08_BC subtype (compared with CRF01_AE: OR = 2.031, 95% CI: 1.225–3.367) were each associated with a higher likelihood of being highly connected, i.e., of being linked to six or more other individuals in the network. Individuals with a high school education were less likely to be highly connected than illiterate individuals (OR = 0.262, 95% CI: 0.086–0.799), as were those with the CRF07_BC subtype compared with CRF01_AE (OR = 0.415, 95% CI: 0.267–0.647). See Fig. 4 for details.
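The degree-based definition of highly connected individuals can be sketched as follows. The cutoff of 6 used in the study coincides with the reported median node degree; deriving the cutoff from the median, as done here, is an assumption made for illustration. The edge list is toy data, the function names are hypothetical, and each undirected edge is assumed to be listed once.

```python
from statistics import median


def node_degrees(edges):
    """Count links per node from a list of unique undirected edges."""
    degrees = {}
    for a, b in edges:
        degrees[a] = degrees.get(a, 0) + 1
        degrees[b] = degrees.get(b, 0) + 1
    return degrees


def highly_connected(degrees, cutoff):
    """Return the nodes whose degree is at or above the cutoff."""
    return {node for node, d in degrees.items() if d >= cutoff}


# Toy edge list (hypothetical, not study data).
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("e", "f")]

degrees = node_degrees(toy_edges)
cutoff = median(degrees.values())  # assumed rule: cutoff = median degree
hubs = highly_connected(degrees, cutoff)
```

The resulting yes/no flag per individual is the kind of binary outcome that a logistic regression such as the one reported above would take as its dependent variable.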