Screening process
In the first stage, we downloaded 1386 sequences from GenBank to establish a genotyping criterion. After excluding 864 sequences that were irrelevant, had missing segments, or showed other anomalies, a final set of 522 sequences was used.
In the second stage, we downloaded 2209 sequences to validate the genotyping criterion established using the first-stage sequences. After excluding 1212 sequences that were irrelevant or showed recombination events, a final set of 997 sequences was used, including 471, 272, and 255 sequences of the S, M, and L segments, respectively (Supplementary Fig. 1; Supplementary Data 1–4).
Genotyping criteria and validation
Phylogenetic analysis was performed using the first-stage sequences. Sequences that clustered together in the evolutionary tree were considered to belong to the same genotype. We classified the three segments of the same viral strain into the same genotype whenever possible. Subsequently, we performed pairwise comparisons among all first-stage sequences to calculate the nucleotide difference rate (%) within and between genotypes. Notably, nucleotide differences in the same segment among viral strains of the same genotype were < 3%, whereas differences between viral strains of different genotypes ranged from 3 to 7%. Therefore, a 3% cut-off value was selected for genotypic classification. Ultimately, a new genotyping criterion for DBV was established by combining the phylogenetic tree analysis with the 3% cut-off value for nucleotide differences (Figs. 1, 2, 3).
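The pairwise comparison described above can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length, that columns containing a gap or an N are skipped, and that genotype labels come from the phylogenetic tree; the function names, the toy sequences, and the gap handling are illustrative assumptions rather than the procedure used in this study.

```python
from itertools import combinations

CUTOFF = 3.0  # percent; the cut-off value reported in the text


def percent_difference(seq_a, seq_b):
    """Percentage of differing positions between two aligned sequences.

    Columns with a gap ('-') or 'N' in either sequence are skipped
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared


def summarise(sequences, genotypes):
    """Split all pairwise differences into intra- and inter-genotype lists."""
    intra, inter = [], []
    for (name_a, seq_a), (name_b, seq_b) in combinations(sequences.items(), 2):
        diff = percent_difference(seq_a, seq_b)
        if genotypes[name_a] == genotypes[name_b]:
            intra.append(diff)
        else:
            inter.append(diff)
    return intra, inter


def mutate(seq, positions):
    """Toy helper: substitute the base at each given position."""
    bases = list(seq)
    for pos in positions:
        bases[pos] = "C" if bases[pos] == "A" else "A"
    return "".join(bases)


# Hypothetical 100-nt toy alignment, not real DBV data.
base = "ACGT" * 25
sequences = {
    "strain_a": base,
    "strain_b": mutate(base, [0]),                   # 1 substitution
    "strain_c": mutate(base, [10, 20, 30, 40, 50]),  # 5 substitutions
}
genotypes = {"strain_a": "G1", "strain_b": "G1", "strain_c": "G2"}

intra, inter = summarise(sequences, genotypes)
print("intra-genotype differences (%):", intra)
print("inter-genotype differences (%):", inter)
print("separated by cut-off:", max(intra) < CUTOFF < min(inter))
```

In this toy alignment the single intra-genotype pair falls below the cut-off and both inter-genotype pairs fall above it, which is the pattern the criterion relies on.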
Genetic evolutionary tree analysis of first-stage DBV sequences. (a) The genetic evolutionary tree analysis of S segments. (b) The genetic evolutionary tree analysis of M segments. (c) The genetic evolutionary tree analysis of L segments. (d) The genetic evolutionary tree analysis of nucleoprotein (NP). (e) All ORFs. Each genotype was labelled with a different colour. ▲ indicates that the S, M, and L segments belong to different genotypes. ORF: open reading frame.
The majority of first-stage sequences were classified into six genotypes (S1–6, M1–6, and L1–6) using a 3% cut-off value (Figs. 2, 3). We concatenated all ORFs and performed a phylogenetic analysis. Figures 1e and 3 present the results of the phylogenetic and homology analyses, respectively. The results of the concatenated analysis were generally consistent with those obtained using the analysis of the individual segments.
Nucleotide differences among first-stage sequences. (a–d) Intra-genotype nucleotide differences in the S segment (a), M segment (b), L segment (c), and NP region (d). (e–h) Inter-genotype nucleotide differences in the S segment (e), M segment (f), L segment (g), and NP region (h). (i–k) Intra- and inter-genotype nucleotide differences in S (i), M (j), and L (k) segments of DBV reassortant strains. N represents genotypes 7–11. B and W indicate intra-genotype and inter-genotype, respectively.
We identified three viral strains with distinct sequences within the S segment. Consequently, two of these strains were named S7, and the other two were named S8. The nucleotide differences between S7 and the other genotypes were 4.47% (3.27–5.22%), and those between S8 and the other genotypes were 4.82% (3.65–5.39%) (Fig. 4a). Similar to the S segment, the nucleotide differences between the newly designated genotypes (M7–11 and L7–11) and the other genotypes were > 3% (Fig. 4b,c). Ultimately, 8, 11, and 11 genotypes were identified in the S, M, and L segments, respectively, in the first stage based on the new genotyping criteria. Notably, most strains exhibited the same genotype across all three segments. However, nine strains exhibited the same genotype in two segments and a different genotype in the third segment (Supplementary Tables 1, 2).
Inter-genotype nucleotide differences in the S, M, and L segments of new genotypes among first-stage sequences. (a) Inter-genotype nucleotide differences in the S segment (S7, S8). (b) Inter-genotype nucleotide differences in the M segment (M7–11). (c) Inter-genotype nucleotide differences in the L segment (L7–11). (d) Nucleotide differences between sequences belonging to different genotypes in the four genotyping methods.
To further validate the established genotyping criterion, we used the same methodology to identify the genotypes of the second-stage sequences (Supplementary Table 3). Notably, we obtained consistent results, with nucleotide differences of < 3% within genotypes and > 3% between different genotypes (Fig. 5; Supplementary Fig. 2). This second-stage validation confirmed the reliability of the genotypic classification standards. Furthermore, the S segment in 19 strains differed markedly (nucleotide differences of > 3%) from that of the other strains, indicating that these strains belong to new genotypes distinct from genotypes S1 to S8. Consequently, these strains were denoted as N1 (17 sequences) and N2 (2 sequences), with inter-genotype nucleotide differences of 3.67% (2.69–4.70%) and 4.07% (3.10–4.70%), respectively (Fig. 5). Although a few viral strains within the same genotype, such as in the S3, S8, and M5 genotypes, displayed notable nucleotide diversity (nucleotide differences > 3%), clear distinctions persisted between different genotypes.
Nucleotide differences in the second stage. (a–d) Intra-genotype nucleotide differences in S segment (a), M segment (b), L segment (c), and NP region (d). (e–h) Inter-genotype nucleotide differences in S segment (e), M segment (f), L segment (g), and NP region (h). “No” denotes that no more than two viral strains were included in the analysis.
Virus segment reassortment
The genotypes of the S, M, and L segments of most virus strains were identical. However, these segments belonged to different genotypes in nine viral strains, indicating segment reassortment. All the reassorted strains shared the same genotype for two segments, with only one segment belonging to a different genotype. For instance, the S, M, and L segments of SD4 corresponded to genotypes S1, M9, and L9, respectively. The nucleotide differences between SD4 and genotype S1 were 2.81% (2.52–3.22%), whereas those between SD4 and genotypes S2–8 were 4.30% (3.44–4.60%), a statistically significant difference (P < 0.001). Similar results were obtained for strains JS4, AHL, SPL087A, LN2012-14, LN2012-34, LN2012-41, LN2012-42, and LN2012-58 (P < 0.001), suggesting that the assignment of the three segments to different genotypes may be associated with reassortment (Fig. 2; Supplementary Tables 1, 2).
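The pattern described above, in which two segments agree and one differs, can be flagged with a short sketch. It assumes that per-segment genotype labels take the form of a letter followed by a number (S1, M9, L9); the function name and the second example strain are hypothetical, and the sketch does not reproduce the statistical comparison reported in the text.

```python
from collections import Counter


def odd_segment(genotypes):
    """Return the segment whose genotype number differs from the other two.

    Returns None when all three segments share one genotype number, and
    raises ValueError when all three differ, a pattern the text does not
    report for any strain.
    """
    numbers = {segment: int(label[1:]) for segment, label in genotypes.items()}
    counts = Counter(numbers.values())
    if len(counts) == 1:
        return None
    if len(counts) != 2:
        raise ValueError("all three segments carry different genotypes")
    minority = min(counts, key=counts.get)
    return next(seg for seg, number in numbers.items() if number == minority)


# SD4 genotypes as reported in the text.
sd4 = {"S": "S1", "M": "M9", "L": "L9"}
# Hypothetical strain with concordant segments, for contrast.
concordant = {"S": "S2", "M": "M2", "L": "L2"}

print(odd_segment(sd4))
print(odd_segment(concordant))
```

For SD4 the sketch singles out the S segment, matching the S1/M9/L9 assignment given in the text.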
Comparison of various genotyping methods for DBV
We analysed the same first-stage sequences using three genotyping methods from previous studies and the method from this study to further confirm the rationality of the established genotyping criterion (refs. 7, 11–14). Notably, the same dataset analysed using different genotyping methods yielded distinct outcomes for each genomic segment. In the S segment, notable differences were observed among S1, S4, S6, and S8. In the M segment, variations were most prominent in M4, M8, M10, and M11. In the L segment, the disparities were primarily noted in L4, L10, and L11 (Table 1). Therefore, we calculated the nucleotide difference rates of the sequences that were classified differently by our method and the previous methods (Fig. 4d; Supplementary Fig. 3). In Fig. 4d, S1–S1 denotes the nucleotide differences (2.96%, 2.63–3.33%) between two subsets that were categorised into one genotype using our method but into two separate genotypes using previous typing methods (one group comprised 2 sequences, whereas the other comprised 56 sequences). Notably, a similar result was obtained for S4. Conversely, some sequences that were assigned to two different genotypes using our method were categorised into the same genotype using other genotyping methods. For instance, the inter-genotype differences between S6 and S8 were 3.80% (3.50–4.10%), and these sequences were therefore classified into two genotypes based on the 3% cut-off value. Overall, the comparison of the results of various genotyping methods for DBV supported the specificity and rationality of our genotyping criteria.
Establishment of reference sequences and validation of genotype-specific primers
We established reference sequences and specific sites for genotypes 1–6 following the principles of reference sequence construction and specific site screening (Supplementary Data 5; Supplementary Table 4). These sites were used to design genotype-specific primers to distinguish types S1–6 (Supplementary Table 5). Plasmids containing the full-length S1–6 sequences were used as templates for amplification using the genotype-specific primers. On this basis, we established a conventional one-step RT-PCR method based on specific primers to detect the DBV genotype. Forty serum samples (S1: 21 cases, S3: 4 cases, S4: 7 cases, and S5: 8 cases) were included in the study. For each template, only one of the six pairs of type-specific primers yielded a distinct band on agarose gel electrophoresis, and the observed band size corresponded to the expected size, confirming the genotype of the serum samples (Fig. 6). The original agarose gel electrophoresis images are presented in Supplementary Data 6. The samples were sequenced in parallel, and the results were consistent with the aforementioned experimental data. Overall, the one-step RT-PCR method could facilitate rapid identification of clinical genotypes and supports the accuracy and rationality of our genotyping standards.
Verification of the specificity of the primer sequences. (a–f) The plasmids containing S1 (a), S2 (b), S3 (c), S4 (d), S5 (e), and S6 (f) reference sequences were used as templates to verify six primer pairs, respectively. (g–j) The serum samples containing DBV types 1 (g), 3 (h), 4 (i), and 5 (j) were used as templates to verify six primer pairs.
Simple typing method
The NP region is an important region of the DBV genome, and its manageable length of approximately 700 base pairs makes it suitable for PCR amplification and sequencing. Therefore, we used NP region sequence analysis to establish a straightforward genotyping method for conveniently identifying clinical genotypes. This simple typing method, based on phylogenetic tree analysis of the NP region and a 3% cut-off value, was applied to the NP region of all strains from both the first and second stages. The results revealed a high degree of consistency (99.2%) between the genotypes obtained using the NP region and those obtained using the full-length S segment (Figs. 1, 4; Supplementary Table 6). Notably, in the second stage, the following discrepancies between S segment genotypes and NP-region simple genotypes were observed in four sequences: S3 → G4 (KU664009), N1 → G3 (JQ693002, MT114317), and N2 → G4 (KU664010, KR698328) (Supplementary Fig. 3).
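The consistency between the two typing approaches can be expressed as a simple concordance rate. The sketch below assumes that each sequence has been assigned a genotype number by both approaches; the function name and all entries other than KU664009 (S3 → G4, as reported above) are hypothetical placeholders, so the printed rate does not correspond to the 99.2% reported in the text.

```python
def concordance(full_length, np_region):
    """Percentage of shared sequences with identical genotype calls,
    plus the sorted list of discordant sequence identifiers."""
    shared = full_length.keys() & np_region.keys()
    if not shared:
        raise ValueError("no sequences typed by both approaches")
    discordant = sorted(k for k in shared if full_length[k] != np_region[k])
    rate = 100.0 * (len(shared) - len(discordant)) / len(shared)
    return rate, discordant


# Genotype numbers by full-length S segment and by NP region.
# Only KU664009 reflects the text; the other entries are hypothetical.
s_segment = {"KU664009": 3, "strain_1": 1, "strain_2": 4, "strain_3": 5}
np_simple = {"KU664009": 4, "strain_1": 1, "strain_2": 4, "strain_3": 5}

rate, discordant = concordance(s_segment, np_simple)
print(f"concordance: {rate:.1f}%  discordant: {discordant}")
```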
Regional genotypic differences
The genotyping criterion established in this study was used to ascertain the distribution of DBV genotypes across diverse regions. The results revealed pronounced regional disparities in genotypes across China, Korea, and Japan. In China, the genotype distribution was analysed by province. In Henan, the main genotype observed was S1, with S3 and S4 as supplementary genotypes. Genotypes S1 and S3 were predominantly observed in Hubei and Shandong, and S1 and S4 were predominantly observed in Jiangsu. In Zhejiang, S5 was the dominant genotype, along with S1, S4, and S6. S2 predominated in Japan, with other genotypes being relatively rare, and S2 and S5 predominated in South Korea. Notably, similar patterns were observed for the M and L segments of DBV. Therefore, regional differences in DBV genotypes could also be identified using our genotyping criterion (Fig. 7; Supplementary Fig. 4).
Genotypic distribution in different regions of all sequences in the first and second stages. (a) Geographical distribution and names of the virus strains included in this study. (b–e) The genotypic distribution of the S segment (b), M segment (c), L segment (d), and NP region (e). LN Liaoning, SD Shandong, HN Henan, HB Hubei, AH Anhui, JS Jiangsu, ZJ Zhejiang, SK South Korea, JP Japan.
Genotypes and clinical characteristics
We investigated the association between different DBV genotypes and clinical characteristics based on our genotyping criterion using 40 cases. The proportion of patients with neurological symptoms was as follows: genotype 1 (7/21), genotype 3 (0/4), genotype 4 (1/7), and genotype 5 (5/8). The proportion of patients experiencing vomiting symptoms was as follows: genotype 1 (1/21), genotype 3 (1/4), genotype 4 (0/7), and genotype 5 (4/8). No significant differences were observed in gastrointestinal bleeding or diarrhoea between the genotypes. Therefore, we hypothesise that clinical symptoms, particularly neurological and vomiting symptoms, may be associated with infection with different DBV genotypes, indicating that a specific and reasonable genotyping criterion is crucial for the clinical treatment and evaluation of disease progression following DBV infection.
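The proportions above can be converted to percentages for easier comparison, as in the following sketch. The counts are those reported in the text. The source does not name the statistical test that was used, so the two-sided Fisher's exact test shown here (genotype 5 versus genotype 1, neurological symptoms) is purely an assumption for illustration, and the function names are hypothetical.

```python
from math import comb

# Counts reported in the text: (patients with the symptom, patients with that genotype).
neurological = {"genotype 1": (7, 21), "genotype 3": (0, 4),
                "genotype 4": (1, 7), "genotype 5": (5, 8)}
vomiting = {"genotype 1": (1, 21), "genotype 3": (1, 4),
            "genotype 4": (0, 7), "genotype 5": (4, 8)}


def percentages(counts):
    """Convert (with symptom, total) pairs to percentages, one decimal place."""
    return {g: round(100.0 * k / n, 1) for g, (k, n) in counts.items()}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    p_sum = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_sum)


neuro_pct = percentages(neurological)
vomit_pct = percentages(vomiting)

# Assumed comparison: genotype 5 versus genotype 1, neurological symptoms.
with5, total5 = neurological["genotype 5"]
with1, total1 = neurological["genotype 1"]
p_value = fisher_exact_two_sided(with5, total5 - with5, with1, total1 - with1)

print("neurological symptoms (%):", neuro_pct)
print("vomiting (%):", vomit_pct)
print("Fisher's exact test, genotype 5 vs genotype 1:", p_value)
```

With group sizes as small as 4 to 8 patients, any such test has limited power, which is consistent with the text framing the association as a hypothesis.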