Characterization and forecast of global influenza subtype dynamics

Categories: Disease & Virus

March 12, 2026

FluNet data

FluNet publishes weekly global data on influenza testing by country, reporting the number of samples tested and their results—positive or negative—categorized by influenza A subtypes, B lineages and surveillance type (sentinel, non-sentinel or undefined) (‘Initial data processing’ in the Supplementary Information)^1,26. Here we use for brevity H1 for A/H1N1 and A/H1N1pdm and H3 for A/H3N2. For each country, we determine influenza B infections by summing B\Yamagata, B\Victoria and unspecified B samples. We grouped the prepandemic A/H1N1 and A/H1N1pdm. The unsubtyped influenza A samples were distributed between H1 and H3 according to the respective proportions of these subtypes among the subtyped A samples that week. If no A samples were subtyped that week, we looked at the proportions of H1 and H3 in the 5 weeks centred around the week or in the year. We aggregated influenza B, H1 and H3 samples over 1 year and calculated the respective percentages to define the relative abundance of the three influenza strains. Log-ratio transformations are not defined when any component equals zero. We thus replaced zeros with small values through a Geometric Bayesian-multiplicative treatment (‘Initial data processing’ in the Supplementary Information). The year considered in the analyses starts from week 17. As both the 2009 influenza and the COVID-19 pandemic began in spring, we set the year’s start in the spring to align southern temperate countries with northern temperate countries in terms of when the two pandemics affected the respective influenza seasons. The exact week was chosen by looking at global incidence profiles and minimizing the risk of splitting an epidemic across years (Supplementary Fig. 1). After verifying that subtypes’ proportions were similar in sentinel and non-sentinel data (Supplementary Fig. 2), we aggregated the data regardless of the surveillance type. In all analyses, we discarded countries/years with fewer than 50 positive samples. In the sensitivity analysis, we tested alternative datasets using a different cut-off week for the definition of the year (week 37 instead of 17) and a threshold of 500 cases instead of 50, for country inclusion. We also quantified the impact of the uncertainty on the subtype composition due to the sample size.

i.l.r. transformation

We provide here the exact expression of the i.l.r. transformation used to map compositions in unbounded euclidean coordinates. The three-component vectors of country/year subtype abundances (B%, H1% and H3%) were transformed into two-dimensional vectors (u, v) by using the following equations

$$\left\{\begin{array}{l}u=\sqrt{\frac{2}{3}}\mathrm{ln}\frac{\rm{B} \% }{\sqrt{{\rm{H}}1 \% \cdot {\rm{H}}3 \% }}\\ v=\sqrt{\frac{1}{2}}{\mathrm{ln}}\frac{{\rm{H}}1 \% }{{\rm{H}}3 \% }\end{array}\right..$$

(1)

Definition of the mixing score

The mixing score is defined as the distance between the point in the log-ratio plane (u, v), representing the subtype composition, and the boundary of the codominance region, taken with a positive sign when the point is within that region and with a negative sign otherwise. The boundary of the codominance region in the simplex is identified by the points (B%, H1% and H3%) for which one component corresponds to exactly 50%. As an example, the points such that H1% = 50%, are mapped by the i.l.r. transformation into the coordinates (u, v(u)) such that $v\left(u\right)=\sqrt{2}(\mathrm{ln}({{\rm{e}}}^{u\sqrt{3/2}}+\sqrt{{{\rm{e}}}^{u\sqrt{6}}+4})-\mathrm{ln}2)$.

Geographical predictors of annual trajectories of subtype compositions

We considered the 86 countries satisfying our inclusion criteria for the period 2010–2019. We defined the distances between pairs of country trajectories as the average Euclidean distance of the corresponding points in the (u,v) plane. We gathered the number of air-travel passengers between countries, countries’ average temperatures, relative humidities and influenza seasons, among the NH, the SH and YR. Then, we computed three continuous covariate matrices—the differences in temperatures and relative humidity and the air travel distance between country pairs⁶⁷—and one categorical variable matrix describing epidemic synchrony, where country pairs were synchronous, that is, NH–NH, SH–SH or YR–YR, asynchronous (that is, NH–SH) or semisynchronous (that is, SH–YR or NR–YR). We then tested the association between trajectory distance and covariates using both univariable and multivariable analysis. In the ‘Geographical patterns: additional methods’ of the Supplementary Information, we detail the data sources for covariates, the definition of covariates, the regression analyses and the sensitivity analyses performed.

Clustering of trajectories

We applied Ward’s linkage hierarchical clustering⁶⁸ to cluster countries with similar trajectories. Hierarchical clustering provides a hierarchy of nested partitions. We used the Silhouette coefficients⁶⁹ to choose the optimal number of clusters, that is, the level of hierarchy. The optimal partition grouped countries into two groups. Robustness checks and sensitivity analyses (including alternative clustering algorithms) are described in the ‘Robustness checks and sensitivity analyses’ of the Supplementary Information.

Forecasting of trajectories

We tested five forecasting algorithms to predict next year’s subtype composition in each country. We attempted to predict observables of different nature: (1) the multilabel categorical variable corresponding to the dominance state (dominance of B, H1, H3 or codominance), (2) the subtype composition and (3) the binary variables corresponding to the dominance (yes or no) and the negligibility (yes or no) of each subtype. Predictions of the three types of observable were associated with a quantification of the uncertainty in the prediction and a metric for prediction evaluation. However, not all observables, quantifications of uncertainty and prediction evaluations were computable with all prediction methods. The five forecasting algorithms and the evaluation metrics used in the main paper were introduced in the ‘Results’. We provide additional details hereafter. A complete summary of what can or cannot be computed across predicted observables and prediction methods is reported in the Supplementary Table 1.

Algorithms for trajectory forecasting

We studied annual bivariate trajectories of compositions of the form (y₁, …, y_T), such that y_t = (u,v)′_t = i.l.r.((B%, H1%, H3%)′_t), with t = 1,…,T and the prime symbol indicating transposed. We forecasted y_T+1, that is the composition of 2017, 2018 or 2019, each time using the preceding period as the training set, that is (y₁, …, y_T) with y₁ the composition in 2010. Below, we present the five forecasting methods.

M1 ‘past frequencies’

The dominance state was predicted to be the most frequently observed state during the preceding years if that state was unique. Otherwise, it was randomly chosen with uniform probability among the dominant states observed most often.

M2 ‘H1–H3 alternation’

The prediction for the y_T+1 composition was ŷ_T+1 = (u,−v)′_T, that is, in percentage form, (B%_T+1 = B%_T, H1%_T+1 = H3%_T, H3%_T+1 = H1%_T)′.

M3 ‘average composition’

The predicted composition was ${\mathbf{\widehat{y}}}_{T+1}=\frac{1}{T}{\sum }_{t=1}^{T}\mathbf{y}_{t}$, and the prediction’s confidence interval was given by the covariance matrix ${\widehat{\varSigma}}=\frac{1}{T(T-1)}{\sum}_{t=1}^{T}({\mathbf{y}}_{t}-{\widehat{\mathbf{y}}}_{T+1}){\prime} ({\mathbf{y}}_{t}-{\widehat{\mathbf{y}}}_{T+1})$.

M4 ‘VAR’

Assuming that the trajectory has been generated by a VAR process of lag p, then composition t can be written as a linear function of the previous p compositions: $\mathbf{y}_{t}=\boldsymbol{\nu} +{A}^{\left(1\right)}\mathbf{y}_{t-1}+…+{A}^{\left(p\right)}\mathbf{y}_{t-p}+\mathbf{\upepsilon }_{t}$, where ${\boldsymbol{\nu}}$ is the vector intercept, ${A}^{\left(i\right)}$ are 2 × 2 coefficient matrices and ε is the Gaussian noise. This can be rewritten in matrix form as $Y={BZ}+U$ (ref. ³⁷), where

$$\begin{array}{l}Y=\left(\mathbf{y}_{p+1},…,\mathbf{y}_{T}\right),\\ B=\left(\boldsymbol{\nu} ,{A}^{\left(1\right)},…,{A}^{\left(p\right)}\right),\\ {Z}_{t}=\left(\begin{array}{l}1\\ \mathbf{y}_{p}\\ \vdots \\ \mathbf{y}_{1}\end{array}\right)\\ Z=\left({Z}_{p},…,{Z}_{T-1}\right),\\ U=\left(\mathbf{\upepsilon }_{p+1},…,\mathbf{\upepsilon }_{T}\right).\end{array}$$

The resulting least squares estimator is $\widehat{B}={YZ}{\prime} {\left({ZZ}{\prime} \right)}^{-1}$. It follows that the prediction of the composition T + 1 is $\mathbf{\widehat{y}}_{T+1}={{BZ}}_{T}$, and the prediction’s confidence interval is estimated via the empirical covariance matrix corrected for short time series $\widehat{\varSigma }=\frac{T-p+1}{\left(T-p\right)\left(T-3p-1\right)}\left({YY}{\prime} -{YZ}{\prime} \left({ZZ}{\prime} \right){ZY}{\prime} \right)$.

We only considered VAR processes with p = 1, since for the shortest trajectories (prediction of 2017, with training set between 2010 and 2016), when p > 1, the estimator of the covariance matrix is not defined.

M5 ‘HVAR’

For all three forecasted years, the trajectories of the training set were clustered as described above. This yielded two groups that were almost identical to group I and group II obtained for the whole period 2010–2019 and shown in Fig. 3. Specifically, the clustering was stable and the same as in Fig. 3 for 82 out of 86 countries while it was not stable for Czechia, Greece and Russia. Qatar clustered alone for all three forecasted years and, due to its outlier behaviour, was excluded from the forecast performance evaluation. For each cluster, we assumed that each trajectory followed a VAR process, such that the process for country c is $\mathbf{y}_{t,c}=\boldsymbol{\nu}_{c}+{A}_{c}^{\left(1\right)}\mathbf{y}_{t-1,c}+…+{A}_{c}^{\left(p\right)}\mathbf{y}_{t-p,c}+\mathbf{\varepsilon}_{t,c}$. Similar to the previous model, the coefficients can be encoded in the matrix ${B}_{c}=\left(\mathbf{\nu}_{c},{A}_{c}^{\left(1\right)},…,{A}_{c}^{\left(p\right)}\right)$ and the Gaussian noise in the matrix ${U}_{c}$. Then, the hierarchical structure is imposed by assuming that the VAR processes for the trajectories in the group are similar. Specifically, we define ${B}_{c}=W+{V}_{c}$, with W being the matrix of coefficients encoding the average behaviour of the group, which is the same for all the trajectories in the group, and V_c being the coefficient matrix for the single trajectory adjustment. Moreover, we assume that elements in W, V_c and U_c are independent random variables, sampled from distributions parametrized by some latent variables. All model equations are shown in ‘Forecast: additional methods’ of the Supplementary Information. The model coefficients were optimized by maximum likelihood estimation using Monte Carlo methods. In particular, since for each coefficient it was possible to write the model’s likelihood conditional on the other coefficients, efficient estimation was performed using a Gibbs sampler. For the conditional likelihood distributions and the code to implement the Gibbs sampler, we followed³⁸, modifying their model to introduce the intercept terms. From the Monte Carlo chains, for each country, we sampled the distribution of predictions ŷ_T+1. Then, from those samples, we computed the mean prediction and the covariance matrix for the confidence intervals. We tested HVAR models with p = 1 and p = 2.

Metrics for evaluating the goodness of predictions

To enable model comparison, the evaluation scores of different countries/years were averaged to obtain a single value for the model performance. We associated uncertainty with the forecast evaluation by computing a confidence interval with bootstrap. This corresponded to the standard error of the mean of the evaluation metrics over 200 bootstrap samples of the set of predictions. The values reported in Table 1 correspond to the averages of the dominance state accuracy and energy score values for different countries/years, whereas the values in Table 2 are averages of the country/year AUROC scores.

To compare dominance state predictions (Table 1), we defined the dominance state accuracy as the percentage of dominance states correctly predicted. To compare predictions of compositions (Table 1) we instead relied on probabilistic forecasting evaluation scores. Specifically, the energy score compares the composition $\mathbf{y}\in {{{{\mathbb{R}}}}}^{2}$ corresponding to the observation with the forecast distribution F defined by N samples $\mathbf{X}_{1},…,\mathbf{X}_{N},\mathbf{X}_{i}\in {{{{\mathbb{R}}}}}^{2}$. The formula for the energy score (denoted by ‘ES’ below) is

$$\mathrm{ES}\left(F,\mathbf{y}\right)=\frac{1}{N}\mathop{\sum }\limits_{i=1}^{N}\mathrm{||}\mathbf{X}_{i}-\mathbf{y}\mathrm{||}-\frac{1}{2{N}^{2}}\mathop{\sum }\limits_{i=1}^{N}\mathop{\sum }\limits_{j=1}^{N}\mathrm{||}\mathbf{X}_{i}-\mathbf{X}_{\!j}\mathrm{||}.$$

It is worth noting that this is a multivariate generalization of the more common continuous rank probability score, often used in epidemiology³⁹. Moreover, in the case of point forecast, it coincides with the mean absolute error. Thus, it can also be used for evaluating the M2 ‘H1–H3 alternation’ method, for which we do not have a confidence interval. The score is negatively oriented, so that it decreases when the forecast improves, and has a lower bound of zero for an ideal model. Furthermore, it is a strictly proper score, that is, designed to be optimal only when the forecast distribution coincides with the true distribution of the observations⁴⁰. Binary predictions—that is, dominance (yes/no) and negligibility (yes/no) of each subtype—were evaluated through the AUROC score (Table 2). It quantifies the overlap between the positive and negative classes and ranges from 0 to 1: 1 for an ideal model where the classes are perfectly separated, 0.5 for a random guess and 0 for the worst possible model that misclassifies all observations.

For the M5 ‘HVAR’ model, we tested lag = 1 and lag = 2. The results for the M5 ‘HVAR’ model in all tables are obtained using the lag with the lowest energy score for each group of countries, that is, lag = 1 for group II countries and lag = 2 for group I countries.

Other evaluation metrics were tested in the sensitivity analysis. They are described in Supplementary Table 1, while the results of these metrics are reported in Supplementary Tables 5 and 6.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Characterization and forecast of global influenza subtype dynamics

FluNet data

i.l.r. transformation

Definition of the mixing score

Geographical predictors of annual trajectories of subtype compositions

Clustering of trajectories

Forecasting of trajectories

Algorithms for trajectory forecasting

M1 ‘past frequencies’

M2 ‘H1–H3 alternation’

M3 ‘average composition’

M4 ‘VAR’

M5 ‘HVAR’

Metrics for evaluating the goodness of predictions

Reporting summary

Share:

Get RawNews Daily

Characterization and forecast of global influenza subtype dynamics

Teams comfirmed for last-16 tie

JFK Jr. and Daryl Hannah Photos, Never-Before-Seen Camping Trip

Australia allows higher sulphur fuel imports to protect supply security

Raw News

Health

Finance

Sports

Get RawNews Daily

© 2025 RawNews. All Rights Reserved.