Study population
This population-based study used the database from the French national surveillance system for hospitalizations (SI-VIC)17. All individuals included in this database tested positive for COVID-19. SI-VIC includes information on individuals’ age, sex, date of hospitalization, and area of residence code i.e., the smallest geographical unit used in France for statistical information, called the IRIS (https://www.insee.fr/en/metadonnees/definition/c1523). IRIS codes comprise ~2000 inhabitants who are relatively homogenous in terms of their socioeconomic characteristics.
We included all patients hospitalized for COVID-19 from 1 July 2020 until 31 August 2022. Patients re-hospitalized more than 7 days after discharge were included as new patients. We excluded patients with missing data for area of residence, age and sex. We were not able to include patients hospitalized during the first wave of the pandemic (i.e., before July 2020) as the database was not fully operational at that time (because of a lack of coverage and missing values for IRIS). Accordingly, we investigated the temporal dynamics in social inequalities during the course of five pandemic waves, starting with the first day of the week as follows; Wave 2: 1 July-16 December 2020, Wave 3: 23 December-16 June 2021, Wave 4: 23 June–20 October 2021, Wave 5: 27 October–2 March 2022 and Wave 6: 9 March–31 August 2022. The cuts were determined based on the epidemic curve in France and the official periods during which there were sustained increases in the number of new hospitalization cases were observed, followed by a peak and then a decline.
Outcomes
We examined severe forms of COVID-19: hospitalization, ICU admission, and death during hospitalization.
Main explanatory variable
No individual social data is collected in the SI-VIC database. Accordingly, to obtain information on patients’ social deprivation, we matched this database with the 2017 French version of the European Deprivation Index (EDI)—using patients’ IRIS codes. The EDI calculation followed three steps:
First, calculation of an individual deprivation indicator from the French version of the European Union Statistics on Income and Living Conditions survey (EU-SILC) designed to study deprivation (https://ec.europa.eu/eurostat/statistics-explained/index.php?title=EU_statistics_on_income_and_living_conditions_(EU-SILC)_methodology). Second, selection of variables available both at individual level (EU–SILC survey) and in the French census population (aggregated variables at IRIS-level).
Third, a multivariate logistic regression was used to select among 11 individual explanatory variables that reflect different aspects of deprivation from EU-SILC survey, those associated with the individual deprivation indicator.
Finally, ten variables were selected to build the EDI: proportion of individuals of non-French nationality, proportion of households without a car, proportion of individuals employed as managers or intermediate professionals, proportion of single-parent families, proportion of households with at least two members, proportion of non-owner occupied households, proportion of unemployed individuals, proportion of individuals without post-secondary school education, proportion of overcrowded dwellings, and proportion of non-married individuals.
The coefficients of the regression were used as weights for the corresponding aggregated variables, to calculate the EDI. The French EDI score at IRIS-level is obtained as a weighted sum of the ten variables18. Quintiles of the national distribution of EDI scores were computed for metropolitan France, with the first quintile (Q1) representing the least deprived areas and the fifth quintile (Q5) the most deprived areas.
Statistical analysis
In this study we used two reference populations at the IRIS level: (1) The 2020 French census population; (2) The population infected with COVID-19 (testing positive) during each week obtained from the national testing information system database SI-DEP13. This enabled us to investigate not only social inequalities relating to COVID-19 exposure but also whether these social inequalities persisted among infected individuals.
First, to describe the weekly dynamics of hospitalization, of ICU admission, and of death during hospitalization according to EDI quintiles, we calculated the corresponding age and sex standardized rates (i.e., direct standardization) using the two reference populations.
Second, to investigate the association between weekly counts of hospitalizations, ICU admissions and deaths with EDI quintiles for each pandemic wave, we employed spatiotemporal Bayesian Poisson regression models. These models were used to estimate incidence rate ratio (IRR) and 95% credible interval (CrI). The model specification included spatial random effect at the IRIS level following a scaled Besag-York-Mollié (BYM2) distribution with Penalized Complexity (PC) priors19 to account for spatial dependencies between observation20,21,22. Additionally, we incorporated temporal random effect for the week, modeled using a second order random walk distribution to capture the weekly dynamics of the outcomes. The expected number of cases (i.e., age and sex indirect standardization) was included as an offset and calculated based to the two previously defined reference populations.
The model specification used in this study is described below:
Let \({y}_{o,{we},i}\) be the number of cases for outcome o, during week we, in IRIS i.
$${y}_{o,{we},i} \sim {Poisson}\left({E}_{o,{we},i}\times {\rho }_{o,{we},i}\right)$$
Where \({E}_{o,{we},i}\) is the expected number of cases for outcome o, during week we, in IRIS i (see below), and \({\rho }_{o,{we},i}\) is the relative risk of outcome o, during week we, in IRIS i.
$${\log ({\rho }_{o,{{{\rm{we}}}},i}})={\alpha }_{o}+{b}_{o,\,i}+{f}_{o}({{{\rm{we}}}})+\left({\beta }_{2,o}Q{2}_{i}+{\beta }_{3,o}Q{3}_{i}+{\beta }_{4,o}Q{4}_{i}+{\beta }_{5,o}Q{5}_{i}\right)$$
(1)
Where
-
\({\alpha }_{o}\) is the intercept for outcome o.
-
\({b}_{o,i}\) is a spatial random effect at the IRIS level i. It follows a scaled Besag-York-Mollié (BYM2) distribution19.
-
\({{{{\rm{f}}}}}_{o}\left({{{\rm{we}}}}\right)\) is a temporal random effect for week we which follows a second order random walk distribution allowing a dependency in the two preceding weeks.
-
\(Q{j}_{i}\), j = 2.5 is an index variable which equals 1 when the EDI value in IRIS code i falls within the jth quintile, and 0 otherwise; \({\beta }_{j,o}\), j = 2.5 is the coefficient associated with EDI quintile j for outcome o. Therefore, we derived the IRR associated with EDI quintile Qj relative to quintile Q1 for outcome o as:\(\,\exp \left({\beta }_{j,o}\right)\).
Expected number of cases based on the 2020 population census
For each of the three outcomes, we first used the population distribution across IRIS codes provided by the 2020 census to calculate the expected number of cases for outcome o, using age and sex indirect standardization. Specifically, the expected number of cases for outcome o, for each week we, in IRIS i, \({E}_{o,{we},i}\), was:
$${E}_{o,{we},i}={\sum }_{a\in A,s\in \{F,M\}c}\left[\frac{{y}_{o,a,s,w({we})}\times {Po}{p}_{a,s,i}}{{n}_{w({we})}\times {Po}{p}_{a,s}}\right]$$
Where \(w\left({we}\right)\) is the wave that contains week \({we}\), \({y}_{o,a,s,w({we})}\) is the cumulated number of cases of outcome o, 15-year age groups a and sex s, across wave \(w\left({we}\right)\) in metropolitan France; \({Po}{p}_{a,s,i}\) is the 2020 population in IRIS i of age group a and sex s; \({Po}{p}_{a,s}\) is the 2020 population in metropolitan France belonging to age group a and sex s; \({n}_{w({we})}\) is the number of weeks in wave \(w\left({we}\right)\); A is the age group partitioning of the population; refers to female/male partitioning.
Expected number of cases based on the number of positive cases
Second, we used the population tested positive to COVID-19 to compute the expected number of cases for outcome o in week we and IRIS i, using age and sex indirect standardization as follows:
$${E}_{o,{we},i}={\sum }_{a\in A,s\in \{F,M\}}\left[\frac{{y}_{o,a,s,w({we})}\times {Po}{s}_{a,s,w({we}),i}}{{n}_{w({we})}\times {Po}{s}_{a,s,w({we})}}\right]$$
Where \({Po}{s}_{a,s,w({we}),i}\) is the cumulated number of cases tested positive for COVID-19 over wave w(we), for age group a and sex s, in IRIS i; \({Po}{s}_{a,s,w({we})}\) is the number of cases tested positive for COVID-19 for age group a, sex s, wave w(we), cumulated over all metropolitan France, and changed for each wave.
Of note, with each standardization, \({E}_{o,{we},i}\) was constant over wave \(w({we})\).
Adjustment for vaccination rates
We were interested in measuring the “direct” effect of the EDI on COVID-19 outcomes that was not accounted for by the lack of vaccination. To do this, we hypothesized that the vaccine rate might act as a mediating factor in the relationship between the EDI and the incidence of the three outcomes.
We calculated the age and sex standardized vaccine rates (i.e direct standardization) using the cumulative number of individuals fully vaccinated (with two or three doses) during a given week and municipality which contained IRIS, and the number of inhabitants living in this municipality according to the 2021 French population census data. To note, the number of individuals vaccinated at the IRIS level was not available.
We calculated these rates according to two periods to account for the booster dose introduced on 1 September 2021 as follows:
Period 1: 26 December 2020–20 October 2021 (i.e., covering waves 3 and 4): individuals fully vaccinated against COVID-19 with two doses.
Period 2: 1 September 2021 to 31 August 2022 (i.e., covering waves 5 and 6): individuals fully vaccinated against COVID-19 with three doses, or two doses if the second injection occurred in that period.
First, we tested to see whether the EDI was negatively associated with COVID-19 standardized vaccine rates, and whether COVID-19 standardized vaccine rates were negatively associated with the incidence of the three outcomes.
Second, we added the standardized vaccine rates assumed to follow a second order random walk distribution into the previous model (Eq. (1)).
The adjusted model has the following expression:
$${\log ({\rho }_{o,{we},i}})= {\alpha }_{o}+{b}_{o,{i}}+{{{{\rm{f}}}}}_{o}({{{\rm{we}}}})+\left({\beta }_{2,o}Q{2}_{i}+{\beta }_{3,o}Q{3}_{i}+{\beta }_{4,o}Q{4}_{i}+{\beta }_{5,o}Q{5}_{i}\right)\\ +{{{{{\rm{g}}}}}_{o}({{{{\rm{Vac}}}}}_{{{{{\rm{we}}}},{{{\rm{m}}}}}_{{{{\rm{i}}}}}})}$$
(2)
We used R software version 4.2. The spatiotemporal Bayesian Poisson regression models were implemented using the package R-INLA23 package version 22.12.16 (www.r-inla.org).
Ethics
This study obtained ethical authorizations from the Comité éthique et scientifique pour les recherches les études et les évaluations dans le domaine de la santé (CESREES) 03/09/20 n°1880179bis and the Commission nationale de l’informatique et des libertés (CNIL) 18/08/21 (DSI-COVID décision DR-2021-238) demande d’autorisation n° 921160. As such, the requirement for informed consent was waived given the data used are extracted from administrative databases, and that enhanced data protection measures were implemented, including pseudonymization, to mitigate risks associated with the partially identifiable nature of the data. Access is strictly controlled and restricted to authorized researchers.