Ethical approval
The Federal University of Bahia’s Institute of Public Health Ethics Committee provided ethical approval for the study (CAAE registration number: 18022319.4.0000.5030). Informed consent was waived because the data were deidentified and analysed under strict security procedures, in accordance with the General Data Protection Law (13,709/2018), Article 7, Item IV.
Study population and data source
We conducted a population-based cohort study of live births in Brazil to women aged 12–49 years, with estimated conception dates between January 1, 2015, and March 1, 2019. By restricting follow-up to live births observed through February 2020, we aimed to reduce the possibility of misclassifying the exposure (i.e. classifying a COVID-19 case as an arbovirus infection).
The source population was drawn from the national live birth registry, the Live Birth Information System (Sistema de Informação de Nascidos Vivos, SINASC). SINASC provided information about the mother (maternal age at birth, education level, marital status and ethnicity), the pregnancy (prenatal appointments, parity, previous loss, length of gestation) and the newborn (birthweight, sex, Apgar score, presence of congenital anomalies). The congenital anomalies recorded in SINASC are those detected at birth and documented according to the International Classification of Diseases, 10th revision (ICD-10).
Records of symptomatic chikungunya, dengue and Zika were obtained from the Notifiable Diseases Information System (Sistema de Informação de Agravos de Notificação, SINAN), which includes, for each notified infection, the date of symptom onset, the diagnosis and the criteria used for diagnosis. In Brazil, all suspected cases of chikungunya, dengue and Zika are recorded, and laboratory confirmation is required until a predefined incidence threshold is reached. Once this threshold is reached, diagnosis can be based on clinical-epidemiological criteria, that is, typical symptoms occurring in the same area and timeframe as other confirmed cases, with validation by epidemiological surveillance36. Detailed information on chikungunya, dengue and Zika case definitions can be found in our previously published protocol36.
Data on the cause and date of death of the live births were obtained from the Mortality Information System (Sistema de Informação sobre Mortalidade, SIM).
We excluded: (i) live births before 22 weeks or after 44 weeks of gestation or with a birthweight below 500 g; (ii) non-singleton live births; (iii) live births with missing data on the municipality of residence; (iv) live births to women with a suspected case of chikungunya, dengue or Zika between six months before and 6 days after the estimated conception date; (v) among exposed women, live births to women with more than one confirmed arbovirus infection during pregnancy; and (vi) live births to women whose notified suspected arbovirus infection was ruled out after clinical and epidemiological investigation, as those records are likely to be cases of other febrile diseases. Finally, we restricted the analysis to municipalities with at least five confirmed cases of at least one of the three arboviruses to prevent sparse data bias37.
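The record-level exclusion criteria (i) to (vi) can be sketched as a single screening function. This is a minimal illustration, not the study's code: the `LiveBirth` fields, the `exclusion_reason` helper and the 183-day approximation of "six months" are all assumptions introduced here, and the municipality-level restriction (at least five confirmed cases) is not covered.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: "six months before conception" approximated as 183 days.
SIX_MONTHS_DAYS = 183


@dataclass
class LiveBirth:
    """Hypothetical linked record; field names are illustrative only."""
    gestational_weeks: int
    birthweight_g: int
    singleton: bool
    municipality: Optional[str]
    # Day of each suspected arbovirus notification relative to the
    # estimated conception date (day 0); negative = before conception.
    suspected_onset_days: List[int] = field(default_factory=list)
    confirmed_in_pregnancy: int = 0        # confirmed infections in pregnancy
    ruled_out_notification: bool = False   # suspected case later discarded


def exclusion_reason(b: LiveBirth) -> Optional[str]:
    """Return the first exclusion criterion met, or None if the birth is kept."""
    if b.gestational_weeks < 22 or b.gestational_weeks > 44 or b.birthweight_g < 500:
        return "i"
    if not b.singleton:
        return "ii"
    if b.municipality is None:
        return "iii"
    if any(-SIX_MONTHS_DAYS <= d <= 6 for d in b.suspected_onset_days):
        return "iv"
    if b.confirmed_in_pregnancy > 1:
        return "v"
    if b.ruled_out_notification:
        return "vi"
    return None
```

Returning the label of the first criterion met, rather than a plain boolean, makes it straightforward to tabulate how many records each criterion removes.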
Linkage process
The SINASC records were linked separately to SINAN chikungunya, SINAN dengue, SINAN Zika, and SIM records using variables such as maternal name, date of birth or age, and place of residence as matching criteria. This linkage was performed using CIDACS-Record Linkage, which combines indexing and searching algorithms to identify records from the SINASC that closely match each record in the remaining datasets, and then performs pairwise comparisons of the candidate record pairs38. The accuracy of each linkage was assessed through manual verification of a randomly selected sample of records, evaluating sensitivity and specificity via receiver operating characteristic curves. More details about the linkage procedures are provided in previously published articles38,39, and the performance metrics are presented in Supplementary Fig. 3.
Exposures and outcomes
Live births from SINASC that were linked to SINAN records indicating that the mother was notified and confirmed as an arbovirus case during pregnancy (occurring from seven days after the estimated date of conception up to the date of delivery) were considered exposed. The date of conception was estimated as the birth date minus the gestational age at birth recorded in the SINASC dataset. We started the exposure window seven days after conception to reduce potential misclassification of pre-pregnancy infections, given the incubation period of these viruses. Pregnancies not linked to a record of suspected chikungunya, dengue or Zika during pregnancy were considered unexposed. The trimester of infection was classified as the first trimester (pregnancy days 7–97), the second trimester (days 98–195) or the third trimester (day 196 to birth). Supplementary Fig. 1 shows a representation of the time-varying classification of infection status.
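The date arithmetic behind the exposure window and trimester classification can be illustrated as follows. This is a sketch under assumptions: the function names are hypothetical, gestational age is assumed to be available in days, and the notification date is assumed to be the date used to place the infection in pregnancy.

```python
from datetime import date, timedelta
from typing import Optional


def conception_date(birth_date: date, gestational_age_days: int) -> date:
    """Estimated conception date: birth date minus gestational age at birth."""
    return birth_date - timedelta(days=gestational_age_days)


def exposure_trimester(infection_date: date, birth_date: date,
                       gestational_age_days: int) -> Optional[int]:
    """Return 1, 2 or 3 for an infection inside the exposure window, else None.

    The window runs from pregnancy day 7 up to the date of delivery.
    """
    conception = conception_date(birth_date, gestational_age_days)
    day = (infection_date - conception).days  # pregnancy day of infection
    if day < 7 or infection_date > birth_date:
        return None   # pre-pregnancy, first 6 days, or after delivery
    if day <= 97:
        return 1      # first trimester: days 7-97
    if day <= 195:
        return 2      # second trimester: days 98-195
    return 3          # third trimester: day 196 to birth
```

For example, a birth with a gestational age of 280 days and an infection notified 100 days after the estimated conception date would be classified as a second-trimester exposure.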
We defined preterm birth as delivery before 37 completed weeks of gestation. The risk window for preterm birth was from 22 to 37 weeks of gestation (pregnancy day 258). Large for gestational age (LGA) and small for gestational age (SGA) at birth were defined as birthweight above the 90th and below the 10th centile, respectively, of the sex-specific birthweight-for-gestational-age distribution based on the Intergrowth reference charts40. Low birth weight (LBW) was defined as a birthweight <2.5 kg. Low Apgar was defined as a score of less than 7 (maximum of 10) 5 minutes after birth. We also identified the occurrence of congenital anomalies recorded in SINASC (International Classification of Diseases, 10th revision codes Q00-Q99). For LBW, LGA, SGA, congenital anomaly and 5-minute Apgar < 7, the risk window was from 22 weeks to the end of pregnancy. Neonatal death was defined as death occurring within 27 days of birth.
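The threshold-based outcome definitions translate directly into simple predicates, sketched below. The function names and input units (gestational age in days, birthweight in grams, age at death in completed days) are assumptions for illustration. SGA and LGA are omitted because they require the reference centile charts.

```python
from typing import Iterable, Optional


def is_preterm(gestational_age_days: int) -> bool:
    """Delivery before 37 completed weeks (i.e. on or before day 258)."""
    return gestational_age_days < 37 * 7


def is_low_birth_weight(birthweight_g: int) -> bool:
    """Birthweight below 2.5 kg."""
    return birthweight_g < 2500


def is_low_apgar(apgar_5min: int) -> bool:
    """5-minute Apgar score below 7 (scale 0-10)."""
    return apgar_5min < 7


def has_congenital_anomaly(icd10_codes: Iterable[str]) -> bool:
    """True if any recorded code falls in ICD-10 chapter Q00-Q99."""
    return any(
        len(c) >= 3 and c[0].upper() == "Q" and c[1:3].isdigit()
        for c in icd10_codes
    )


def is_neonatal_death(age_at_death_days: Optional[int]) -> bool:
    """Death within 27 days of birth; None means no linked death record."""
    return age_at_death_days is not None and 0 <= age_at_death_days <= 27
```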
Covariables
We considered the following to be a priori confounders based on their known or plausible causal effects on infection risk and/or adverse pregnancy outcomes (disjunctive cause criterion): mother’s age (<20, 20–34 or ≥35 years), education (0–3, 4–7, 8–11 or ≥12 years), previous pregnancies (0 or ≥1), number of prenatal appointments (0, 1–3, 4–6 or ≥7), previous stillbirth, marital status (married/stable union or single), municipality of residence, race (white, black, mixed race, Asian or Indigenous) and year of conception. Records with missing data on the municipality of residence or on the sex of the live birth were excluded (n = 6,322; <0.01%); missing data on any of the other confounders (n = 695,829; 10%) were addressed using multiple imputation.
Statistical analyses
For the birth outcomes, we used a Cox model to estimate the hazard ratio (HR) with a 95% confidence interval (CI), stratified by municipality (i.e. area of residence). Infection was considered a time-varying exposure, meaning that pregnant women were classified as unexposed before infection and exposed after infection. Robust sandwich variance estimation was used to account for statistical dependence across repeated observations arising from changes in exposure status. Gestational age in days was used as the timescale. In additional analyses, we explored whether associations differed by trimester of infection using the same approach. Statistical evidence for a difference between trimesters was assessed using Cochran’s Q test. We assessed the proportional hazards assumption using a test based on Schoenfeld residuals41.
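Treating infection as a time-varying exposure amounts to splitting each pregnancy's follow-up into an unexposed interval and an exposed interval on the gestational-age timescale. The sketch below shows one way to build such (start, stop] rows. The `split_follow_up` helper, the tuple layout and the handling of edge cases (infection on the last day of follow-up counted as unexposed) are assumptions for illustration, not the study's implementation.

```python
from typing import List, Optional, Tuple

RISK_START_DAY = 22 * 7  # follow-up opens at 22 weeks (pregnancy day 154)

# Row layout: (start_day, stop_day, exposed, event)
Row = Tuple[int, int, int, int]


def split_follow_up(end_day: int, event: int,
                    infection_day: Optional[int] = None) -> List[Row]:
    """Split one pregnancy into unexposed/exposed intervals.

    end_day: pregnancy day of delivery, or of censoring (e.g. day 258
             for preterm birth), supplied by the caller.
    event:   1 if the outcome occurred at end_day, else 0.
    """
    if end_day <= RISK_START_DAY:
        return []  # no time at risk inside the window
    if infection_day is None or infection_day >= end_day:
        # Never infected during follow-up: unexposed throughout.
        return [(RISK_START_DAY, end_day, 0, event)]
    if infection_day <= RISK_START_DAY:
        # Infected before the risk window opened: exposed throughout.
        return [(RISK_START_DAY, end_day, 1, event)]
    # Infected during follow-up: the event can only fall in the last interval.
    return [
        (RISK_START_DAY, infection_day, 0, 0),
        (infection_day, end_day, 1, event),
    ]
```

Because an infected pregnancy can contribute two rows, the rows are not independent, which is why a robust (sandwich) variance estimator is paired with this data layout.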
For neonatal death, we used Poisson regression with standard errors clustered by municipality to derive risk ratios (RR) and risk differences (RD); the 95% CI for the RD was estimated using the delta method.
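To illustrate how a delta-method CI for the RD follows from a log-link model, consider a deliberately simplified model with only an intercept b0 and an exposure coefficient b1, so that the unexposed risk is exp(b0), the exposed risk is exp(b0 + b1), and RD = exp(b0 + b1) - exp(b0). The sketch below is an assumption-laden simplification: the study's models include covariates and clustered standard errors, and the function name and inputs here are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def rr_and_rd(b0: float, b1: float, cov: List[List[float]],
              z: float = 1.959964) -> Dict[str, Tuple[float, float, float]]:
    """RR and RD (estimate, lower, upper) from a two-coefficient log-link model.

    cov is the 2x2 covariance matrix of (b0, b1), e.g. a cluster-robust one.
    """
    p0 = math.exp(b0)        # risk in the unexposed
    p1 = math.exp(b0 + b1)   # risk in the exposed

    # RR: the CI is built on the log scale, then exponentiated.
    se_b1 = math.sqrt(cov[1][1])
    rr = (math.exp(b1), math.exp(b1 - z * se_b1), math.exp(b1 + z * se_b1))

    # RD: delta method, Var(RD) ~ g' V g with g = dRD/d(b0, b1).
    g0 = p1 - p0             # dRD/db0
    g1 = p1                  # dRD/db1
    var_rd = g0 * g0 * cov[0][0] + 2 * g0 * g1 * cov[0][1] + g1 * g1 * cov[1][1]
    se_rd = math.sqrt(var_rd)
    rd_est = p1 - p0
    rd = (rd_est, rd_est - z * se_rd, rd_est + z * se_rd)
    return {"rr": rr, "rd": rd}
```

The RR interval is symmetric on the log scale, whereas the RD interval is symmetric on the risk scale, because the delta method linearises the RD around the estimated coefficients.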
Multiple imputation was conducted by chained equations (fully conditional specification), generating six imputed datasets with five iterations per dataset and using random forest as the underlying imputation model.
Sensitivity analyses
We repeated the analyses using laboratory-confirmed infection only to explore potential exposure misclassification bias due to including participants without laboratory diagnosis.
We conducted a negative control exposure analysis42, using maternal arbovirus infection that occurred between 6 and 24 months before pregnancy, under the assumption that maternal arbovirus infection before pregnancy is likely to be influenced by many of the same unmeasured potential confounders as maternal infection during pregnancy, but is not expected to have any direct causal influence on birth and neonatal outcomes.
We also assessed the consistency of the results regarding missing data. In additional analyses, we employed the missing indicator method and complete-case analysis. Each analysis considers a different type of underlying mechanism for the missing data (at random, completely at random or not at random)43.
Finally, to assess how immortal time bias could affect the results, we classified live births as exposed from conception if their mother had an infection at any point during the pregnancy, i.e. without defining arbovirus infection as a time-varying exposure. We derived RRs using Poisson regression with standard errors clustered by municipality of maternal residence.
All data processing and analyses were done using R (version 4.1.1) and the tidyverse, fixest, mice, marginaleffects and survival packages.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.