In response to emerging public health threats, it is not uncommon for surveillance systems to be biased towards vulnerable subpopulations, as exemplified by the multinational Zika outbreaks during 2015-2016 16,17,18,19,20. In this study, we proposed a Bayesian hierarchical modeling framework to account for differential reporting of Zika cases among women of reproductive age in comparison to other age-sex groups, which provided a more accurate understanding of the transmission burden and associated risk modifiers of Zika. Our analysis suggested that the detection proportion of symptomatic cases in Colombia increased from 8.5% to 86% within four months of the start of the outbreak, and that detection rates for women of reproductive age reached nearly 100%. This high detection rate among women of reproductive age aligns with the implementation of targeted screening and testing aimed at preventing microcephaly and Congenital Zika Syndrome in Colombia and other affected regions9,10,11.
After adjusting for surveillance bias towards women of reproductive age using information contained in the data, our findings indicate that females remained more susceptible to symptomatic Zika infection than males. This is consistent with previous studies, which reported higher attack rates among females3,10,21,22,23,24,25,26,27, attributed not only to higher reporting rates but also to real biological effects10, as evidenced by seroprevalence data24. Higher attack rates among women outside reproductive age groups further support this hypothesis25. The potential for sexual transmission of Zika is likely another factor contributing to the high disease burden in women. One study identified an increased risk of Zika infection for sexually active women aged 15–65 years, but not for other age groups10. Another study suggested the possibility that women are more likely to exhibit symptomatic infections than men26.
We found that ambient temperature was positively associated with Zika infection risk, consistent with most previous studies28,29,30,31,32,33, though one study reported no such association34. Temperature influences the ecological suitability of the vector, Aedes aegypti, and consequently, Zika transmission30. Conversely, higher precipitation was associated with lower transmission risks, in line with two prior studies29,33. However, conflicting evidence also exists: one study suggested rainfall might increase Zika risk28, while another reported a non-linear relationship34.
Departments with higher dengue incidence from 2011 to 2015 were associated with higher Zika infection risk. Several factors could have contributed to this association. First, dengue and Zika share the same mosquito vectors, implying that similar ecological and environmental conditions support their co-circulation; consequently, departments with greater vector abundance tend to have higher risks of both dengue and Zika. Another possible reason is the misdiagnosis of dengue cases as Zika cases and vice versa, as a result of their similarity in symptoms (e.g., fever, rash and myalgia) and antibody responses. The challenges in clinical and laboratory diagnosis may create an epidemiological picture where high-dengue areas appear to also have high Zika incidence simply because of misdiagnosis14. While misdiagnosis could be associated with under-reporting or over-reporting, its impact on the parameter estimation of our transmission model is likely limited, as long as the misdiagnosis level did not vary much across space-time or age-sex groups. Misdiagnosis could have been relatively high during the early phase of the epidemic due to either technological barriers or lack of awareness; however, we assumed a linear growth in the reporting rates before November 30, 2015, which partially accounts for under-reporting due to misclassification of Zika cases as dengue cases. Finally, similarity in antigenicity between the two viruses could introduce complex interference into population-level transmission dynamics of Zika. Prior results on the dengue-Zika relationship are mixed: one study reported no impact of dengue incidence on Zika9, while another suggested that dengue antibodies enhanced Zika infections35. In addition, three studies found that prior dengue infection provided short-term protection against Zika36,37,38, yet another study suggested minimal impact of pre-existing dengue immunity39. A modeling study highlighted the complexity of the dengue-Zika relationship, which might be influenced by their reproductive numbers40.
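The linear growth in reporting rates assumed before November 30, 2015 can be pictured with a minimal sketch. It is an illustration only, not the fitted model: the function name, the 120-day ramp length (a stand-in for "four months"), and the constant plateau after the ramp are all assumptions, while the 8.5% and 86% endpoints are the estimates quoted in this discussion.

```python
def detection_probability(day, ramp_days=120, p_start=0.085, p_end=0.86):
    """Hypothetical detection probability `day` days after the outbreak began.

    Assumptions (not taken from the fitted model): the probability rises
    linearly from p_start to p_end over ramp_days and stays at p_end afterwards.
    """
    if day < 0:
        raise ValueError("day must be non-negative")
    if ramp_days <= 0:
        raise ValueError("ramp_days must be positive")
    # Fraction of the ramp completed, capped at 1 once the plateau is reached.
    fraction = min(day / ramp_days, 1.0)
    return p_start + fraction * (p_end - p_start)


# Detection probability at the start, midpoint, and end of the assumed ramp.
ramp_profile = [detection_probability(d) for d in (0, 60, 120)]
```

Under these assumptions, a case occurring early in the ramp is far less likely to be recorded than one occurring after it, which is why early under-reporting (including Zika cases misclassified as dengue) is partly absorbed by the time-varying reporting rate.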
In our study, only 4% of cases were laboratory-confirmed, and the majority were clinically diagnosed. Misclassification of non-Zika cases as Zika cases, or vice versa, would affect our results in much the same way as misdiagnosis between dengue and Zika cases. If the misclassification was uniform across space-time and across age-sex groups, it should have very limited effects on our estimates. If the misclassification was differential, e.g., if certain demographic groups were more likely to have non-Zika cases misclassified as Zika cases, it could bias the estimated predictor effects as well as reporting probabilities in unpredictable ways.
Geographic factors also played an important role in Zika transmission. Departments with lower forest coverage were associated with higher infection risk, consistent with findings that cities exhibit greater risk than rural areas31. Altitude showed a non-linear relationship with Zika risk: transmission risk was the lowest in low-altitude regions but highest in medium-altitude regions, which was probably related to the role of altitude in ecological niches of the vector. This aligns with studies showing minimal Zika risk at high altitudes41, though some reported a negative association between altitude and infection risk42. Population density similarly exhibited a non-linear relationship. Low population density was associated with reduced infection risk, while medium density correlated with slightly increased risk. Our findings are largely consistent with previous studies reporting a positive relationship between Zika risk and population density42,43, suggesting that densely populated urban areas experienced greater risks of Zika.
The model-corrected attack rates are ~10–20% higher (relative scale) than the reported values across various regions and departments. These findings are broadly consistent with those reported by Moore et al. 13, who found that the attack rates were generally less than 1%. However, discrepancies in a few departments likely reflect differences in data sources and case definitions; while Moore et al. employed serological data that capture both symptomatic and asymptomatic infections, our analysis is based on suspected symptomatic cases, with only a fraction confirmed by laboratory tests.
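As a rough plausibility check on the size of this correction, one can divide a reported attack rate by a detection probability. This is a back-of-the-envelope sketch, not the computation performed by the model: the function names are hypothetical, and it assumes a single constant detection probability, ignoring the lower detection early in the outbreak and the mix of age-sex groups.

```python
def corrected_attack_rate(reported_rate, detection_probability):
    """Scale a reported attack rate up by the probability that a case is detected."""
    if not 0 < detection_probability <= 1:
        raise ValueError("detection_probability must be in (0, 1]")
    return reported_rate / detection_probability


def relative_increase(detection_probability):
    """Relative excess of the corrected rate over the reported rate."""
    return corrected_attack_rate(1.0, detection_probability) - 1.0


# With the 86% detection estimated for most age-sex groups, the corrected
# rate exceeds the reported rate by about 16% on the relative scale.
increase_at_86 = relative_increase(0.86)
```

With detection near 100% for women of reproductive age the correction for that group is close to zero, so a population-level correction in the 10–20% range is in line with this simple calculation.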
Our approach categorized the population into six age-sex groups, enabling the estimation of three parameters related to age-specific susceptibility to symptomatic infection (corresponding to the three age groups: 0–14, 15–39, and 40+ years) and two parameters related to sex-specific susceptibility to symptomatic infection (male and female). To account for surveillance bias across age-sex groups, we included two additional parameters: one representing the reporting probability for females of reproductive age (15–39 years), and another for all other age-sex groups combined. A finer grouping of reporting probabilities would result in identifiability issues. Such issues could be alleviated with additional data, e.g., serosurvey studies in Colombia with age-sex grouping. Via a literature search, we found only a very limited number of such studies, most of which targeted a special group of people or a special region44,45,46 and were insufficient to inform a finer grouping or improve inference for our model.
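The identifiability concern can be illustrated with a small sketch. It assumes, purely for illustration, that expected reported cases in a group are the product of a common exposure term, an age effect, a sex effect, and a reporting probability; all names and numerical values are hypothetical and are not estimates from our model.

```python
import math

AGE_GROUPS = ("0-14", "15-39", "40+")
SEXES = ("female", "male")


def two_level_reporting(p_repro_female, p_other):
    """One reporting probability for females aged 15-39, another for all other groups."""
    return {
        (age, sex): p_repro_female if (age, sex) == ("15-39", "female") else p_other
        for age in AGE_GROUPS
        for sex in SEXES
    }


def expected_reported(exposure, age_susc, sex_susc, reporting):
    """Expected reported cases per age-sex group under the assumed multiplicative form."""
    return {
        (age, sex): exposure * age_susc[age] * sex_susc[sex] * reporting[(age, sex)]
        for age in AGE_GROUPS
        for sex in SEXES
    }


# Hypothetical values: if every group had its own free reporting probability,
# doubling susceptibility in children while halving their reporting probability
# would give exactly the same expected counts.
sex_susc = {"female": 1.3, "male": 1.0}
age_a = {"0-14": 1.0, "15-39": 1.5, "40+": 1.2}
age_b = {"0-14": 2.0, "15-39": 1.5, "40+": 1.2}
rep_a = {(age, sex): 0.8 for age in AGE_GROUPS for sex in SEXES}
rep_b = {group: (0.4 if group[0] == "0-14" else 0.8) for group in rep_a}

counts_a = expected_reported(100.0, age_a, sex_susc, rep_a)
counts_b = expected_reported(100.0, age_b, sex_susc, rep_b)
indistinguishable = all(math.isclose(counts_a[g], counts_b[g]) for g in counts_a)
```

Here the two parameter sets cannot be told apart from reported counts alone. The two-level scheme built by `two_level_reporting` rules out the second set, because it does not allow children to have a reporting probability different from that of adults outside the reproductive-age female group.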
Our study has limitations. First, we conducted the analysis at the department level, without accounting for finer spatial scales, which could improve the accuracy of the results; however, the computational burden would increase exponentially as we model pairwise transmission among spatial units. Second, we lacked direct abundance data for the Zika vector, relying instead on proxy predictors, such as environmental factors. Finally, as in any observational study, the possibility of unmeasured confounders remains.
In conclusion, our study introduced a statistical approach to correct surveillance biases caused by the clinical association of Zika infection with Congenital Zika Syndrome. We estimated that almost all cases among women of reproductive age were detected, compared to 86% for other age-sex groups. Our method can be generalized to other infectious diseases with similar surveillance bias, e.g., hand, foot and mouth disease, which causes neurological complications more often in young children than in older children or adults47. These findings underscore the need for statistical and epidemiological methods that adjust for reporting biases to better understand and manage emerging infectious diseases, and call for timely collection of relevant data, such as serosurvey data, that can further help correct surveillance bias.