Within epidemiology, the importance of heterogeneity, whether host, population, statistical, or environmental, has long been recognized1,2,3,4,5. For example, when designing targeted interventions, it is crucial to understand and account for differences that may exist within populations6,7,8. These differences can take a variety of forms, including heterogeneity in susceptibility, transmission, response to guidance, and treatment effects, all of which affect the dynamics of an infectious disease1,2,6,9,10,11,12,13,14. While heterogeneity may exist on a continuous spectrum, it can be difficult to incorporate into analysis and interpretation, so individuals are often placed in discrete groups according to a characteristic that aims to represent the true differences15,16,17,18,19. When examining optimal influenza vaccination policy in the United Kingdom, Baguelin et al.20 classified individuals into one of seven age groups. Explicitly grouping individuals by whether they inject drugs can help target interventions to reduce human immunodeficiency virus (HIV) and hepatitis C virus incidence21. Similarly, epidemiological models have demonstrated the potential for HIV pre-exposure prophylaxis to reduce racial disparities in HIV incidence22. Accounting for heterogeneity can therefore inform more complete theories of change, increasing intervention effectiveness23.
When discretizing a population for inclusion within a mechanistic model, three properties need to be defined: (1) the number of groups, (2) the size of the groups, and (3) the differences between the groups. Typically, as seen in the examples above, demographic data (e.g., age, sex, race, ethnicity, and socio-economic status) are used, often in conjunction with contact patterns and rates7,9,15,17,20,22,24. There are several reasons for this: the data are widely available, and can therefore be applied almost universally; they are easily understandable; and there are clear demarcations between the groups, addressing properties (1) and (2). However, epidemiological models often aim to assess the effects of heterogeneity with respect to infection, e.g., “how does an individual’s risk tolerance affect their risk of infection for influenza?” When addressing questions such as these, demographic data do not necessarily provide a direct link between the discretization method and the heterogeneous nature of the exposure and outcome, particularly if behavioral mechanisms are a potential driver. Instead, the approach relies on assumptions and proxy measures, e.g., an individual’s age approximates their contact rates, which in turn approximate their risk of transmission. This paper demonstrates an alternative approach to discretizing populations for use within mechanistic models, highlighting the benefits of an interdisciplinary approach to characterizing heterogeneity in a manner more closely related to the risk of infection.
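To make the three properties concrete, the following is a minimal sketch of a generic multi-group SIR model. It is not the model parameterized in this paper. The function name `simulate_grouped_sir`, the two-group setup, and all parameter values are hypothetical, chosen only for illustration. The length of `group_sizes` sets the number of groups (property 1), its entries set the group sizes (property 2), and the rows of the transmission matrix `beta` encode the differences between groups (property 3).

```python
def simulate_grouped_sir(group_sizes, beta, gamma, initial_infected,
                         days, dt=0.1):
    """Forward-Euler integration of a generic multi-group SIR model.

    group_sizes      -- list of group sizes N_i (properties 1 and 2)
    beta             -- beta[i][j]: rate at which infectious individuals in
                        group j infect susceptibles in group i (property 3)
    gamma            -- recovery rate, assumed equal across groups
    initial_infected -- number initially infectious in each group
    Returns the final (S, I, R) lists, one entry per group.
    """
    n = len(group_sizes)
    total = float(sum(group_sizes))
    S = [float(group_sizes[i] - initial_infected[i]) for i in range(n)]
    I = [float(x) for x in initial_infected]
    R = [0.0] * n

    for _ in range(int(round(days / dt))):
        # Force of infection on group i, summed over all source groups j.
        foi = [sum(beta[i][j] * I[j] for j in range(n)) / total
               for i in range(n)]
        new_inf = [foi[i] * S[i] * dt for i in range(n)]
        new_rec = [gamma * I[i] * dt for i in range(n)]
        for i in range(n):
            S[i] -= new_inf[i]
            I[i] += new_inf[i] - new_rec[i]
            R[i] += new_rec[i]
    return S, I, R


# Hypothetical example: two groups that differ only in how readily their
# members acquire infection (row 1 of beta is larger than row 0).
sizes = [700, 300]
beta = [[0.2, 0.2],
        [0.5, 0.5]]
S, I, R = simulate_grouped_sir(sizes, beta, gamma=0.2,
                               initial_infected=[1, 1], days=200)
attack_rates = [R[i] / sizes[i] for i in range(len(sizes))]
print(attack_rates)
```

Under these assumed values, the group with the larger row of `beta` ends the simulation with a higher cumulative attack rate. How the groups and the entries of `beta` are chosen is the discretization decision discussed above.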
In early 2020, shortly after the World Health Organization (WHO) declared the SARS-CoV-2 outbreak a public health emergency of international concern25, universities across the United States began to close their campuses and accommodations, shifting to remote instruction26,27. By Fall 2020, academic institutions had transitioned to a hybrid working environment (in-person and online), requiring students to return to campuses28,29,30. In a prior paper31, we documented the results of a large prospective serosurvey conducted in State College, home to The Pennsylvania State University (PSU) University Park (UP) campus. We examined the effect of 35,000 returning students (representing a nearly 20% increase in the county population32) on community infection rates by testing serum for the presence of anti-Spike Receptor Binding Domain (S/RBD) IgG, which indicates prior exposure33. Despite widespread concern that campus re-openings would lead to substantial increases in surrounding community infections28,34,35, very little sustained transmission was observed between the two geographically coincident populations31.
Given the high infection rate observed among the student body (30.4% seroprevalence), coupled with the substantial heterogeneity in infection rates between the two populations, we hypothesized that there may be further variation in exposure within the student body, resulting from behavioral heterogeneity. Despite extensive messaging campaigns conducted by the University36, it is unlikely that all students adhered equally to public health guidance regarding SARS-CoV-2 transmission prevention. We use students’ responses to a behavioral survey to classify individuals into latent classes based on their intention to adhere to public health measures (PHMs). We then show that these latent classes are correlated with SARS-CoV-2 seroprevalence. Finally, we parameterize a mechanistic model of disease transmission within and between these groups, and explore the impact of public health guidance campaigns, such as those conducted at PSU36. We show that interventions designed to increase student compliance with PHMs would likely reduce overall transmission, but that the relatively high initial compliance limits the scope for improvement via PHM adherence alone.