Study setting
We conducted our study using data from two healthcare systems in the Dallas-Fort Worth, Texas metropolitan area: Texas Health Resources (THR) and the University of Texas Southwestern Medical Center (UTSW). THR operates 29 acute care hospitals, 32 surgery centers, 18 urgent care centers, and 130 primary care clinics19, predominantly in the western region, while UTSW, a quaternary care academic health science center, serves mainly the eastern area with two acute care hospitals and 24 outpatient clinics20. Together, the two organizations provide approximately 5.9 million visits annually to 8.4 million patients across thirteen counties spanning more than 9200 square miles19,21.
In both healthcare systems, full-service microbiology laboratories process cultures and determine antimicrobial resistance (AMR) using standard methods such as broth microdilution, disk diffusion, and rapid molecular detection.
Study population, data collection and processing
Our study population included all adults (≥18 years old) with at least one positive bacterial blood culture at THR or UTSW between 2006 and 2023. We limited our analysis to patients with a documented culture specimen type and/or source, and for each encounter we included only the first blood culture for each organism. We excluded patients with missing data.
When extracting data from the health systems’ Epic EHR data warehouses (Epic Systems, Verona, WI), we linked demographics (e.g., gender, ethnicity, insurance, home address), past medical history, microbiology results including susceptibilities, antimicrobial administration, and laboratory results using a unique patient encounter number. All data pre-processing used the Python (v3.9) programming language22.
We mapped ethnicity, sex, insurance, diabetes past medical history (PMH), and cardiac PMH to numerical labels and converted the timestamp strings to a datetime format. For the binary model, we grouped all AMR mechanisms into a single binary variable, ‘Resistant’. For the multi-label model, we created a dictionary that assigned a numerical value to each specific AMR mechanism.
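A minimal sketch of this encoding step using pandas is shown below. The column names and mechanism labels are illustrative assumptions, not the study's actual schema:

```python
import pandas as pd

# Illustrative pre-processing sketch; column names and mechanism labels
# (e.g., "MRSA", "ESBL") are placeholders, not the study's definitions.
df = pd.DataFrame({
    "ethnicity": ["Hispanic", "Non-Hispanic", "Hispanic"],
    "sex": ["F", "M", "F"],
    "culture_time": ["2019-03-01 08:15", "2020-07-12 14:30", "2021-01-05 09:00"],
    "amr_mechanism": ["MRSA", "None", "ESBL"],
})

# Map categorical fields to numerical labels.
for col in ["ethnicity", "sex"]:
    df[col] = df[col].astype("category").cat.codes

# Convert the timestamp string into a datetime type.
df["culture_time"] = pd.to_datetime(df["culture_time"])

# Binary target: any resistance mechanism present.
df["resistant"] = (df["amr_mechanism"] != "None").astype(int)

# Multi-label target: dictionary mapping each mechanism to a numeric code.
mechanism_map = {m: i for i, m in enumerate(sorted(df["amr_mechanism"].unique()))}
df["mechanism_code"] = df["amr_mechanism"].map(mechanism_map)
```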
We used antimicrobial susceptibility results to map each culture to one of five resistance mechanisms.
We chose these antimicrobial resistance mechanisms because of their prevalence in antimicrobial-resistant infections. We mapped cultures to their resistance mechanisms using information from free-text notations, comments, and antibiotic susceptibility lists. For the mapping, we used a list of common phrases (developed from clinical knowledge) that indicated the resistance mechanism (Table 4). We classified any culture that could not be mapped to any of the five mechanisms as a non-resistant infection.
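The phrase-based mapping can be sketched as a simple lookup. The phrase lists here are hypothetical placeholders; the actual phrases come from Table 4, and the mechanism names below are not the study's definitions:

```python
# Illustrative phrase-based mapping; phrases and mechanism names are
# placeholders standing in for the clinically derived lists in Table 4.
PHRASES = {
    "mechanism_A": ["methicillin resistant", "mrsa"],
    "mechanism_B": ["vancomycin resistant", "vre"],
}

def map_mechanism(comment: str) -> str:
    """Return the first mechanism whose phrase appears in the free-text
    culture comment, or 'non-resistant' if no phrase matches."""
    text = comment.lower()
    for mechanism, phrases in PHRASES.items():
        if any(p in text for p in phrases):
            return mechanism
    return "non-resistant"
```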
To avoid counting the same infection repeatedly, we treated cultures from the same individual during the same encounter with the same organism and resistance mechanism as a single case: when subsequent cultures were obtained within ninety days of the initial positive culture, we counted only the first occurrence.
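A deduplication of this kind might be implemented as follows. The column names, and the simplification that the ninety-day window is anchored to the first positive culture in each group, are assumptions:

```python
import pandas as pd

def deduplicate_cultures(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the first positive culture per patient/encounter/organism/
    mechanism group and drop repeat cultures drawn within 90 days of it.
    Column names and the fixed 90-day anchor are illustrative assumptions."""
    df = df.sort_values("culture_time")
    keys = ["patient_id", "encounter_id", "organism", "mechanism"]
    first = df.groupby(keys)["culture_time"].transform("min")
    is_first = df["culture_time"] == first
    within_90 = (df["culture_time"] - first) <= pd.Timedelta(days=90)
    return df[is_first | ~within_90]
```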
Mapping vulnerability indexes
Using the US Census Bureau’s Geocoder API, we first determined the census tract and block group associated with each patient’s address23. We then linked the patient’s census tract and block group to neighborhood socioeconomic indices, using the 2020 Social Vulnerability Index (SVI) and the 2020 Area Deprivation Index (ADI), respectively. The ADI quantifies the level of deprivation in a particular geographic area (census block group) using seventeen US Census variables including poverty, education, housing, and employment24,25,26, while the SVI describes the vulnerability of a neighborhood (census tract) to natural, man-made, or disease-based disasters. The SVI uses sixteen US Census variables that account for the location’s socioeconomic status, household characteristics, racial and ethnic status, and housing and transportation types25. Higher ADI and SVI values indicate worse deprivation and increased vulnerability, respectively.
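Extracting the tract and block group identifiers from a geocoder response might look like the sketch below. The JSON layout shown is an assumption modeled on the Census Geocoder's "geographies" endpoint and may differ from the exact fields returned:

```python
# Sketch of parsing a US Census Bureau Geocoder "geographies" response to
# recover the census tract and block group GEOIDs. The response structure
# here is an assumption; consult the Geocoder API documentation for the
# exact layers and field names returned for a given benchmark/vintage.
def extract_geography(response: dict) -> tuple:
    match = response["result"]["addressMatches"][0]
    geos = match["geographies"]
    tract = geos["Census Tracts"][0]["GEOID"]               # 11-digit tract GEOID
    block_group = geos["Census Block Groups"][0]["GEOID"]   # 12-digit GEOID
    return tract, block_group

# A mocked response with the assumed structure (Dallas County example GEOIDs):
sample = {"result": {"addressMatches": [{"geographies": {
    "Census Tracts": [{"GEOID": "48113019204"}],
    "Census Block Groups": [{"GEOID": "481130192041"}],
}}]}}
```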
Machine learning methods
We intended our Machine Learning (ML) models to support the empiric antimicrobial prescribing decisions a clinician would make during a patient encounter. Therefore, we only included features that were available at the time the culture was originally obtained. These features included age, sex, race, ethnicity, socioeconomic indices, insurance status, and past medical history. The past medical history included diabetes, cardiovascular disease, liver disease, dementia, cancer diagnoses, and hemi-/paraplegia, as well as prior resistant infections, prior antibiotic use, and prior immunosuppressant therapy. We intentionally kept the number of features12 small to allow for simpler models, easier application in real healthcare settings, and future replication by others (Supplementary Table 1).
We used a binary classifier to predict whether a culture was antibiotic resistant, and a multi-class classifier to identify the most likely resistance mechanism when the binary classifier predicted a culture to be resistant. We used three supervised machine learning classifiers: penalized logistic regression, random forest, and XGBoost. We divided the dataset into training and testing sets, with the testing set comprising 20% of the entire dataset (Supplementary Tables 2 and 3).
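The 80/20 split can be sketched with scikit-learn; the synthetic data, random seed, and use of stratification on the label are assumptions for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Illustrative 80/20 train/test split; the data are synthetic and the
# stratification choice and seed are assumptions, not the study's settings.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))             # stand-ins for age, indices, history flags
y = (rng.random(1000) < 0.2).astype(int)   # imbalanced 'resistant' label

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)
```

Stratifying on the label keeps the (imbalanced) proportion of resistant cultures approximately equal in the training and testing sets.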
For the L2-penalized logistic regression (ridge regression) models, we tuned the regularization parameter (C) of the binary and multi-class classifiers using a 5-fold, stratified, cross-validated grid search with C values ranging from 10⁻³ to 10³. We tuned the random forest models using a 5-fold, stratified, cross-validated grid search over parameters such as maximum depth, number of features for the best split, quality criteria, minimum sample requirements for leaf nodes and internal nodes, and the number of estimators in the forest24. We optimized the XGBoost models using a 5-fold, stratified, cross-validated grid search over parameters such as the subsample ratio of columns, the number of training instances, learning rate, tree depth, and number of estimators.
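A minimal version of the C grid search for the ridge logistic regression, assuming scikit-learn's `GridSearchCV` with a logarithmic grid and AUC-ROC scoring (the grid spacing and scoring choice are assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Sketch of tuning C for the L2-penalized logistic regression on synthetic
# data; grid spacing, seed, and scoring metric are illustrative assumptions.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 6))
y = (X[:, 0] + rng.normal(scale=1.0, size=400) > 0.8).astype(int)

param_grid = {"C": np.logspace(-3, 3, 7)}  # 10^-3 ... 10^3
search = GridSearchCV(
    LogisticRegression(penalty="l2", class_weight="balanced", max_iter=1000),
    param_grid,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="roc_auc",
)
search.fit(X, y)
best_C = search.best_params_["C"]
```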
To address class imbalance in the dataset, we used balanced class weighting for the logistic regression and random forest models and, following the library’s documentation, the scale positive weight parameter for the XGBoost model27.
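XGBoost's documentation conventionally recommends setting `scale_pos_weight` to the ratio of negative to positive training examples. A minimal computation, assuming a 0/1 'resistant' label:

```python
# Minimal sketch: scale_pos_weight = (# negative examples) / (# positive
# examples), per XGBoost's guidance for imbalanced binary classification.
def scale_pos_weight(labels):
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    return n_neg / n_pos

# e.g., 900 non-resistant vs 100 resistant cultures -> weight of 9.0
weight = scale_pos_weight([0] * 900 + [1] * 100)
```

The resulting value would then be passed as `XGBClassifier(scale_pos_weight=weight)`.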
We evaluated the performance of our binary classification models using the area under the receiver operating characteristic curve (AUC-ROC) as our primary measure. The negative predictive value (NPV) also influenced the selection of the best binary classification model. A higher NPV indicated that the model was well suited to identifying infections with non-resistant organisms, allowing the clinician to select narrower-spectrum empiric antibiotic prescriptions. Confusion matrices, precision-recall curves, feature importance, and odds ratios for the logistic regression models were secondary measures used to evaluate our classification models.
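NPV falls directly out of the confusion matrix (TN / (TN + FN)): of all cultures the model predicts as non-resistant, the fraction that truly are. A short sketch with made-up predictions:

```python
from sklearn.metrics import confusion_matrix

# Computing negative predictive value (NPV) from a confusion matrix;
# the labels below are toy data, not study results.
y_true = [0, 0, 0, 1, 1, 0, 1, 0]  # 1 = resistant culture
y_pred = [0, 0, 1, 1, 0, 0, 1, 0]  # model predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
npv = tn / (tn + fn)  # how often a 'non-resistant' prediction is correct
```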
IRB approval
The Institutional Review Board (IRB) at the University of Texas Southwestern Medical Center approved our study (STU-2023-0583).