Artificial Intelligence model to predict resistances in Gram-negative bloodstream infections


May 30, 2025

This was an observational cohort study of all consecutive adult patients hospitalized at our center and diagnosed with Gram-negative bloodstream infection (GN-BSI) from January 1st 2013 to December 31st 2019. Patients were excluded if they were on palliative care, if death occurred within 48 h of the index BSI, or if clinical data were incomplete or unavailable.

The study was conducted according to the Declaration of Helsinki and Good Clinical Practice guidelines and approved by the local Ethics Committee (no. 894/2021/Oss/AOUBo). Research ethics board approval was obtained in agreement with Comitato Etico Area Vasta Emilia Centro della Regione Emilia-Romagna (CE-AVEC). Informed consent was obtained before enrollment.

Data sources and predictor variables

Patients were screened for enrolment using local microbiology registries. Clinical charts and hospital electronic records were data sources. Data were gathered using a dedicated REDCap electronic case report form (eCRF) hosted by Alma Mater Studiorum – University of Bologna¹⁸.

The primary endpoint was resistance to four antibiotic classes: fluoroquinolones (FQ-R), third-generation cephalosporins (3GC-R), beta-lactam/beta-lactamase inhibitors (BL/BLI-R) and carbapenems (C-R). Beta-lactam/beta-lactamase inhibitors included amoxicillin/clavulanate and piperacillin/tazobactam for Enterobacterales, and only piperacillin/tazobactam for Pseudomonas spp.

Exposure variables included demographics (age, gender), diabetes (uncomplicated disease or end-organ disease), congestive heart failure, dementia, chronic obstructive pulmonary disease (COPD), chronic kidney disease (CKD), liver disease, solid organ tumor (localized or metastatic), comorbidities according to the Charlson comorbidity index¹⁹, presence of immunosuppressive conditions (hematopoietic cell transplantation, neutropenia, solid organ transplantation, HIV, corticosteroid therapy), length of hospital stay (LOS) from hospital admission to index BSI, and BSI acquisition source (hospital or community acquired) along with inpatient ward (i.e., internal medicine, intensive care unit (ICU), surgery, emergency department). BSI sources, defined according to US Centers for Disease Control and Prevention criteria²⁰, were also recorded. BSI was defined as “primary” when the source of infection was unidentified. Data about microbiological strains were grouped into Enterobacterales (Klebsiella spp., Escherichia coli, Enterobacter spp.) and non-fermentative Gram-negative (NF-GN) (Pseudomonas spp., Acinetobacter spp.). We also recorded rectal swab colonization at BSI onset. Correlation heatmaps among variables are shown in Supplementary Fig. 6a, b.

Data analysis

The analyses were carried out within a machine learning framework developed using the scikit-learn Python package. The problem posed here falls into the category of classification tasks, since the aim was to predict the resistance or susceptibility of a given pathogen to four antibiotic families from clinical and demographic features. A multivariable logistic classifier was used for this purpose since it is a well-calibrated model for binary classification.

A comparative analysis was carried out to identify the most predictive model among an extreme gradient boosting classifier, a multi-layer perceptron and a logistic regression. Although each model produced robust and consistent performance, the logistic regression model was chosen for this study not only because of its interpretability (especially when compared to a black-box model such as the multi-layer perceptron), but in particular because of its higher accuracy in predicting the resistant class for the four antibiotic classes.

The machine learning workflow consists of a One Versus Rest (OVR) framework in which the multivariable logistic classifier learns to classify each pathogen as resistant or susceptible to each of the four antibiotic classes. The model was trained within a nested cross-validation (CV) scheme to avoid overfitting and obtain more robust results. The purpose of the 5-fold inner CV was to fine-tune the hyperparameters of the logistic regression, i.e., the type of penalization (none, lasso, ridge or elastic-net) and the penalization factor (see Supplementary Table 1). The 10-fold outer CV evaluated the robustness of the model to the training and test split, since the validation metrics were computed on the test set (corresponding to 10% of the dataset) for each split. A sketch of the nested CV workflow is presented in Supplementary Fig. 7.
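The nested CV described above can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the study's code: the grid values, the `saga` solver, the use of AUROC as the inner tuning score and the random seeds are all assumptions, and a very large `C` stands in for the non-penalized model. In the study, the same procedure would be repeated for each of the four antibiotic classes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic stand-in for one antibiotic class (1 = resistant, 0 = susceptible).
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           weights=[0.7, 0.3], random_state=0)

# Hypothetical grid: l1_ratio 0.0 = ridge, 1.0 = lasso, 0.5 = elastic-net.
# C is the inverse penalization strength; 1e6 approximates "no penalization".
param_grid = {"l1_ratio": [0.0, 0.5, 1.0], "C": [0.1, 1.0, 1e6]}

base_model = LogisticRegression(penalty="elasticnet", solver="saga", max_iter=2000)

inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)   # tuning
outer_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)  # evaluation

# Inner loop: hyperparameter search, refit on each outer training split.
tuned_model = GridSearchCV(base_model, param_grid, cv=inner_cv, scoring="roc_auc")

# Outer loop: one AUROC value per held-out fold (each about 10% of the data).
outer_auc = cross_val_score(tuned_model, X, y, cv=outer_cv, scoring="roc_auc")

print(f"AUROC over outer folds: {outer_auc.mean():.3f} +/- {outer_auc.std():.3f}")
```

Because the grid search is wrapped inside the outer loop, the hyperparameters are never tuned on the data used to report performance.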

Before training the model, a pre-processing step was required. All categorical variables in the dataset were first one-hot encoded to obtain dummy variables; the two continuous variables, i.e., patient age and length of hospital stay, were then scaled through standardization.
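A pre-processing step of this kind could look like the sketch below. The column names and toy records are hypothetical, and the choice of `ColumnTransformer` is an assumption about how the encoding and scaling might be combined, not a description of the authors' implementation.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy records; column names and values are hypothetical.
df = pd.DataFrame({
    "age": [54, 71, 66, 80],
    "los_days": [0, 12, 3, 25],
    "ward": ["ICU", "Surgery", "Internal medicine", "ICU"],
    "acquisition": ["hospital", "community", "community", "hospital"],
})

preprocess = ColumnTransformer(
    transformers=[
        # Dummy variables for the categorical columns.
        ("categorical", OneHotEncoder(handle_unknown="ignore"), ["ward", "acquisition"]),
        # Zero mean, unit variance for the two continuous columns.
        ("continuous", StandardScaler(), ["age", "los_days"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

X = preprocess.fit_transform(df)
print(X.shape)  # (4, 7): 3 ward dummies + 2 acquisition dummies + 2 scaled columns
```

When such a transformer is combined with cross-validation, placing it in a `Pipeline` with the classifier lets the scaling statistics be learned on the training folds only.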

In addition to a non-penalized model, three regularization techniques were considered: L1 penalization (lasso), L2 penalization (ridge), and a balanced mix of the two (elastic-net).

After training, the models were validated using three metrics: area under the receiver operating characteristic curve (AUROC), weighted F1-score and Matthews correlation coefficient (MCC). These are among the most common choices for validating a binary classifier, especially when the dataset contains unbalanced classes, since each provides a different insight into model performance.
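As an illustration, the three metrics can be computed with `sklearn.metrics` as in the sketch below. The helper name `evaluate_binary`, the 0.5 decision threshold and the toy labels are assumptions made for the example.

```python
import numpy as np
from sklearn.metrics import f1_score, matthews_corrcoef, roc_auc_score

def evaluate_binary(y_true, y_prob, threshold=0.5):
    """Return AUROC, weighted F1 and MCC for one binary task.

    y_prob holds predicted probabilities of the resistant class (label 1);
    the threshold used to derive hard labels is an assumption.
    """
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    y_pred = (y_prob >= threshold).astype(int)
    return {
        "auroc": roc_auc_score(y_true, y_prob),                       # threshold-free ranking quality
        "f1_weighted": f1_score(y_true, y_pred, average="weighted"),  # per-class F1 weighted by support
        "mcc": matthews_corrcoef(y_true, y_pred),                     # uses all four confusion-matrix cells
    }

# Toy example with an unbalanced outcome (3 resistant out of 8).
y_true = [0, 0, 0, 1, 1, 0, 1, 0]
y_prob = [0.1, 0.3, 0.6, 0.8, 0.4, 0.2, 0.9, 0.05]
scores = evaluate_binary(y_true, y_prob)
print(scores)
```

AUROC is computed from the probabilities, while the weighted F1-score and MCC depend on the chosen threshold.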

As described above, this work considers four antibiotic classes. The first step of the machine learning framework was to train a logistic regression for each class independently within a nested cross-validation framework. More specifically, for each antibiotic class, the data were split into the 10 folds of the outer CV, while the inner CV used 5 folds.

Thus, for each antibiotic class, 10 values were obtained for each of the three metrics (one per iteration of the outer CV), which made it possible to compute a measure of variability (standard deviation) over the 10 iterations.

Once the model was trained, the coefficient of each feature was extracted to assess the impact of each variable on the model outcome. Since each logistic regression was trained 10 times (once per fold of the 10-fold outer CV), each feature was associated with 10 coefficients, which were summarized by their mean and standard deviation.
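A minimal sketch of this coefficient summary is given below, assuming synthetic data and hypothetical feature names. For brevity it omits the inner hyperparameter search and uses scikit-learn's default (ridge-penalized) logistic regression, so it illustrates only the per-fold extraction and aggregation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic stand-in for one antibiotic class.
X, y = make_classification(n_samples=300, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

outer_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# return_estimator=True keeps the model fitted on each outer training split.
cv_out = cross_validate(LogisticRegression(max_iter=1000), X, y,
                        cv=outer_cv, return_estimator=True)

# One row per outer fold, one column per feature.
coefs = np.vstack([est.coef_.ravel() for est in cv_out["estimator"]])
coef_mean = coefs.mean(axis=0)
coef_std = coefs.std(axis=0)

for name, mean, std in zip(feature_names, coef_mean, coef_std):
    print(f"{name}: {mean:+.3f} +/- {std:.3f}")
```

A small standard deviation relative to the mean suggests that a feature's estimated effect is stable across training splits.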

Finally, we evaluated the positive predictive value (PPV) and the negative predictive value (NPV) of our model, and specifically the false omission rate (FOR) for each antibiotic class, defined as FOR = 1 – NPV, which more accurately represents the risk of wrongly classifying a pathogen as susceptible when it is actually resistant²¹.
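The relation between these quantities can be made concrete with a small worked example. The function name `predictive_values` and the toy labels below are hypothetical; label 1 denotes resistant and 0 susceptible.

```python
def predictive_values(y_true, y_pred):
    """Compute PPV, NPV and FOR = 1 - NPV from binary labels (1 = resistant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against empty denominators (no positive or no negative predictions).
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return {"ppv": ppv, "npv": npv, "for": 1 - npv}

# Toy example: 2 true positives, 1 false positive, 5 true negatives, 2 false negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1, 0, 0]
result = predictive_values(y_true, y_pred)
print(result)  # PPV = 2/3, NPV = 5/7, FOR = 2/7
```

Here 7 isolates are predicted susceptible and 2 of them are actually resistant, so FOR = 2/7: the fraction of "susceptible" calls that would be wrong.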
