Prediction and interpretation of antibiotic-resistance genes occurrence at recreational beaches using machine learning models

Sara Iftikhar, Asad Mustafa Karim, Aoun Murtaza Karim, Mujahid Aizaz Karim, Muhammad Aslam, Fazila Rubab, Sumera Kausar Malik, Jeong Eun Kwon, Imran Hussain, Esam I. Azhar, Se Chan Kang, Muhammad Yasir

Research output: Contribution to journalArticlepeer-review


Antibiotic-resistant bacteria and antibiotic resistance genes (ARGs) are pollutants of worldwide concern that seriously threaten public health and ecosystems. Machine learning (ML) prediction models have been applied to predict ARGs in beach waters. However, the existing studies were conducted at a single location and had low prediction performance. Moreover, ML models are “black boxes” that do not reveal their predictions' internal nuances and mechanisms. This lack of transparency and trust can result in serious consequences when using these models in high-stakes decisions. In this study, we developed a gradient boosted regression tree based (GBRT) ML model and then described its behavior using six explainable artificial intelligence (XAI) model-agnostic explanation methods. We used hydro-meteorological and qPCR data from the beaches in South Korea and Pakistan and developed ML prediction models for aac (6′-lb-cr), sul1, and tetX with 10-fold time-blocked cross-validation performances of 4.9, 2.06 and 4.4 root mean squared logarithmic error, respectively. We then analyzed the local and global behavior of the developed ML model using four interpretation methods. The developed ML models showed that water temperature, precipitation and tide are the most important predictors for prediction of ARGs at recreational beaches. We show that the model-agnostic interpretation methods not only explain the behavior of the ML model but also provide insights into the behavior of the ML model under new unseen conditions. Moreover, these post-processing techniques can be a debugging tool for ML-based modeling.

Original languageEnglish
Article number116969
JournalJournal of Environmental Management
StatePublished - 15 Feb 2023


  • Antibiotic resistance genes
  • Artificial intelligence
  • Black box models
  • Explainable
  • Machine learning
  • Recreational beaches


Dive into the research topics of 'Prediction and interpretation of antibiotic-resistance genes occurrence at recreational beaches using machine learning models'. Together they form a unique fingerprint.

Cite this