A Hybrid Machine Learning Model for TB/HIV Progression Prediction Using Resource-Constrained Electronic Health Record (EHR) Data in Zambia

Joe Phiri
Aaron Zimba
Chiyaba Njovu

Abstract

Tuberculosis (TB) remains a leading cause of mortality among people living with HIV (PLHIV) in Zambia, posing a major challenge to an already strained health system. Zambia’s national electronic health record (EHR) systems contain valuable longitudinal data that could support predictive tools for early TB intervention. However, data sparsity, limited analytical capacity, and the poor interpretability of machine learning (ML) models have slowed clinical adoption. This study proposes a hybrid ML framework that integrates Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Long Short-Term Memory (LSTM) networks, with SHapley Additive exPlanations (SHAP) added for transparency. The Design Science Research (DSR) methodology guides iterative model development, evaluation, and deployment. Preprocessing uses Multiple Imputation by Chained Equations (MICE) to handle missing data, Min-Max normalization for feature scaling, and the Synthetic Minority Over-sampling Technique (SMOTE) for class balancing. Data mapping from the EHRs has been completed, and the preprocessing pipeline is under development. Initial training and validation are being conducted on synthetic EHR datasets, with performance measured by F1 score and area under the precision-recall curve (AUC-PR). Prototype models will be tested in simulated clinical workflows to assess feasibility and responsiveness. The research contributes a novel ensemble-based approach that fuses static and temporal variables with explainable AI, supporting early TB/HIV progression prediction and clinician trust in low-resource settings. Future work will focus on real-world validation, stakeholder feedback, and integration into national digital health systems.
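
As an illustration of two of the preprocessing steps named in the abstract, the following is a minimal standard-library sketch of Min-Max normalization followed by SMOTE-style oversampling. It is not the authors' pipeline: the function names (`min_max_scale`, `smote`), the neighbour count `k`, and the toy patient records are all assumptions made for the example, and the MICE imputation step is omitted entirely.

```python
import math
import random


def min_max_scale(rows):
    """Rescale every column of `rows` to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / span if span else 0.0 for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


def smote(minority, n_new, k=3, seed=0):
    """Create `n_new` synthetic minority rows by interpolating between a
    randomly chosen minority row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours by Euclidean distance, excluding the base row itself.
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy records (invented values, not study data):
# [CD4 count, log10 viral load, BMI]; label 1 = progressed to TB.
features = [
    [650, 2.1, 24.0],
    [540, 2.8, 22.5],
    [710, 1.9, 26.1],
    [480, 3.0, 21.0],
    [600, 2.4, 23.3],
    [180, 5.2, 17.8],
    [220, 4.9, 18.4],
    [150, 5.6, 16.9],
]
labels = [0, 0, 0, 0, 0, 1, 1, 1]

scaled = min_max_scale(features)
minority = [row for row, y in zip(scaled, labels) if y == 1]
n_needed = labels.count(0) - len(minority)  # rows required to balance classes
new_rows = smote(minority, n_new=n_needed)

balanced_X = scaled + new_rows
balanced_y = labels + [1] * len(new_rows)
print(len(balanced_X), balanced_y.count(0), balanced_y.count(1))
```

In a real workflow the scaling bounds would be learned from the training split only, and oversampling would be applied only to that split, so that synthetic rows never leak into validation or test data.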

Article Details

How to Cite
Phiri, J., Zimba, A., & Njovu, C. (2025). A Hybrid Machine Learning Model for TB/HIV Progression Prediction Using Resource-Constrained Electronic Health Record (EHR) Data in Zambia. Proceedings of International Conference for ICT (ICICT) - Zambia, 7(1), 88–90. Retrieved from https://ictjournal.icict.org.zm/index.php/icict/article/view/416
Section
Articles