Predicting Climate Change Related Extreme Natural Disasters Using Machine Learning in Zambia
Climate change is one of the most pressing concerns facing humanity today. It has increased the frequency of natural disasters that threaten the social and economic stability of populations. Zambia remains highly vulnerable to disasters because the country still lacks an effective Early Warning System (EWS). This study evaluates Machine Learning (ML) algorithms that have been successfully applied to disaster prediction in order to develop a model for Zambia. Six ML algorithms, namely Logistic Regression (LR), Random Forest (RF), K-Nearest Neighbor (KNN), Gaussian Naive Bayes (GNB), Decision Tree (DT), and Support Vector Machine (SVM), are compared, and the best-performing one is selected. Historical climate data was obtained from the Zambia Meteorological Department (ZMD), while historical natural disaster data was obtained online because it is not available locally. The results show that LR and SVM outperformed the other algorithms, each achieving 73.0% accuracy. LR was chosen to produce the final model because it has a shorter computational time than SVM. The model was then incorporated into a web service and an Android application for deployment. However, the large number of outliers, missing values, and highly imbalanced classes affected the model’s performance. ML data cleaning and feature engineering techniques, such as data imputation and oversampling, were applied, but some challenges persist because these techniques have their own limitations. The model’s performance on real-world data is therefore likely to be affected.
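The abstract names data imputation and oversampling without specifying which variants were used. As an illustration only, the following minimal sketch shows one simple form of each: median imputation for missing values and random oversampling of the minority class. The function names, the two feature columns, and the toy records are hypothetical and are not taken from the ZMD data or from the study's pipeline.

```python
import random
import statistics
from collections import Counter


def impute_median(rows):
    """Replace None entries in each column with that column's median."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        medians.append(statistics.median(observed))
    return [
        [medians[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [row for row, label in zip(rows, labels) if label == cls]
        for _ in range(target - count):
            out_rows.append(list(rng.choice(pool)))
            out_labels.append(cls)
    return out_rows, out_labels


# Made-up toy records (assumption): [rainfall_mm, max_temp_c];
# label 1 = disaster recorded, 0 = no disaster.
features = [[12.0, 31.0], [None, 29.5], [140.0, None], [8.0, 33.0], [15.0, 30.0]]
labels = [0, 0, 1, 0, 0]

clean = impute_median(features)
balanced_rows, balanced_labels = random_oversample(clean, labels)
print(Counter(balanced_labels))
```

Both steps illustrate the kind of limitation the abstract alludes to. Median imputation replaces missing readings with a single typical value, which can hide real extremes, and random oversampling only repeats existing minority records rather than adding new information. Oversampling is also normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.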