A Comparative Study of Resampling Techniques for Handling Class Imbalance in Binary Classification

Hussein Kareem Habash

doi:10.47134/ppm.v2i4.1990

Authors

Hussein Kareem Habash, Ardabil University

Keywords:

Class Imbalance, Resampling Techniques, Binary Classification, Performance Metrics (AUROC, PR-AUC), Reproducible Machine Learning

Abstract

Class imbalance skews most binary classifiers toward the majority class, hiding the very events that matter (e.g., fraud and malignancy). We present a compact, quick-to-replicate comparison of four representative resampling approaches: Random Over-Sampling (ROS), SMOTE, the hybrid SMOTE-ENN cleaner, and the ensemble balancer EasyEnsemble. Each is paired with two widely used learners, Logistic Regression and Random Forest. Experiments run on two public tabular benchmarks that span extreme (0.17% fraud) and moderate (2.3% cancer) skew. A simple two-fold stratified split replaces heavier cross-validation, and each model is evaluated on two metrics suited to imbalanced data: AUROC and PR-AUC. The full experiment finishes in under ten minutes on a laptop yet reproduces the qualitative hierarchy seen in much larger studies: SMOTE-ENN attains the best PR-AUC on both datasets, EasyEnsemble leads on AUROC, and naïve ROS trails in every case. Three visuals make the findings easy to read: (i) an end-to-end pipeline schematic, (ii) a bar chart of class ratios, and (iii) a radar plot of mean PR-AUC scores. All code and figures are provided in a single Jupyter notebook (supplementary ZIP); one command installs the dependencies, and a second reproduces every number and image. This streamlined study offers practitioners an evidence-based starting point while remaining fully reproducible for reviewers and students alike.
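To illustrate why the study reports both AUROC and PR-AUC, the following minimal sketch computes the two metrics by hand on a made-up ranking with 2% positives. It is not taken from the notebook: the function names and the toy scores are hypothetical, and average precision is used here as one common step-wise estimate of PR-AUC, which may differ from the estimator the study uses.

```python
# Toy illustration (hypothetical data, not from the paper's notebook).

def auroc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision measured at the rank of each positive.

    Assumes no tied scores; used as a step-wise estimate of PR-AUC.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# 98 negatives with scores 0, 10, ..., 970 and 2 positives near the top.
neg_scores = [10 * i for i in range(98)]
pos_scores = [955, 925]
y_true = [0] * len(neg_scores) + [1] * len(pos_scores)
scores = neg_scores + pos_scores

roc = auroc(y_true, scores)              # 189/196, about 0.964
ap = average_precision(y_true, scores)   # 13/42, about 0.310
prevalence = sum(y_true) / len(y_true)   # 0.02, the PR-AUC of a random ranker

print(f"AUROC={roc:.3f}  PR-AUC={ap:.3f}  baseline={prevalence:.2f}")
```

In this toy case the two positives sit at ranks 3 and 7 out of 100. AUROC looks excellent because almost every negative is ranked below them, while PR-AUC stays modest because the few negatives ranked above them already outnumber the positives. This is the kind of gap that makes PR-AUC a useful complement to AUROC under heavy skew.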
