Oversampling Method To Handling Imbalanced Datasets Problem In Binary Logistic Regression Algorithm

Bibliographic Details
Main Authors: Ustyannie, Windyaning (Author), Suprapto, Suprapto (Author)
Format: EJournal Article
Published: IndoCEISS in collaboration with Universitas Gadjah Mada, Indonesia, 2020-01-31.
Description
Summary: Class imbalance is a condition in which one class has a higher percentage of the data than the other, which can affect classification accuracy. Logistic regression is one data mining method that can be used for classification. This research uses the RWO-sampling method with a random replicate approach to generate synthetic data for discrete attributes. The results show that the method can handle the class imbalance problem: RWO-sampling with the random replicate approach achieves better accuracy than both RWO-sampling with the roulette approach and ROS. Relative to RWO-sampling with the roulette approach, accuracy with the random replicate approach increased by an average of 15.55% per dataset; relative to the ROS method, it increased by an average of 3.7% per dataset. Furthermore, in testing the underfitting problem in logistic regression, oversampling performed better than no oversampling, with accuracy increasing by an average of 2.3% per dataset.
Item Description: https://jurnal.ugm.ac.id/ijccs/article/view/37415
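
For readers unfamiliar with the ROS baseline named in the summary, the following is a minimal sketch of plain random oversampling for a binary dataset. It is not the authors' RWO-sampling method or their random replicate approach, whose details are not given in this record; the function name `random_oversample`, the `seed` parameter, and the toy data are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Balance a binary dataset by replicating randomly chosen minority rows.

    Illustrative ROS sketch only; not the RWO-sampling procedure from the article.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")

    # most_common(2) lists the larger class first, then the smaller one.
    (majority, n_major), (minority, n_minor) = counts.most_common(2)

    # Collect the minority rows and draw copies (with replacement)
    # until the minority class is as large as the majority class.
    minority_rows = [row for row, y in zip(rows, labels) if y == minority]
    extra = [rng.choice(minority_rows) for _ in range(n_major - n_minor)]

    return list(rows) + extra, list(labels) + [minority] * len(extra)


# Hypothetical toy data: four rows of class 0 and one row of class 1.
rows = [[0, 1], [1, 1], [0, 0], [1, 0], [1, 1]]
labels = [0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))  # each class now has 4 rows
```

The balanced rows and labels could then be passed to any binary logistic regression implementation. Because ROS only copies existing rows, it adds no new attribute values, which is the limitation that synthetic-data approaches such as RWO-sampling aim to address.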