Oversampling Method To Handling Imbalanced Datasets Problem In Binary Logistic Regression Algorithm

Class imbalance is a condition in which one class has a higher percentage than the other, which can affect accuracy. Logistic regression is one data mining method that can be used for classification. The method used in this research is the RWO-sampling method using a random replicate approac...
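To make the oversampling idea concrete, here is a minimal sketch of plain random oversampling, which is what the ROS baseline named in the full abstract is assumed to mean here. It is not the RWO-sampling method studied in the article; the function name `random_oversample` and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows that originally belong to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Add random duplicates until the class reaches the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 8 majority rows (label 0), 2 minority rows (label 1).
rows = [[i, i % 3] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

After the call, both classes hold 8 rows each. A classifier such as logistic regression would then be trained on the balanced rows, while the test set is left untouched so that the reported accuracy still reflects the original class distribution.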

Bibliographic Details
Main Authors: Ustyannie, Windyaning (Author), Suprapto, Suprapto (Author)
Format: EJournal Article
Published: IndoCEISS in collaboration with Universitas Gadjah Mada, Indonesia, 2020-01-31.
LEADER 02335 am a22003013u 4500
001 IJCSS_37415
042 |a dc 
100 1 0 |a Ustyannie, Windyaning  |e author 
100 1 0 |e contributor 
700 1 0 |a Suprapto, Suprapto  |e author 
245 0 0 |a Oversampling Method To Handling Imbalanced Datasets Problem In Binary Logistic Regression Algorithm 
260 |b IndoCEISS in collaboration with Universitas Gadjah Mada, Indonesia.,   |c 2020-01-31. 
500 |a https://jurnal.ugm.ac.id/ijccs/article/view/37415 
520 |a Class imbalance is a condition in which one class has a higher percentage than the other, which can affect accuracy. Logistic regression is one data mining method that can be used for classification. The method used in this research is the RWO-sampling method using a random replicate approach to generate synthetic data for discrete attributes. The results show that the method can handle the class imbalance problem: RWO-sampling with the random replicate approach achieves better accuracy than both RWO-sampling with the roulette approach and ROS. Compared with RWO-sampling with the roulette approach, accuracy increased by an average of 15.55% per dataset. Compared with the ROS method, accuracy increased by an average of 3.7% per dataset. Furthermore, in testing the underfitting problem in logistic regression, oversampling performed better than no oversampling, with accuracy increasing by an average of 2.3% per dataset. 
540 |a Copyright (c) 2020 IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 
540 |a http://creativecommons.org/licenses/by-sa/4.0 
546 |a eng 
690 |a Computer Science 
690 |a Imbalanced Datasets; RWO-Sampling; Logistic Regression 
655 7 |a info:eu-repo/semantics/article  |2 local 
655 7 |a info:eu-repo/semantics/publishedVersion  |2 local 
655 7 |2 local 
786 0 |n IJCCS (Indonesian Journal of Computing and Cybernetics Systems); Vol 14, No 1 (2020): January; 1-10 
786 0 |n 2460-7258 
786 0 |n 1978-1520 
787 0 |n https://jurnal.ugm.ac.id/ijccs/article/view/37415/26952 
856 4 1 |u https://jurnal.ugm.ac.id/ijccs/article/view/37415  |z Get Fulltext 
856 4 1 |u https://jurnal.ugm.ac.id/ijccs/article/view/37415/26952  |z Get Fulltext