Modified balanced random forest for improving imbalanced data prediction

This paper proposes a Modified Balanced Random Forest (MBRF) algorithm as a classification technique for imbalanced data. MBRF modifies the Balanced Random Forest procedure by applying a clustering-based under-sampling strategy for each data bootstrap decision tre...
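
As a rough illustration of the idea in the abstract, the sketch below bootstraps a data set and then under-samples the majority class cluster by cluster before a tree would be trained. It is not the authors' implementation: the record does not say how clusters are turned into a sample, so the round-robin draw, the tiny `kmeans` helper, the function name `cluster_undersample`, and the label convention (1 = minority, 0 = majority) are all assumptions. K-means is used here only because it needs no third-party library; the abstract reports Ward hierarchical clustering as the best of the four techniques tested.

```python
import random


def kmeans(points, k, rng, iters=20):
    """Tiny k-means (assumed helper); returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_undersample(X, y, k=3, seed=0):
    """Return row indices of one balanced bootstrap sample (hypothetical scheme)."""
    rng = random.Random(seed)
    # Bootstrap: draw len(X) row indices with replacement.
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    minority = [i for i in idx if y[i] == 1]
    majority = [i for i in idx if y[i] == 0]
    if not minority or len(majority) <= len(minority):
        return idx  # nothing to under-sample
    k = min(k, len(majority))
    labels = kmeans([X[i] for i in majority], k, rng)
    clusters = [[i for i, lab in zip(majority, labels) if lab == c]
                for c in range(k)]
    for members in clusters:
        rng.shuffle(members)
    # Assumption: take majority rows round-robin across clusters until the
    # majority count matches the minority count.
    kept = []
    while len(kept) < len(minority):
        for members in clusters:
            if members and len(kept) < len(minority):
                kept.append(members.pop())
    return minority + kept
```

In a full forest, each tree would be fitted on the rows returned by one such call (with a different seed per tree), and predictions would be aggregated across trees as in an ordinary Random Forest.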

Bibliographic Details
Main Authors: Agusta, Zahra Putri (Author), Adiwijaya, Adiwijaya (Author)
Other Authors: PT Telkom Indonesia, Graduated School Telkom University (Contributor)
Format: EJournal Article
Published: Universitas Ahmad Dahlan, 2019-03-31.
LEADER 02456 am a22003013u 4500
001 0 nhttps:__ijain.org_index.php_IJAIN_article_downloadSuppFile_255_56
042 |a dc 
100 1 0 |a Agusta, Zahra Putri  |e author 
100 1 0 |a PT Telkom Indonesia, Graduated School Telkom University  |e contributor 
700 1 0 |a Adiwijaya, Adiwijaya  |e author 
245 0 0 |a Modified balanced random forest for improving imbalanced data prediction 
260 |b Universitas Ahmad Dahlan,   |c 2019-03-31. 
500 |a https://ijain.org/index.php/IJAIN/article/view/255 
520 |a This paper proposes a Modified Balanced Random Forest (MBRF) algorithm as a classification technique for imbalanced data. MBRF modifies the Balanced Random Forest procedure by applying a clustering-based under-sampling strategy to the bootstrap sample of each decision tree in the Random Forest algorithm. To find the best-performing configuration, the proposed method is evaluated with four clustering techniques: K-means, spectral clustering, agglomerative clustering, and Ward hierarchical clustering. The experimental results show that Ward hierarchical clustering achieved the best performance, and that the proposed MBRF method outperformed the Balanced Random Forest (BRF) and Random Forest (RF) algorithms, with a sensitivity, or true positive rate (TPR), of 93.42%, a specificity, or true negative rate (TNR), of 93.60%, and the best AUC value of 93.51%. MBRF also reduced running time. 
540 |a Copyright (c) 2019 Zahra Putri Agusta, Adiwijaya Adiwijaya 
540 |a https://creativecommons.org/licenses/by-sa/4.0 
546 |a eng 
690 |a Imbalanced data; Random forest algorithm; Balanced random forest; Customer churn; Classification technique 
655 7 |a info:eu-repo/semantics/article  |2 local 
655 7 |a info:eu-repo/semantics/publishedVersion  |2 local 
655 7 |2 local 
786 0 |n International Journal of Advances in Intelligent Informatics; Vol 5, No 1 (2019): March 2019; 58-65 
786 0 |n 2548-3161 
786 0 |n 2442-6571 
787 0 |n https://ijain.org/index.php/IJAIN/article/view/255/ijain_v5i1_p58-65 
787 0 |n https://ijain.org/index.php/IJAIN/article/downloadSuppFile/255/56 
856 4 1 |u https://ijain.org/index.php/IJAIN/article/view/255/ijain_v5i1_p58-65  |z Get Fulltext 
856 4 1 |u https://ijain.org/index.php/IJAIN/article/downloadSuppFile/255/56  |z Get Fulltext
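
The three figures reported in field 520 are related by simple arithmetic: 93.51% is the mean of the TPR (93.42%) and the TNR (93.60%). That is what the area under a ROC curve built from a single operating point works out to, although the record does not state how the authors computed AUC, so treat this as an observation rather than their method. The helper name `single_point_auc` is hypothetical.

```python
def single_point_auc(tpr, tnr):
    """Area under the ROC polyline (0, 0) -> (1 - tnr, tpr) -> (1, 1).

    For a single operating point this reduces to the mean of TPR and TNR,
    also known as balanced accuracy. Works on fractions or percentages.
    """
    return (tpr + tnr) / 2


# Values taken from the abstract, in percent.
auc = single_point_auc(93.42, 93.60)
print(f"{auc:.2f}")  # prints 93.51
```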