Knowledge discovery from gene expression dataset using bagging lasso decision tree

Classifying high-dimensional data is a challenging task in data mining. Gene expression data is a type of high-dimensional data with thousands of features. This study proposes a method to extract knowledge from high-dimensional gene expression data through feature selection and classification. Las...
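
The pipeline named in this record's full abstract (lasso for feature selection, a CART decision tree for classification, and bagging with 50 replications combined by majority vote) can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn, uses L1-penalized logistic regression as the lasso step for a binary class, and runs on small synthetic data rather than the ovarian tumor dataset. The helper names `fit_lasso_tree` and `bagged_predict`, the penalty strength `C`, and all data sizes are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_lasso_tree(X, y, C=0.5, seed=0):
    """Select features with an L1 (lasso) penalty, then fit a CART tree on them."""
    lasso = LogisticRegression(penalty="l1", solver="liblinear", C=C, random_state=seed)
    lasso.fit(X, y)
    # Features whose coefficients were not shrunk to exactly zero are kept.
    selected = np.flatnonzero(lasso.coef_.ravel())
    if selected.size == 0:
        # Assumption: fall back to all features if the penalty removes everything.
        selected = np.arange(X.shape[1])
    tree = DecisionTreeClassifier(random_state=seed).fit(X[:, selected], y)
    return selected, tree


def bagged_predict(X_train, y_train, X_test, n_replications=50, seed=0):
    """Refit the lasso + tree model on bootstrap samples and take a majority vote."""
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]
    votes = np.zeros((n_replications, X_test.shape[0]), dtype=int)
    for b in range(n_replications):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        selected, tree = fit_lasso_tree(X_train[idx], y_train[idx], seed=b)
        votes[b] = tree.predict(X_test[:, selected])
    # Binary labels 0/1: predict 1 when at least half of the trees vote 1.
    return (votes.mean(axis=0) >= 0.5).astype(int)


# Synthetic stand-in data (NOT the dataset described in the record).
X, y = make_classification(n_samples=200, n_features=300, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

selected, single_tree = fit_lasso_tree(X_train, y_train)
single_pred = single_tree.predict(X_test[:, selected])
bagged_pred = bagged_predict(X_train, y_train, X_test, n_replications=50)

print("features kept by lasso:", selected.size)
print("single lasso tree accuracy:", np.mean(single_pred == y_test))
print("bagged majority-vote accuracy:", np.mean(bagged_pred == y_test))
```

Comparing the two printed accuracies mirrors the stability check described in the abstract: if the bagged vote improves only slightly on the single model, the single lasso decision tree is behaving stably across resamples. The numbers from this synthetic run say nothing about the published results.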

Bibliographic Details
Main Authors: Sa'adah, Umu (Author), Rochayani, Masithoh Yessi (Author), Astuti, Ani Budi (Author)
Format: EJournal Article
Published: Institute of Advanced Engineering and Science, 2021-02-01.
LEADER 02495 am a22003133u 4500
001 ijeecs22058_14652
042 |a dc 
100 1 0 |a Sa'adah, Umu  |e author 
700 1 0 |a Rochayani, Masithoh Yessi  |e author 
700 1 0 |a Astuti, Ani Budi  |e author 
245 0 0 |a Knowledge discovery from gene expression dataset using bagging lasso decision tree 
260 |b Institute of Advanced Engineering and Science,   |c 2021-02-01. 
500 |a https://ijeecs.iaescore.com/index.php/IJEECS/article/view/22058 
520 |a Classifying high-dimensional data is a challenging task in data mining. Gene expression data is a type of high-dimensional data with thousands of features. This study proposes a method to extract knowledge from high-dimensional gene expression data through feature selection and classification. Lasso was used to select features, and the classification and regression tree (CART) algorithm was used to construct the decision tree model. To examine the stability of the lasso decision tree, we performed bootstrap aggregating (bagging) with 50 replications. The gene expression data came from an ovarian tumor dataset with 1,545 observations, 10,935 gene features, and a binary class. The results showed that the lasso decision tree could produce an interpretable model that was theoretically correct and had an accuracy of 89.32%. Meanwhile, the model obtained by majority vote achieved an accuracy of 90.29%, an increase of about 1 percentage point (0.97) over the single lasso decision tree model. This slight increase in accuracy indicates that the lasso decision tree classifier is stable. 
540 |a Copyright (c) 2021 Institute of Advanced Engineering and Science 
540 |a http://creativecommons.org/licenses/by-nc/4.0 
546 |a eng 
690 |a Computer science; High-dimensional statistical modeling; Ensemble learning 
690 |a Bagging; Decision tree; Feature selection; Gene expression; High-dimensional 
655 7 |a info:eu-repo/semantics/article  |2 local 
655 7 |a info:eu-repo/semantics/publishedVersion  |2 local 
786 0 |n Indonesian Journal of Electrical Engineering and Computer Science; Vol 21, No 2: February 2021; 1151-1159 
786 0 |n 2502-4760 
786 0 |n 2502-4752 
786 0 |n 10.11591/ijeecs.v21.i2 
787 0 |n https://ijeecs.iaescore.com/index.php/IJEECS/article/view/22058/14652 
856 4 1 |u https://ijeecs.iaescore.com/index.php/IJEECS/article/view/22058/14652  |z Get fulltext