Cyberbullying identification in twitter using support vector machine and information gain based feature selection

Cyberbullying is one of the acts that violate the ITE Law when committed on social media applications such as Twitter. It is difficult to detect unless someone reports the tweet. Cyberbullying tweet identification aims to classify whether a tweet contains bullying. Classificatio...

Bibliographic Details
Main Authors:	Dwi Purnamasari, Ni Made Gita (Author), Fauzi, M. Ali (Author), Indriati, Indriati (Author), Dewi, Liana Shinta (Author)
Format:	EJournal Article
Published:	Institute of Advanced Engineering and Science, 2020-06-01.
Subjects:	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion
Online Access:	Get fulltext


LEADER	02780 am a22003253u 4500
001	ijeecs15105_13778
042			\|a dc
100	1	0	\|a Dwi Purnamasari, Ni Made Gita \|e author
100	1	0	\|e contributor
700	1	0	\|a Fauzi, M. Ali \|e author
700	1	0	\|a Indriati, Indriati \|e author
700	1	0	\|a Dewi, Liana Shinta \|e author
245	0	0	\|a Cyberbullying identification in twitter using support vector machine and information gain based feature selection
260			\|b Institute of Advanced Engineering and Science, \|c 2020-06-01.
500			\|a https://ijeecs.iaescore.com/index.php/IJEECS/article/view/15105
520			\|a Cyberbullying is one of the acts that violate the ITE Law when committed on social media applications such as Twitter. It is difficult to detect unless someone reports the tweet. Cyberbullying tweet identification aims to classify whether a tweet contains bullying. Classification is performed with the Support Vector Machine (SVM) method, which seeks the hyperplane separating the negative and positive classes. Because this is a text classification task, the more data is used, the more features are produced; this research therefore also uses Information Gain as a feature selection method to discard features that are not relevant to the classification. The system starts with text preprocessing: tokenizing, filtering, stemming and term weighting. Information gain feature selection is then performed by calculating the entropy value of each term. Classification is then carried out on the selected terms, and the system outputs whether the tweet is bullying or not. The SVM method achieves accuracy 75%, precision 70.27%, recall 86.66% and f-measure 77.61% in the experiment with maximum iteration = 20, λ = 0.5, γ = 0.001, ε = 0.000001, and C = 1. The best information gain threshold is 90%, giving accuracy 76.66%, precision 72.22%, recall 86.66% and f-measure 78.78%.
540			\|a Copyright (c) 2020 Institute of Advanced Engineering and Science
540			\|a http://creativecommons.org/licenses/by-nc/4.0
546			\|a eng
690			\|a Computer and Informatics
690			\|a Cyberbullying; Text classification; Support vector machine; Information gain
655	7		\|a info:eu-repo/semantics/article \|2 local
655	7		\|a info:eu-repo/semantics/publishedVersion \|2 local
655	7		\|2 local
786	0		\|n Indonesian Journal of Electrical Engineering and Computer Science; Vol 18, No 3: June 2020; 1494-1500
786	0		\|n 2502-4760
786	0		\|n 2502-4752
786	0		\|n 10.11591/ijeecs.v18.i3
787	0		\|n https://ijeecs.iaescore.com/index.php/IJEECS/article/view/15105/13778
856	4	1	\|u https://ijeecs.iaescore.com/index.php/IJEECS/article/view/15105/13778 \|z Get fulltext
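
The abstract describes information gain feature selection as calculating an entropy value for each term. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes each tweet has already been preprocessed into a set of tokens and that a term is scored by its presence or absence. The function names and the `keep_fraction` parameter are hypothetical. Reading the reported 90% threshold as "keep the top 90% of ranked terms" is an assumption, since the record does not define the threshold.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy of the labels minus their entropy after splitting on term presence.

    docs is a list of token sets, one per tweet (assumed already preprocessed).
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def select_terms(docs, labels, keep_fraction):
    """Rank the vocabulary by information gain and keep the top fraction.

    keep_fraction is a hypothetical parameter: one possible reading of the
    "threshold" mentioned in the abstract, not confirmed by the record.
    """
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    keep = max(1, round(keep_fraction * len(ranked)))
    return ranked[:keep]


# Toy data invented for illustration: 1 = bullying, 0 = not bullying.
docs = [{"stupid", "you"}, {"ugly", "you"}, {"nice", "day"}, {"good", "day"}]
labels = [1, 1, 0, 0]
selected = select_terms(docs, labels, 0.9)
```

On this toy data, "you" and "day" each split the classes perfectly and so receive the highest score, while a term such as "stupid" that appears in only one tweet scores lower. The selected terms would then form the feature vocabulary passed to the classifier.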

Similar Items