Ensemble Method for Indonesian Twitter Hate Speech Detection

Bibliographic Details
Main Authors:	Fauzi, M. Ali (Author), Yuniarti, Anny (Author)
Format:	EJournal Article
Published:	Institute of Advanced Engineering and Science, 2018-07-01.
Subjects:	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion
Online Access:	https://ijeecs.iaescore.com/index.php/IJEECS/article/view/10638/8785


LEADER	02822 am a22003013u 4500
001	ijeecs10638_8785
042			\|a dc
100	1	0	\|a Fauzi, M. Ali \|e author
100	1	0	\|e contributor
700	1	0	\|a Yuniarti, Anny \|e author
245	0	0	\|a Ensemble Method for Indonesian Twitter Hate Speech Detection
260			\|b Institute of Advanced Engineering and Science, \|c 2018-07-01.
500			\|a https://ijeecs.iaescore.com/index.php/IJEECS/article/view/10638
520			\|a Due to the massive increase in user-generated web content, particularly on social media networks where anyone can post statements freely without restriction, the amount of hateful activity is also increasing. Social media and microblogging web services, such as Twitter, allow user tweets to be read and analyzed in near real time. Twitter is a logical source of data for hate speech analysis since Twitter users are more likely to express their emotions about an event by posting tweets. This analysis can help identify hate speech early so that it can be prevented from spreading widely. Manually classifying hateful content on Twitter is costly and not scalable. Therefore, automatic hate speech detection needs to be developed for tweets in the Indonesian language. In this study, we used an ensemble method for hate speech detection in the Indonesian language. We employed five stand-alone classification algorithms (Naïve Bayes, K-Nearest Neighbours, Maximum Entropy, Random Forest, and Support Vector Machines) and two ensemble methods, hard voting and soft voting, on a Twitter hate speech dataset. The experimental results showed that using an ensemble method can improve classification performance. The best result was achieved using soft voting, with an F1 measure of 79.8% on the unbalanced dataset and 84.7% on the balanced dataset. Although the improvement is modest, using an ensemble method can reduce the risk of choosing a poor classifier for deciding whether new tweets are hate speech.
540			\|a Copyright (c) 2018 Institute of Advanced Engineering and Science
540			\|a http://creativecommons.org/licenses/by-nc-nd/4.0
546			\|a eng
690			\|a Computer Science
690			\|a Classifier Ensemble; Hate Speech; Text Classification; Twitter; Indonesian Language;
655	7		\|a info:eu-repo/semantics/article \|2 local
655	7		\|a info:eu-repo/semantics/publishedVersion \|2 local
655	7		\|2 local
786	0		\|n Indonesian Journal of Electrical Engineering and Computer Science; Vol 11, No 1: July 2018; 294-299
786	0		\|n 2502-4760
786	0		\|n 2502-4752
786	0		\|n 10.11591/ijeecs.v11.i1
787	0		\|n https://ijeecs.iaescore.com/index.php/IJEECS/article/view/10638/8785
856	4	1	\|u https://ijeecs.iaescore.com/index.php/IJEECS/article/view/10638/8785 \|z Get fulltext
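
The abstract in this record names two ensemble schemes, hard voting and soft voting, without giving implementation details. The following is a minimal sketch of the general idea only, not the authors' code: the function names `hard_vote` and `soft_vote`, the class labels, and the probability values are all hypothetical and invented for illustration. Hard voting takes the majority of the predicted labels, while soft voting averages the per-class probabilities and picks the highest average.

```python
from collections import Counter


def hard_vote(labels):
    """Return the label predicted by the most classifiers (majority vote)."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities):
    """Average each class probability across classifiers, then take the argmax."""
    classes = probabilities[0].keys()
    avg = {
        c: sum(p[c] for p in probabilities) / len(probabilities)
        for c in classes
    }
    return max(avg, key=avg.get), avg


# Made-up probability outputs from five hypothetical classifiers for one tweet.
probs = [
    {"hate": 0.90, "not_hate": 0.10},
    {"hate": 0.45, "not_hate": 0.55},
    {"hate": 0.40, "not_hate": 0.60},
    {"hate": 0.48, "not_hate": 0.52},
    {"hate": 0.95, "not_hate": 0.05},
]

# Each classifier's hard label is its own most probable class.
labels = [max(p, key=p.get) for p in probs]

hard = hard_vote(labels)
soft, avg = soft_vote(probs)
print("hard voting:", hard)
print("soft voting:", soft, avg)
```

In this invented example the two schemes disagree: three of the five classifiers lean weakly toward `not_hate`, so hard voting selects it, while the two confident `hate` predictions dominate the averaged probabilities under soft voting.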

Similar Items