A taxonomy of Malay social media text

In this paper, we propose a preliminary taxonomy of Malay social media text. Performing text analytics on Malay social media text is a challenge. The formal Malay language follows specific spelling and sentence construction rules. However, the Malay language used in social media differs in both asp...

Bibliographic Details
Main Authors: Maskat, Ruhaila (Author), Munarko, Yuda (Author)
Format: EJournal Article
Published: Institute of Advanced Engineering and Science, 2019-10-01.
LEADER 02376 am a22003013u 4500
001 ijeecs19898_13013
042 |a dc 
100 1 0 |a Maskat, Ruhaila  |e author 
100 1 0 |e contributor 
700 1 0 |a Munarko, Yuda  |e author 
245 0 0 |a A taxonomy of Malay social media text 
260 |b Institute of Advanced Engineering and Science,   |c 2019-10-01. 
500 |a https://ijeecs.iaescore.com/index.php/IJEECS/article/view/19898 
520 |a In this paper, we propose a preliminary taxonomy of Malay social media text. Performing text analytics on Malay social media text is a challenge. The formal Malay language follows specific spelling and sentence construction rules. However, the Malay language used in social media differs in both aspects. This impedes the accuracy of text analytics. Due to the complexity of Malay social media text, many researchers have chosen to focus on classifying the formal Malay language. To the best of our knowledge, we are the first to propose a formal taxonomy for Malay text in social media. Narrow and informal categorisations of Malay social media text can be found amidst efforts to pre-process social media text, yet these efforts cherry-pick only some categories to handle. We have differentiated Malay social media text from the formal Malay language by identifying it as Social Media Malay Language, or SMML. It consists of spelling variations, Malay-English mixed sentences, Malay-spelling English words, slang-based words, vowel-less words, number suffixes and manner of expression. This taxonomy is expected to serve as a guideline in research and commercial products. 
540 |a Copyright (c) 2019 Institute of Advanced Engineering and Science 
540 |a http://creativecommons.org/licenses/by-nc/4.0 
546 |a eng 
690
690 |a Data preprocessing, Malay language, Social media, Taxonomy, Text analytics 
655 7 |a info:eu-repo/semantics/article  |2 local 
655 7 |a info:eu-repo/semantics/publishedVersion  |2 local 
655 7 |2 local 
786 0 |n Indonesian Journal of Electrical Engineering and Computer Science; Vol 16, No 1: October 2019; 465-472 
786 0 |n 2502-4760 
786 0 |n 2502-4752 
786 0 |n 10.11591/ijeecs.v16.i1 
787 0 |n https://ijeecs.iaescore.com/index.php/IJEECS/article/view/19898/13013 
856 4 1 |u https://ijeecs.iaescore.com/index.php/IJEECS/article/view/19898/13013  |z Get fulltext