On the Comparison of Line Spectral Frequencies and Mel-Frequency Cepstral Coefficients Using Feedforward Neural Network for Language Identification

Of the many audio features available, this paper compares two of the most popular: line spectral frequencies (LSF) and Mel-frequency cepstral coefficients (MFCC). We trained feedforward neural networks with varying numbers of hidden layers and hidden nodes to identify five differ...

Bibliographic Details
Main Authors:	Gunawan, Teddy Surya (Author), Kartiwi, Mira (Author)
Format:	EJournal Article
Published:	Institute of Advanced Engineering and Science, 2018-04-01.
Subjects:	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion
Online Access:	Get fulltext


LEADER	02299 am a22003013u 4500
001	ijeecs10880_8197
042			\|a dc
100	1	0	\|a Gunawan, Teddy Surya \|e author
100	1	0	\|e contributor
700	1	0	\|a Kartiwi, Mira \|e author
245	0	0	\|a On the Comparison of Line Spectral Frequencies and Mel-Frequency Cepstral Coefficients Using Feedforward Neural Network for Language Identification
260			\|b Institute of Advanced Engineering and Science, \|c 2018-04-01.
500			\|a https://ijeecs.iaescore.com/index.php/IJEECS/article/view/10880
520			\|a Of the many audio features available, this paper compares two of the most popular: line spectral frequencies (LSF) and Mel-frequency cepstral coefficients (MFCC). We trained feedforward neural networks with varying numbers of hidden layers and hidden nodes to identify five languages: Arabic, Chinese, English, Korean, and Malay. LSF, MFCC, and a combination of both were extracted as feature vectors. Systematic experiments were conducted to find the optimum parameters, i.e. sampling frequency, frame size, model order, and neural network structure. The per-frame recognition rate was converted to a per-file recognition rate using majority voting. On average, the recognition rates for LSF, MFCC, and the combination of both features were 96%, 92%, and 96%, respectively. Therefore, LSF is the most suitable feature for language identification with a feedforward neural network classifier.
540			\|a Copyright (c) 2018 Institute of Advanced Engineering and Science
540			\|a http://creativecommons.org/licenses/by-nc/4.0
546			\|a eng
690
690			\|a language identification; LSF; MFCC; feedforward neural network classifier; recognition rate
655	7		\|a info:eu-repo/semantics/article \|2 local
655	7		\|a info:eu-repo/semantics/publishedVersion \|2 local
655	7		\|2 local
786	0		\|n Indonesian Journal of Electrical Engineering and Computer Science; Vol 10, No 1: April 2018; 168-175
786	0		\|n 2502-4760
786	0		\|n 2502-4752
786	0		\|n 10.11591/ijeecs.v10.i1
787	0		\|n https://ijeecs.iaescore.com/index.php/IJEECS/article/view/10880/8197
856	4	1	\|u https://ijeecs.iaescore.com/index.php/IJEECS/article/view/10880/8197 \|z Get fulltext

Similar Items