Hybrid deep neural network for Bangla automated image descriptor


Bibliographic Details
Main Authors: Jishan, Md Asifuzzaman (Author), Mahmud, Khan Raqib (Author), Azad, Abul Kalam Al (Author), Alam, Md Shahabub (Author), Khan, Anif Minhaz (Author)
Format: EJournal Article
Published: Universitas Ahmad Dahlan, 2020-07-12.
Subjects: convolutional neural network; hybrid recurrent neural network; long short-term memory; bi-directional RNN; natural language descriptors
Online Access: https://ijain.org/index.php/IJAIN/article/view/499/ijain_v6i2_p109-122
LEADER 02958 am a22003133u 4500
001 IJAIN_499_ijain_v6i2_p109-122
042 |a dc 
100 1 0 |a Jishan, Md Asifuzzaman  |e author 
700 1 0 |a Mahmud, Khan Raqib  |e author 
700 1 0 |a Azad, Abul Kalam Al  |e author 
700 1 0 |a Alam, Md Shahabub  |e author 
700 1 0 |a Khan, Anif Minhaz  |e author 
245 0 0 |a Hybrid deep neural network for Bangla automated image descriptor 
260 |b Universitas Ahmad Dahlan,   |c 2020-07-12. 
500 |a https://ijain.org/index.php/IJAIN/article/view/499 
520 |a Automated image-to-text generation is a computationally challenging computer vision task that requires sufficient comprehension of both the syntactic and semantic content of an image to generate a meaningful description. Until recently, it had been studied only to a limited extent, owing to the lack of visual-descriptor datasets and of models able to capture the intrinsic complexities of image features. In this study, a novel dataset, Bangla Natural Language Image to Text (BNLIT), was constructed by generating Bangla textual descriptors from visual input; it comprises 100 annotated classes. A deep neural network-based image captioning model was proposed to generate image descriptions. The model employs a Convolutional Neural Network (CNN) to classify the images in the dataset, while a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) units captures the sequential semantic representation of the text and generates a pertinent description based on the modular complexities of an image. When tested on the new dataset, the model achieves a significant performance improvement on the image-to-text generation task. For this task we implemented a hybrid image captioning model, which achieved remarkable results on the new, self-constructed dataset; the task itself is novel from a Bangladeshi perspective. In brief, the model provides benchmark precision in reconstructing characteristic Bangla syntax, together with a comprehensive numerical analysis of the model's performance on the dataset. 
540 |a Copyright (c) 2020 Md Asifuzzaman Jishan, Khan Raqib Mahmud, Abul Kalam Al Azad, Md Shahabub Alam, Anif Minhaz Khan 
540 |a https://creativecommons.org/licenses/by-sa/4.0 
546 |a eng 
690 |a convolutional neural network; hybrid recurrent neural network; long short-term memory; bi-directional RNN; natural language descriptors 
655 7 |a info:eu-repo/semantics/article  |2 local 
655 7 |a info:eu-repo/semantics/publishedVersion  |2 local 
786 0 |n International Journal of Advances in Intelligent Informatics; Vol 6, No 2 (2020): July 2020; 109-122 
786 0 |n 2548-3161 
786 0 |n 2442-6571 
787 0 |n https://ijain.org/index.php/IJAIN/article/view/499/ijain_v6i2_p109-122 
856 4 1 |u https://ijain.org/index.php/IJAIN/article/view/499/ijain_v6i2_p109-122  |z Get Fulltext