A state-of-the-art survey on semantic similarity for document clustering using GloVe and density-based algorithms

Semantic similarity is the process of identifying semantically related data. The traditional way of identifying document similarity relies on synonymous keywords and syntax. In contrast, semantic similarity finds similar data using the meaning of words and semantics. Clustering is a conc...


Bibliographic Details
Main Authors:	M. Mohammed, Shapol (Author), Jacksi, Karwan (Author), R. M. Zeebaree, Subhi (Author)
Format:	EJournal Article
Published:	Institute of Advanced Engineering and Science, 2021-04-01.
Subjects:	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion
Online Access:	Get fulltext


LEADER	02689 am a22003133u 4500
001	ijeecs23698_14847
042			\|a dc
100	1	0	\|a M. Mohammed, Shapol \|e author
700	1	0	\|a Jacksi, Karwan \|e author
700	1	0	\|a R. M. Zeebaree, Subhi \|e author
245	0	0	\|a A state-of-the-art survey on semantic similarity for document clustering using GloVe and density-based algorithms
260			\|b Institute of Advanced Engineering and Science, \|c 2021-04-01.
500			\|a https://ijeecs.iaescore.com/index.php/IJEECS/article/view/23698
520			\|a Semantic similarity is the process of identifying semantically related data. The traditional way of identifying document similarity relies on synonymous keywords and syntax. In contrast, semantic similarity finds similar data using the meaning of words and semantics. Clustering is the grouping of objects that share the same features and properties into a cluster, separate from objects that have different features and properties. In semantic document clustering, documents are clustered using semantic similarity techniques with similarity measurements. One common technique for clustering documents is density-based clustering, which uses the density of data points as the main strategy for measuring the similarity between them. In this paper, a state-of-the-art survey is presented to analyze density-based algorithms for clustering documents. Furthermore, the similarity and evaluation measures used with the selected algorithms are investigated to identify the most common ones. The review revealed that the most used density-based algorithms in document clustering are DBSCAN and DPC. The most effective similarity measure used with density-based algorithms, specifically DBSCAN and DPC, is cosine similarity, with F-measure used for performance and accuracy evaluation.
540			\|a Copyright (c) 2021 Institute of Advanced Engineering and Science
540			\|a http://creativecommons.org/licenses/by-nc/4.0
546			\|a eng
690			\|a Similarity Measurement; Evaluation Measurement; DBSCAN; Density-based algorithm; DPC; GloVe word embedding
655	7		\|a info:eu-repo/semantics/article \|2 local
655	7		\|a info:eu-repo/semantics/publishedVersion \|2 local
786	0		\|n Indonesian Journal of Electrical Engineering and Computer Science; Vol 22, No 1: April 2021; 552-562
786	0		\|n 2502-4760
786	0		\|n 2502-4752
786	0		\|n 10.11591/ijeecs.v22.i1
787	0		\|n https://ijeecs.iaescore.com/index.php/IJEECS/article/view/23698/14847
856	4	1	\|u https://ijeecs.iaescore.com/index.php/IJEECS/article/view/23698/14847 \|z Get fulltext
