An Enhanced Deep Learning Model for Duplicate Question Detection on Quora Question pairs using Siamese LSTM
- Title
- An Enhanced Deep Learning Model for Duplicate Question Detection on Quora Question pairs using Siamese LSTM
- Creator
- Chandra M.; Rodrigues A.; George J.
- Description
- The question-answering platform Quora has millions of users, which increases the probability that questions with similar intent are asked. One question may be phrased in two different ways by two users, and answering similar questions repeatedly degrades the user experience. Manually filtering such questions is tedious, so Quora attempts to detect and remove duplicate questions using a Random Forest model, which is not completely effective. Because Quora's questions and answers are text data, different Natural Language Processing techniques are used to transform the text into numerical vectors. In this research, log loss is the primary metric for evaluating the models. The primary contribution is the use of a Siamese network to process two questions in parallel and compute a vector representation of each question. The vectors computed by this method enable similarity detection that is more effective than existing models. GloVe word embeddings are used to capture the semantic similarity between two questions. A random classifier is built as the base model, and logistic regression, linear SVM and XGBoost models are used to reduce the log loss. Finally, a Siamese LSTM is proposed, which reduces the loss dramatically. © 2022 IEEE.
- Source
- IEEE International Conference on Distributed Computing and Electrical Circuits and Electronics, ICDCECE 2022
- Date
- 2022-01-01
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Subject
- FuzzyWuzzy; GloVe; Logistic Regression; Quora; Siamese Network; SVM; XGBoost
- Coverage
- Chandra M., Christ (Deemed to Be University), Department of Data Science, India; Rodrigues A., Christ (Deemed to Be University), Department of Data Science, India; George J., Christ (Deemed to Be University), Department of Data Science, India
- Rights
- Restricted Access
- Relation
- ISBN: 978-166548316-2
- Format
- Online
- Language
- English
- Type
- Conference paper
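The Description names log loss as the primary evaluation metric and a random classifier as the base model. As a minimal sketch of what that metric measures, the snippet below computes binary log loss in plain Python for two naive predictors. The toy labels, the uniform random probabilities, and the helper name `log_loss` are illustrative assumptions, not the authors' data or code.

```python
import math
import random


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities away from 0 and 1 so log() never receives 0.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical toy labels: 1 = duplicate pair, 0 = non-duplicate pair.
labels = [1, 0, 0, 1, 0, 0, 1, 0]

# Assumed form of a "random" baseline: probabilities drawn uniformly from [0, 1).
rng = random.Random(0)
random_probs = [rng.random() for _ in labels]

# A constant 0.5 prediction always scores ln(2), a common reference point.
constant_probs = [0.5] * len(labels)

random_baseline_loss = log_loss(labels, random_probs)
constant_loss = log_loss(labels, constant_probs)

print(f"random baseline log loss: {random_baseline_loss:.4f}")
print(f"constant-0.5 log loss:    {constant_loss:.4f}")
```

A model is only useful under this metric if its log loss falls below such baselines, which is the sense in which the Description says the later models "reduce the log loss".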
Citation
Chandra M.; Rodrigues A.; George J., “An Enhanced Deep Learning Model for Duplicate Question Detection on Quora Question pairs using Siamese LSTM,” CHRIST (Deemed To Be University) Institutional Repository, accessed February 24, 2025, https://archives.christuniversity.in/items/show/20303.