Abusive Words Detection on Reddit Comments Using Machine Learning Algorithms

Title: Abusive Words Detection on Reddit Comments Using Machine Learning Algorithms
Creator: Madhurima S.; Abhijith Ajith K.; Vs S.; Prathap B.R.
Description: Artificial intelligence enables efficient, large-scale analysis of emotions, yielding valuable insight into the psychological state of users. In this study, sentiment analysis is performed on a Reddit dataset obtained from Kaggle. The comments in the dataset were divided into negative, neutral, and positive sentiments. Several machine learning techniques, namely Random Forest, Extreme Gradient Boosting Classifier (XGB), Gradient Boosting Machine (GBM), Support Vector Machine (SVM), and Convolutional Neural Network (CNN), were applied and evaluated for their effectiveness in sentiment classification. The evaluation used accuracy, precision, recall, and F1 score as performance criteria. Additionally, confusion matrices were used to assess each algorithm's ability to identify abusive language. The results indicate that Random Forest outperforms the other methods in sentiment analysis, reaching a maximum accuracy of 0.99, with the CNN model close behind at 0.98. Random Forest is also the most effective algorithm at recognizing negative comments and abusive language. The study underscores the value of machine learning algorithms in sentiment analysis, content moderation, social media monitoring, and customer feedback analysis, and their role in improving automated systems that aim to understand user sentiment in online discussions. © 2024 IEEE.
Source: Proceedings - 2nd IEEE International Conference on Device Intelligence, Computing and Communication Technologies, DICCT 2024, pp. 312-317.
Date: 2024-01-01
Publisher: Institute of Electrical and Electronics Engineers Inc.
Subject: Abuse words; CNN; Gradient Boosting Machine and random forest; SVM; XGB classifier
Coverage: Madhurima S., Computer Science and Engineering Christ (Deemed to Be University), Karnataka, Bangalore, India; Abhijith Ajith K., Computer Science and Engineering Christ (Deemed to Be University), Karnataka, Bangalore, India; Vs S., Computer Science and Engineering Christ (Deemed to Be University), Karnataka, Bangalore, India; Prathap B.R., Computer Science and Engineering Christ (Deemed to Be University), Karnataka, Bangalore, India
Rights: Restricted Access
Relation: ISBN: 979-835037284-7
Format: Online
Language: English
Type: Conference paper
Identifier: https://doi.org/10.1109/DICCT61038.2024.10532806

https://www.scopus.com/inward/record.uri?eid=2-s2.0-85195168310&doi=10.1109%2fDICCT61038.2024.10532806&partnerID=40&md5=4a21e8cff5fccbe31ba79b9e5f98df78

Citation

Madhurima S.; Abhijith Ajith K.; Vs S.; Prathap B.R., “Abusive Words Detection on Reddit Comments Using Machine Learning Algorithms,” CHRIST (Deemed To Be University) Institutional Repository, accessed July 31, 2025, https://archives.christuniversity.in/items/show/19440.
