Classifying AI-generated summaries And Human Summaries Based on Statistical Features
- Title
- Classifying AI-generated summaries And Human Summaries Based on Statistical Features
- Creator
- Mathews D.; Varghese J.P.; Samuel L.C.
- Description
- In an age where artificial intelligence knows no bounds, it is crucial to know whether textual content is reliable. However, identifying AI-generated content within vast volumes of text is a major challenge. Existing studies in feature-based classification have explored only prompt-based text responses. This paper explores methods for identifying AI-generated summaries using feature-based machine-learning techniques. The study uses the BBC News Summary dataset. Summaries for the dataset are then generated using three top-performing summarisation models. Statistical features such as the Zipf's Law Score, the Flesch Reading Ease Score, and the Gunning Fog Index are extracted from the summarized texts and passed to classification models, with the aim of differentiating AI-generated summaries from human-written summaries. The classifiers used are Support Vector Machine (SVM), Random Forest, Decision Tree, and Logistic Regression. Grid Search is also used to fine-tune the SVM for the best results. The most suitable model depends on the requirement, whether that is accuracy, F1 score, or a balance of both. The feature-based approach makes the classification more explainable and allows comparison of how statistical text features differ between human-written summaries and generated summaries. © 2024 IEEE.
- Source
- TQCEBT 2024 - 2nd IEEE International Conference on Trends in Quantum Computing and Emerging Business Technologies 2024
- Date
- 2024-01-01
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Subject
- AI-generated summaries; feature-based techniques; classification models; Grid Search technique; Logistic Regression; NLG classification; Random Forest; SVM
- Coverage
- Mathews D., Department of Data Science, Christ (Deemed to Be University), Lavasa, India; Varghese J.P., Department of Data Science, Christ (Deemed to Be University), Lavasa, India; Samuel L.C., Department of Data Science, Christ (Deemed to Be University), Lavasa, India
- Rights
- Restricted Access
- Relation
- ISBN: 979-835038427-7
- Format
- Online
- Language
- English
- Type
- Conference paper
Citation
Mathews D.; Varghese J.P.; Samuel L.C., “Classifying AI-generated summaries And Human Summaries Based on Statistical Features,” CHRIST (Deemed To Be University) Institutional Repository, accessed February 24, 2025, https://archives.christuniversity.in/items/show/19151.
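Illustrative sketch
The statistical features named in the Description can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the Flesch Reading Ease and Gunning Fog formulas are the standard published ones, but the syllable counter is a rough vowel-group heuristic, and `zipf_slope` is an assumed stand-in for the paper's "Zipf's Law Score" (the record does not define it). All function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive split on ., ! and ?; abbreviations are not handled.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def split_words(text):
    return re.findall(r"[A-Za-z']+", text.lower())


def count_syllables(word):
    # Heuristic: one syllable per run of vowels, at least one per word.
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def flesch_reading_ease(text):
    words, sentences = split_words(text), split_sentences(text)
    if not words or not sentences:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * len(words) / len(sentences)
            - 84.6 * syllables / len(words))


def gunning_fog(text):
    words, sentences = split_words(text), split_sentences(text)
    if not words or not sentences:
        return 0.0
    # "Complex" words are approximated as those with three or more syllables.
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    return 0.4 * (len(words) / len(sentences)
                  + 100 * complex_words / len(words))


def zipf_slope(text):
    # Assumed definition: least-squares slope of log(frequency) vs log(rank).
    # Text that follows Zipf's law closely gives a slope near -1.
    counts = sorted(Counter(split_words(text)).values(), reverse=True)
    if len(counts) < 2:
        return 0.0
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov_xy / var_x


def extract_features(text):
    # One feature vector per summary, ready to pass to a classifier.
    return [flesch_reading_ease(text), gunning_fog(text), zipf_slope(text)]
```

Calling `extract_features(summary)` on each human-written and generated summary would give the rows of a feature matrix, which could then be passed to any of the classifiers listed in the Description.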