A Sentence-Level Risk Estimator for Identifying Hallucinations in Generative AI
- Title
- A Sentence-Level Risk Estimator for Identifying Hallucinations in Generative AI
- Creator
- Gupta, Varuna; Chakravarti, Lakshay; Akhtar, Md Moin; Maheshwari, Prem; Garg, Puneet; Tiwari, Dimple
- Description
- Hallucination, defined as the generation of factually incorrect or ungrounded content, represents a critical challenge in large language models and summarization systems. Existing evaluation metrics often operate at the document level and fail to pinpoint erroneous sentences with sufficient granularity. This work introduces Sentence-Level Risk Estimation (SRE), a unified framework for detecting hallucinations at fine granularity by integrating three complementary signals: semantic alignment using BERT-based embedding similarity, QA-based factuality verification through question-answer pair generation and validation, and Natural Language Inference (NLI) entailment assessment using pre-trained models such as DeBERTa-MNLI. These signals are aggregated into a unified Sentence Risk Score (SRS) via weighted calibration. Experimental evaluation on CNN/DailyMail and XSum datasets demonstrates that the proposed method achieves precision of 0.85, recall of 0.75, F1-score of 0.80, and correlation with human judgments of 0.85, representing substantial improvements over existing approaches including FactCC, QAGS, and SummaC. The proposed framework enables AI systems to flag risky sentences for review or regeneration, thereby improving trust and safety in generative applications. © 2026 IEEE.
- Source
- Proceedings of the 2026 International Conference on AI-Driven Smart Systems and Ubiquitous Computing, ICAUC 2026; pp. 1619-1626
- Date
- 01-01-2026
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Subject
- factuality evaluation; hallucination detection; large language models; natural language inference; question answering; sentence-level risk estimation
- Coverage
- Gupta V., Christ University, Bengaluru, 560029, India; Chakravarti L., Vivekananda Institute of Professional Studies, School of Engineering & Technology, Technical Campus, Delhi, 110034, India; Akhtar M.M., Vivekananda Institute of Professional Studies, School of Engineering & Technology, Technical Campus, Delhi, 110034, India; Maheshwari P., Vivekananda Institute of Professional Studies, School of Engineering & Technology, Technical Campus, Delhi, 110034, India; Garg P., Kiet Group of Institutions, Ghaziabad, India; Tiwari D., Vivekananda Institute of Professional Studies, School of Engineering & Technology, Technical Campus, Delhi, 110034, India
- Rights
- Restricted Access; Hardcopy may be available in the library
- Relation
- ISBN: 979-833155851-2
- Format
- online
- Language
- English
- Type
- Conference paper
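
The description above states that three signals (semantic alignment, QA-based factuality, and NLI entailment) are aggregated into a Sentence Risk Score via weighted calibration, but this record does not give the formula. The following is a minimal sketch of one plausible aggregation, not the authors' method: it assumes each signal has already been scaled to a support score in [0, 1] and combines them as a normalized weighted average of `1 - support`. The names `SentenceSignals`, `sentence_risk_score`, and `flag_risky`, the equal default weights, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class SentenceSignals:
    """Per-sentence support scores, each assumed to lie in [0, 1]."""
    semantic_similarity: float  # embedding similarity to the source text
    qa_consistency: float       # share of generated QA pairs that validate
    nli_entailment: float       # entailment probability from an NLI model


def sentence_risk_score(signals: SentenceSignals,
                        weights: Sequence[float] = (1.0, 1.0, 1.0)) -> float:
    """Return a risk in [0, 1]; higher means less support from the source.

    Assumed form: weighted average of (1 - support) over the three signals.
    """
    support = (signals.semantic_similarity,
               signals.qa_consistency,
               signals.nli_entailment)
    if len(weights) != len(support):
        raise ValueError("expected exactly one weight per signal")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    if any(not 0.0 <= s <= 1.0 for s in support):
        raise ValueError("each signal must lie in [0, 1]")
    total = sum(weights)
    return sum(w * (1.0 - s) for w, s in zip(weights, support)) / total


def flag_risky(all_signals: Sequence[SentenceSignals],
               weights: Sequence[float] = (1.0, 1.0, 1.0),
               threshold: float = 0.5) -> List[int]:
    """Return indices of sentences whose risk meets the (assumed) threshold."""
    return [i for i, s in enumerate(all_signals)
            if sentence_risk_score(s, weights) >= threshold]


if __name__ == "__main__":
    summary = [
        SentenceSignals(0.9, 0.9, 0.9),  # well supported by the source
        SentenceSignals(0.2, 0.1, 0.3),  # weakly supported
    ]
    print(flag_risky(summary))
```

In practice the weights and threshold would have to be tuned against human-annotated data; the values here are placeholders chosen only to keep the example readable.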
Citation
Gupta, Varuna; Chakravarti, Lakshay; Akhtar, Md Moin; Maheshwari, Prem; Garg, Puneet; Tiwari, Dimple, “A Sentence-Level Risk Estimator for Identifying Hallucinations in Generative AI,” CHRIST (Deemed To Be University) Institutional Repository, accessed June 20, 2026, https://archives.christuniversity.in/items/show/25905.
