A Sentence-Level Risk Estimator for Identifying Hallucinations in Generative AI

Title: A Sentence-Level Risk Estimator for Identifying Hallucinations in Generative AI
Creator: Gupta, Varuna; Chakravarti, Lakshay; Akhtar, Md Moin; Maheshwari, Prem; Garg, Puneet; Tiwari, Dimple
Description: Hallucination, defined as the generation of factually incorrect or ungrounded content, represents a critical challenge in large language models and summarization systems. Existing evaluation metrics often operate at the document level and fail to pinpoint erroneous sentences with sufficient granularity. This work introduces Sentence-Level Risk Estimation (SRE), a unified framework for detecting hallucinations at fine granularity by integrating three complementary signals: semantic alignment using BERT-based embedding similarity, QA-based factuality verification through question-answer pair generation and validation, and Natural Language Inference (NLI) entailment assessment using pre-trained models such as DeBERTa-MNLI. These signals are aggregated into a unified Sentence Risk Score (SRS) via weighted calibration. Experimental evaluation on CNN/DailyMail and XSum datasets demonstrates that the proposed method achieves precision of 0.85, recall of 0.75, F1-score of 0.80, and correlation with human judgments of 0.85, representing substantial improvements over existing approaches including FactCC, QAGS, and SummaC. The proposed framework enables AI systems to flag risky sentences for review or regeneration, thereby improving trust and safety in generative applications. © 2026 IEEE.
Source: Proceedings of the 2026 International Conference on AI-Driven Smart Systems and Ubiquitous Computing, ICAUC 2026; pp. 1619-1626
Date: 01-01-2026
Publisher: Institute of Electrical and Electronics Engineers Inc.
Subject: factuality evaluation; Hallucination detection; large language models; natural language inference; question answering; sentence-level risk estimation
Coverage: Gupta V., Christ University, Bengaluru, 560029, India; Chakravarti L., Vivekananda Institute of Professional Studies, School of Engineering & Technology, Technical Campus, Delhi, 110034, India; Akhtar M.M., Vivekananda Institute of Professional Studies, School of Engineering & Technology, Technical Campus, Delhi, 110034, India; Maheshwari P., Vivekananda Institute of Professional Studies, School of Engineering & Technology, Technical Campus, Delhi, 110034, India; Garg P., Kiet Group of Institutions, Ghaziabad, India; Tiwari D., Vivekananda Institute of Professional Studies, School of Engineering & Technology, Technical Campus, Delhi, 110034, India
Rights: Restricted Access; Hardcopy may be available in the library
Relation: ISBN: 979-833155851-2
Format: online
Language: English
Type: Conference paper
Identifier: https://doi.org/10.1109/ICAUC68182.2026.11441054
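
Illustrative note: the Description says that three per-sentence signals (semantic alignment, QA-based factuality, NLI entailment) are aggregated into a Sentence Risk Score via weighted calibration, but this record does not give the weights, the calibration procedure, or the flagging threshold. The following is a minimal sketch of one way such an aggregation could look, not the authors' method. It assumes each signal has already been computed and normalized to [0, 1], with higher values meaning stronger support from the source; the function names, the weights (0.3, 0.3, 0.4), and the threshold 0.5 are all hypothetical.

```python
def sentence_risk_score(similarity, qa_agreement, entailment,
                        weights=(0.3, 0.3, 0.4)):
    """Combine three support signals into a risk score in [0, 1].

    All three inputs are assumed to lie in [0, 1], higher = better supported.
    The default weights are placeholders, not values from the paper.
    """
    signals = (similarity, qa_agreement, entailment)
    for value in signals:
        if not 0.0 <= value <= 1.0:
            raise ValueError("signals must be in [0, 1]")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted average of the support signals, normalized by the weight sum.
    support = sum(w * v for w, v in zip(weights, signals)) / total
    # Risk is the complement of support: poorly supported sentences score high.
    return 1.0 - support


def flag_risky(scored_sentences, threshold=0.5):
    """Return the sentences whose risk exceeds a (hypothetical) threshold.

    scored_sentences: iterable of (text, (similarity, qa_agreement, entailment)).
    """
    return [text for text, signals in scored_sentences
            if sentence_risk_score(*signals) > threshold]
```

For example, calling flag_risky on a list of (sentence, signals) pairs returns only the sentences to send for review or regeneration. In practice the weights and threshold would be fitted against human annotations rather than set by hand.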

https://www.scopus.com/pages/publications/105037460025?origin=resultslist

Collection

Citation

Gupta, Varuna; Chakravarti, Lakshay; Akhtar, Md Moin; Maheshwari, Prem; Garg, Puneet; Tiwari, Dimple, “A Sentence-Level Risk Estimator for Identifying Hallucinations in Generative AI,” CHRIST (Deemed To Be University) Institutional Repository, accessed June 20, 2026, https://archives.christuniversity.in/items/show/25905.
