Study of State-of-the-Art Performance Metrics in NLP: Specifically for Text Summarization in the Medical Domain Using the SumPubMed Dataset
- Title
- Study of State-of-the-Art Performance Metrics in NLP: Specifically for Text Summarization in the Medical Domain Using the SumPubMed Dataset
- Creator
- Sridhar, M.; Irfan, Mohammad; Kathirmani, Sukumar
- Description
- Text summarization is becoming increasingly important given the number of documents produced each year across domains. In this paper we explore traditional metrics for text summarization, such as ROUGE, BLEU, and METEOR, and look at improving on existing metrics by taking the state-of-the-art untrained metric SUPERT and combining it with a readability score and a penalty for long summaries. The SUMPUBMED dataset was used for this research, and a BERT extractive summarizer was used to generate the summaries. It was found that using a readability score with an unsupervised metric such as SUPERT helped in assessing the quality of a summary more accurately than earlier metrics. We compared metrics such as SUPERT scores and BERT scores with and without the human-annotated summaries in the SUMPUBMED dataset and found that the untrained, reference-free metrics perform better than those that use an annotated reference summary. © 2025 Scrivener Publishing LLC.
- Source
- Generative AI: Disruptive Technologies for Innovative Applications;pp.91-106
- Date
- 01-01-2025
- Publisher
- Wiley
- Subject
- BERT scores; Metrics; NLP; SUPERT; Text summarization
- Coverage
- Sridhar M., NSB Academy, Business School, Bangalore, India; Irfan M., Christ University, Bangalore, India; Kathirmani S., Quelit Innovations Pvt Ltd, Chennai, India
- Rights
- Restricted Access; Hardcopy may be available in the library
- Relation
- ISBN: 978-139430293-2; 978-139430290-1
- Format
- online
- Language
- English
- Type
- Book chapter
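Illustrative note: the Description mentions combining a reference-free score such as SUPERT with a readability score and a penalty for long summaries. The chapter's actual formula is not given in this record, so the following is only a minimal sketch under stated assumptions. It assumes the base score is already computed elsewhere and scaled to [0, 1], uses Flesch Reading Ease with a rough syllable heuristic as a stand-in readability measure, and applies a hypothetical length budget (`max_ratio`) and blend weight (`weight`). All function names and parameters are illustrative, not the authors'.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per run of consecutive vowels, minimum 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease; higher values indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


def length_penalty(summary_words: int, source_words: int,
                   max_ratio: float = 0.2) -> float:
    """1.0 while the summary fits the budget, shrinking as it overshoots.

    max_ratio is an assumed budget (summary length / source length).
    """
    if summary_words <= 0 or source_words <= 0:
        return 0.0
    budget = max_ratio * source_words
    return min(1.0, budget / summary_words)


def combined_score(base_score: float, summary: str, source: str,
                   weight: float = 0.8, max_ratio: float = 0.2) -> float:
    """Blend a precomputed base score in [0, 1] with readability, then
    scale by the length penalty. The weighting scheme is an assumption."""
    readability = min(max(flesch_reading_ease(summary), 0.0), 100.0) / 100.0
    n_summary = len(re.findall(r"[A-Za-z]+", summary))
    n_source = len(re.findall(r"[A-Za-z]+", source))
    blended = weight * base_score + (1.0 - weight) * readability
    return blended * length_penalty(n_summary, n_source, max_ratio)
```

With these assumptions, a summary that stays within 20% of the source length is scored purely on the blended quality and readability term, while a longer one is scaled down in proportion to how far it exceeds the budget.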
Citation
Sridhar, M.; Irfan, Mohammad; Kathirmani, Sukumar, “Study of State-of-the-Art Performance Metrics in NLP: Specifically for Text Summarization in the Medical Domain Using the SumPubMed Dataset,” CHRIST (Deemed To Be University) Institutional Repository, accessed June 19, 2026, https://archives.christuniversity.in/items/show/23930.
