Truth Twisters: Large Language Models Beating Humans at Fake News
- Title
- Truth Twisters: Large Language Models Beating Humans at Fake News
- Creator
- Jaiswal, Saakshi; Varghese, Nisha; Sridevi, R.
- Description
- Misinformation has become a serious global problem, affecting public opinion and decision-making in areas such as politics, healthcare, and social movements. With the rise of advanced artificial intelligence, especially large language models (LLMs), the landscape of misinformation generation has changed dramatically. These models, which are known for generating coherent, human-like responses, can also be used to generate plausible but false or harmful content. This study examines the dual nature of LLMs and highlights their potential misuse to create misinformation that can evade detection mechanisms. The objective of this article is to explain how LLMs can be manipulated through prompt engineering and vocabulary attacks, where adversaries use obfuscated or subtly altered language to bypass content filters and safety guidelines. Despite being fine-tuned for ethical alignment, many LLMs can still be 'jailbroken', a process by which users modify prompts to elicit inappropriate or restricted outputs. Through a series of controlled experiments, we demonstrate the susceptibility of state-of-the-art LLMs to such adversarial inputs. These findings raise serious concerns about the deployment of LLMs in open-ended environments. Although these models offer immense possibilities for innovation and productivity, their susceptibility to manipulation underscores the urgent need for strong safety measures. We conclude by discussing ethical implications and proposing strategies to reduce misuse, such as better adversarial training, strict deployment protocols, and continuous monitoring, to balance safety and innovation in AI. © 2025 IEEE.
- Source
- 2025 9th International Conference on Computational System and Information Technology for Sustainable Solutions, CSITSS 2025
- Date
- 01-01-2025
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Subject
- adversarial attacks; content safety; jailbreaking; LLMs; misinformation; NLP
- Coverage
- Jaiswal S., Christ University, Department of Computer Science, Bengaluru, India; Varghese N., Christ University, Department of Computer Science, Bengaluru, India; Sridevi R., Christ University, Department of Computer Science, Bengaluru, India
- Rights
- Restricted Access; Hardcopy may be available in the library
- Relation
- ISBN: 979-833158894-6
- Format
- online
- Language
- English
- Type
- Conference paper
Citation
Jaiswal, Saakshi; Varghese, Nisha; Sridevi, R., “Truth Twisters: Large Language Models Beating Humans at Fake News,” CHRIST (Deemed To Be University) Institutional Repository, accessed June 18, 2026, https://archives.christuniversity.in/items/show/25808.
