Dublin Core
The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information, see http://dublincore.org/documents/dces/.

Title
A name given to the resource
Faculty Publications Conference Paper Faculty Publications- Conference Papers

Dublin Core

Creator
An entity primarily responsible for making the resource
Thomas, Aldrin P.; George, Shiju; Raj, N. Anand; Ajaz, S. Mohemmed; Shaju, Midhun; Nasim, V. Akil

Title
A name given to the resource
Decision Flow Tracing and Word Impact Analysis in Hybrid Transformer-Conditioned Diffusion Models for Text-to-Image Generation

Date
A point or period of time associated with an event in the lifecycle of the resource
01-01-2026

Source
A related resource from which the described resource is derived
Lecture Notes in Networks and Systems; Volume 1927 LNNS; pp. 163-174

Identifier
An unambiguous reference to the resource within a given context
https://doi.org/10.1007/978-3-032-22914-4_13
https://www.scopus.com/pages/publications/105040396373?origin=resultslist

Coverage
The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant
Thomas A.P., AI and Data Science Engineering, Christ University, Karnataka, Bangalore, India; George S., AI and Data Science Engineering, Christ University, Karnataka, Bangalore, India; Raj N.A., AI and Data Science Engineering, Christ University, Karnataka, Bangalore, India; Ajaz S.M., AI and Data Science Engineering, Christ University, Karnataka,
Bangalore, India; Shaju M., AI and Data Science Engineering, Christ University, Karnataka, Bangalore, India; Nasim V.A., AI and Data Science Engineering, Christ University, Karnataka, Bangalore, India

Description
An account of the resource
Text-to-image diffusion models have become a cornerstone of modern generative AI, offering high-quality synthesis yet remaining constrained by their black-box nature, which limits controllability and interpretability. In this work, we propose a hybrid transformer-conditioned diffusion model that integrates UNet-based denoising with multi-head cross-attention transformer blocks at critical latent stages of the diffusion process. The architecture is trained on a curated set of 50,000 samples from DiffusionDB with a 200-step latent diffusion schedule. Text prompts are encoded using a 16-token BERT encoder and mapped into a 256-dimensional latent feature space. Cross-attention layers with eight heads are interlaced within the UNet bottleneck and decoder, enabling token-to-region correspondence and fine-grained semantic propagation. To ensure interpretability, we design an explainability framework that combines hierarchical token-level attention heat maps, temporal attention rollouts, and perceptual ablation studies based on learned image patch similarity. Analysis reveals that object tokens remain spatially and temporally consistent, while attribute tokens demonstrate sharper temporal volatility. Jensen-Shannon divergence quantifies this redistribution of attention across diffusion steps. Experimental evaluation against a standard UNet diffusion baseline demonstrates clear improvements: Fréchet Inception Distance decreases by 19.6, CLIP alignment score increases by 5.4, and Inception Score improves by 18.6. Moreover, attention coherence improves by 22%, underscoring the gains in explainability. The proposed framework establishes a pathway toward accountable, high-fidelity, and interpretable text-to-image synthesis.
Beyond performance, it supports critical tasks such as bias evaluation, fairness auditing, and quality assurance, offering a robust foundation for the next generation of explainable generative AI systems. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.

Subject
The topic of the resource
Cross-Attention; DiffusionDB; Hybrid Transformer Diffusion; Interpretable Generative Modeling; Prompt Engineering; Semantic Propagation

Publisher
An entity responsible for making the resource available
Springer Science and Business Media Deutschland GmbH

Relation
A related resource
ISSN: 23673370; ISBN: 978-303222913-7

Language
A language of the resource
English

Type
The nature or genre of the resource
Conference paper

Rights
Information about rights held in and over the resource
Restricted Access; Hardcopy may be available in the library

Format
The file format, physical medium, or dimensions of the resource
online