Enhancing environmental sound classification with weighted attention-based spectrogram fusion and overlapping pre-patching
- Title
- Enhancing environmental sound classification with weighted attention-based spectrogram fusion and overlapping pre-patching
- Creator
- Presannakumar, Krishna; Mohamed, Anuj
- Description
- Environmental Sound Classification (ESC) remains challenging due to the diverse and overlapping acoustic characteristics of real-world environments. Traditional models relying on single-feature representations such as Mel spectrograms often fail to capture the full range of spectral and temporal details. This paper introduces a novel algorithm, Weighted Attention-based Spectrogram Fusion (WASF), that adaptively integrates Mel spectrograms, Cochleograms, and Correlograms using a hierarchical attention mechanism across channel, temporal, and frequency dimensions. Unlike traditional fusion techniques, WASF uses a learnable attention mechanism to dynamically weight each feature's importance over time and frequency, improving the model's capacity to focus on important acoustic cues. In addition, an overlapping pre-patching strategy is proposed to preserve local temporal continuity, enhancing transformer-based modeling. The proposed model achieves 95.71% accuracy on UrbanSound8K, 93.97% on ESC-50, and 94.91% on ESC-10. Extensive ablation studies and interpretability analysis validate the effectiveness of each component, demonstrating robustness across diverse acoustic environments and noise conditions. The computational efficiency and interpretable attention patterns make the approach suitable for real-time deployment in smart city applications, surveillance systems, and assistive technologies. © 2025 Elsevier B.V.
- Source
- Applied Soft Computing; Volume 186; Article No. 114192
- Date
- 01-01-2026
- Publisher
- Elsevier Ltd
- Subject
- Audio sound signal; Deep learning; Environmental sound classification; Feature representation; Signal processing; Transformers
- Coverage
- Presannakumar K., Department of Computer Science, School of Sciences, CHRIST (Deemed to be University), Karnataka, Bangalore, 560029, India; Mohamed A., School of Computer Sciences, Mahatma Gandhi University, Kerala, Kottayam, 686560, India
- Rights
- Restricted Access; Hardcopy may be available in the library
- Relation
- ISSN: 1568-4946
- Format
- online
- Language
- English
- Type
- Article
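
The two ideas named in the Description, attention-weighted fusion of several spectrogram representations and overlapping pre-patching along time, can be illustrated with a minimal NumPy sketch. This is not the authors' WASF implementation: the function names `fuse_weighted` and `overlapping_patches`, the array shapes, and the use of a single softmax over per-bin scores (in place of the paper's hierarchical, learned channel/temporal/frequency attention) are all assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_weighted(features, scores):
    """Fuse stacked representations with per-bin weights.

    features: array of shape (R, F, T), e.g. R = 3 for Mel spectrogram,
              cochleogram, and correlogram resampled to a common grid.
    scores:   array of shape (R, F, T) holding attention logits. In a real
              model these would be learned; here they are plain inputs
              (an assumption of this sketch).
    Returns an (F, T) array: a convex combination of the R representations
    at every time-frequency bin.
    """
    weights = softmax(scores, axis=0)  # weights sum to 1 across R
    return (weights * features).sum(axis=0)


def overlapping_patches(spec, patch_len, hop):
    """Split an (F, T) array into time patches that overlap when hop < patch_len.

    Returns an array of shape (N, F, patch_len). Trailing frames that do not
    fill a whole patch are dropped.
    """
    n_freq, n_frames = spec.shape
    if not 0 < hop <= patch_len:
        raise ValueError("hop must satisfy 0 < hop <= patch_len")
    if n_frames < patch_len:
        raise ValueError("input has fewer frames than patch_len")
    starts = range(0, n_frames - patch_len + 1, hop)
    return np.stack([spec[:, s:s + patch_len] for s in starts])


# Toy data standing in for three aligned representations (hypothetical sizes).
rng = np.random.default_rng(0)
features = rng.random((3, 64, 100))
scores = np.zeros_like(features)  # equal logits give uniform weights

fused = fuse_weighted(features, scores)
patches = overlapping_patches(fused, patch_len=16, hop=8)
print(fused.shape, patches.shape)
```

With a hop of half the patch length, each patch shares its second half with the first half of the next one, which is one simple way to keep local temporal context across patch boundaries before the patches are passed to a transformer.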
Citation
Presannakumar, Krishna; Mohamed, Anuj, “Enhancing environmental sound classification with weighted attention-based spectrogram fusion and overlapping pre-patching,” CHRIST (Deemed To Be University) Institutional Repository, accessed June 19, 2026, https://archives.christuniversity.in/items/show/22199.
