Synergizing Senses: Advancing Multimodal Emotion Recognition in Human-Computer Interaction with MFF-CNN
- Title
- Synergizing Senses: Advancing Multimodal Emotion Recognition in Human-Computer Interaction with MFF-CNN
- Creator
- Upreti K.; Vats P.; Malik K.; Verma R.; Divakaran P.; Gangwar D.
- Description
- Emotion detection plays a central role in making interactions between humans and computers more authentic and effective. This work presents a novel approach to multimodal emotion recognition using the MFF-CNN framework, which combines multimodal fusion with convolutional neural networks and is designed to efficiently capture and integrate data from several modalities, including speech and human facial expressions. The proposed system first gathers a multimodal dataset annotated with emotion labels. Extracted facial landmarks and spectrogram representations of the speech signal serve as input features to the MFF-CNN. The model uses convolutional layers to learn hierarchical spatial and temporal structures, improving its ability to recognize subtle emotional cues. Experimental evaluation shows that the MFF-CNN outperforms conventional unimodal emotion recognition algorithms: fusing the speech and facial modalities yields improved precision, reliability, and adaptability across a range of emotional states. In addition, visualization methods improve the interpretability of the model and offer insights into the learned representations. By providing a practical and interpretable method for multimodal emotion recognition, this study advances the field of human-computer interaction. The MFF-CNN architecture demonstrates its potential for practical applications, opening the door to more natural and emotionally aware human-computer interactions. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
- Source
- Lecture Notes in Networks and Systems, Vol. 1047 LNNS, pp. 279-288.
- Date
- 2024-01-01
- Publisher
- Springer Science and Business Media Deutschland GmbH
- Subject
- Deep Learning and Emotion Classification; Facial Expressions; Human-Computer Interaction; MFF-CNN (Multimodal Fusion and Convolutional Neural Network); Multimodal Emotion Recognition; Speech Signals
- Coverage
- Upreti K., Department of Computer Science, CHRIST (Deemed to Be University), Delhi NCR, Ghaziabad, India; Vats P., Department of Computer Science and Engineering, SCSE, Manipal University Jaipur, Jaipur, Rajasthan, India; Malik K., School of Law, CHRIST (Deemed to Be University), Delhi NCR, Ghaziabad, India; Verma R., School of Business and Management, CHRIST (Deemed to Be University), Delhi NCR, Ghaziabad, India; Divakaran P., School of Business and Management, Himalayan University, Itanagar, Arunachal Pradesh, India; Gangwar D., Department of Management, Babu Banarasi Das Institute of Technology and Management, Lucknow, India
- Rights
- Restricted Access
- Relation
- ISSN: 2367-3370; ISBN: 978-3-031-64835-9
- Format
- Online
- Language
- English
- Type
- Conference paper
- Collection
- Citation
- Upreti K.; Vats P.; Malik K.; Verma R.; Divakaran P.; Gangwar D., “Synergizing Senses: Advancing Multimodal Emotion Recognition in Human-Computer Interaction with MFF-CNN,” CHRIST (Deemed To Be University) Institutional Repository, accessed February 28, 2025, https://archives.christuniversity.in/items/show/19366.