Browse Items (14421 total)

Conference Paper

Multilevel Security and Dual OTP System for Online Transaction Against Attacks

With current internet technology, most banking transactions are carried out online. Most of these e-transactions are made through e-commerce websites using credit/debit cards, net banking, and many other payment apps. Every online transaction is therefore exposed to attacks by fraudulent websites and network intruders. Although many security measures have been deployed against such vulnerabilities, network thieves are smart enough to retrieve passwords and break other security mechanisms. In today's digital world, we need to design a secure online transaction system for banking using multilevel encryption with the Blowfish and AES algorithms combined with a dual OTP technique. The performance of the proposed methodology is analyzed with respect to the number of bytes encrypted per unit time, and we conclude that multilevel encryption provides a more secure system with faster encryption than the systems currently in use. 2019 IEEE.
Conference Paper

Multilingual Sentiment Analysis of YouTube Live Stream using Machine Translation and Transformer in NLP

YouTube has become one of the all-inclusive video streaming sources on the internet. Today, news is streamed on YouTube, products are marketed live on YouTube, and the platform has become one of the biggest PR channels for companies. Various companies have proposed optimized ways of understanding and gathering viewers' opinions from YouTube live chat, and of finding the best way to provide relevant and informative content to boost business strategy. This study uses a Natural Language Processing (NLP) based approach along with NLP transformers to classify and analyze sentiment. 2022 IEEE.
Conference Paper

Multilingual Sentiment Analytics for India's NEP 2020

This study presents a multilingual sentiment analysis framework for evaluating public sentiment on India's National Education Policy (NEP) 2020. The authors developed a dataset related to NEP 2020 using web scraping from open sources. The curated dataset comprises 50,000 social media posts (English: 30,000, Hindi: 12,000, Tamil: 8,000) processed through a confidence-gated hybrid annotation pipeline. Sentiment labels were created using Transformer models (BERT, mBERT, XLMR) and validated by native speakers, with F1-scores of 87.6%, 81.2% and 78.0% for English, Hindi and Tamil respectively, outperforming baselines (SVM, Naive Bayes, BiLSTM) by 12-18% (p<0.001). Computational efficiency measurements show that training takes 3.2-5.3 hours and that inference throughput ranges from 118 to 187 posts per second. Topic modeling revealed sentiment divergences: positive for linguistic inclusivity and teacher training, negative for affordability and infrastructure. Cross-linguistic analysis showed English-Hindi convergence (similarity: 0.61) versus Tamil divergence (0.46), reflecting regional priorities. Tamil posts emphasized linguistic identity, while English posts prioritized implementation critiques. Quantitative policy impact analysis shows very strong correlation (r=0.68, p<0.01) between regional sentiment scores and state adoption rates. This open-sourced contribution fills a crucial gap in inclusive policy analytics for multilingual societies and informs evidence-based policy. 2025 IEEE.
Conference Paper

Multilingual Voice-Assisted for Traffic Sign Detection and Classification in Adverse Weather Conditions

In a world where millions of people are wounded in auto accidents each year due to negligence, a lack of understanding of traffic laws, and bad weather, there is an urgent need for greater road safety. This is particularly the case in India, where a disproportionately high number of traffic accidents lead to numerous fatalities. Ignoring traffic signs raises these risks and endangers not only vehicles but also passengers and pedestrians. This project addresses the significant issue of traffic sign recognition in bad weather and offers voice-based instruction in many languages to increase road safety. Using a mix of state-of-the-art technologies, including YOLOv8 for real-time sign detection and the Google Translate API, which supports NLP tasks, this research offers a full solution. The model's remarkable precision and efficacy underscore its capacity to revolutionize traffic safety and furnish a more secure and expedient driving encounter. With the world moving towards more autonomous mobility, this study is laying the groundwork for safer and more effective driving in the future. The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
Article

Multimedia Enhanced Teaching and Learning with Special Reference to Developing Cognitive Skills

Indian Streams Research Journal, Vol-3 (7), pp. 25-28. ISSN-2230-7850
Article

Multimodal artificial intelligence for early cancer detection via liquid biopsy, imaging, and clinical records

Tumours are diverse and multiscale, making it difficult for modern medicine to diagnose cancer early. Using structured clinical data, radiologic imaging features, and liquid biopsy samples, this research presents a multimodal AI framework for the early and reliable detection of cancer. The proposed approach surpasses single-modality approaches by integrating signals from various domains, including genetic, anatomical, and physiological data. Using attention-based fusion, representation learning, and improved preprocessing, we developed a prediction model that fine-tunes the weights of the different modalities. The experimental results demonstrated that it outperformed unimodal models on all datasets in terms of sensitivity, specificity, and generalisation. The framework has potential for screening purposes because of its ability to detect cancer at an early stage. Clinical confidence and interpretability were both boosted by the results of explainability tests, which revealed substantial feature contributions. The suggested multimodal framework outperformed unimodal baselines across all assessment cohorts with an AUC of 0.94, sensitivity of 0.91, and specificity of 0.88. Experimental results confirm that multimodal fusion supports clinically interpretable early cancer detection and decision support for precision oncology. Copyright 2026. Published by Elsevier B.V.
Conference Paper

Multimodal Classification on PET/CT Image Fusion for Lung Cancer: A Comprehensive Survey

Medical image fusion has become essential for accurate diagnosis. For example, a lung cancer diagnosis is currently conducted with the help of multimodality image fusion to find anatomical and functional information about the tumor and metabolic measurements to identify the lung cancer stage and metastatic information of the disease. Generally, the success of multimodality imaging for lung cancer diagnosis is due to the combination of PET and CT imaging advantages while minimizing their respective limitations. However, medical image fusion involves the registration of two different modalities, which is time-consuming and technically challenging, and it is a cause of concern in a clinical setting. Therefore, the paper's main objective is to identify the most efficient medical image fusion techniques and the recent advances by conducting a collective survey. In addition, the study delves into the impact of deep learning techniques for image fusion and their effectiveness in automating the image fusion procedure with better image quality while preserving essential clinical information. The Electrochemical Society
Book Chapter

Multimodal data analytics for climate and water resources management

The incorporation of multimodal data analytics into climate and water resource management has become a groundbreaking strategy for tackling intricate environmental issues. This chapter examines the importance of integrating various data sources, including satellite imagery, weather sensors, textual reports, and social media feeds, to develop a comprehensive perspective on climate and water systems. It addresses key challenges such as data heterogeneity, computational demands, and potential biases while showcasing the significant benefits of multimodal data in enhancing predictive modeling and decision-making. The discussion extends to advanced methodologies for data acquisition, integration, and feature extraction, with a focus on machine learning and deep learning techniques. Additionally, real-world applications in climate prediction, drought and flood forecasting, and water quality assessment are explored. The chapter also considers ethical concerns and future advancements in multimodal analytics, emphasizing the importance of responsible data utilization and innovative research to strengthen climate adaptation and water resource management efforts. 2026 Elsevier Inc. All rights reserved.
Book Chapter

Multimodal data analytics for social media and user behavior

The introduction of social media has prompted an explosion of diverse data types, such as text, images, videos, and audio. Traditional unimodal analysis techniques do not effectively capture the complex interactions between different data types and user activities. Multimodal data analytics addresses this difficulty by fusing different modalities to unlock deeper insights, improving the accuracy and scope of social media analysis. This chapter looks into the significance of multimodal data in understanding user behavior, sentiment analysis, content engagement assessment, and trend prediction. The chapter starts with an exploration of the various sources of data in social media analytics, including text posts, visual content, and user interactions. It then explores the preprocessing and feature extraction strategies used to prepare raw multimodal data for use in machine learning. In-depth methodologies, including natural language processing for text analysis, computer vision for image and video interpretation, and speech recognition for audio processing, are explained in great detail. Integration of these modalities via fusion techniques (early fusion, late fusion, and hybrid models) is also explored. 2026 Elsevier Inc. All rights reserved.
Book Chapter

Multimodal data generation and synthesis

Multimodal data generation and synthesis have become promising new directions in artificial intelligence research, making possible the combination and transformation of different data modalities: text, images, audio, and video. This chapter looks at the principles, methodologies, applications, and challenges linked with multimodal data, drawing attention to current trends and needs regarding multimodal systems and systems approaches to tackle complex real-world challenges across the medical and health care, autonomous systems, entertainment, and extended reality (XR) fields. The chapter introduces multimodal data and discusses how the approach differs from unimodal methods, considering the merits of working with multiple data forms. Multimodal systems present richer and more comprehensive representations that lead to better decision-making and provide better interaction with users. However, the alignment, synchronization, and representation of diverse modalities are inherently difficult. The chapter further discusses state-of-the-art techniques in multimodal synthesis, focusing especially on generative approaches like generative adversarial networks (GANs), variational autoencoders (VAEs), and diffusion models. These methods are shown to facilitate cross-modal transformations, such as text-to-image or audio-to-video synthesis, driving innovation in artificial intelligence and beyond. Applications of multimodal data synthesis are discussed in detail, underscoring its transformative impact. In health care, for instance, synthesizing medical images paired with textual annotations enhances diagnostic accuracy and medical training. Autonomous vehicles benefit from the integration of LiDAR, visual, and auditory data, enabling robust decision-making in real-time environments. Similarly, in entertainment and XR, multimodal synthesis is redefining content creation, making immersive experiences more personalized and dynamic.
The chapter also delves into novel applications such as multimodal translation, exemplified by systems that translate sign language into spoken text, fostering inclusivity and accessibility. Despite its potential, multimodal synthesis faces critical challenges, including bias in data and models, privacy concerns, and the ethical implications of creating hyperrealistic synthetic data, such as deepfakes. All these raise pressing concerns, and addressing these requires robust privacy-preserving techniques, bias-mitigation strategies, and stringent ethical guidelines. 2026 Elsevier Inc. All rights reserved.
Conference Paper

Multimodal Early Fusion Strategy Based on Deep Learning Methods for Cervical Cancer Identification

It is essential to enhance the accuracy of automatic cervical cancer diagnosis by combining multiple forms of information obtained from a patient's primary examination. However, existing multimodal systems are not very effective in detecting correlations between different types of data, leading to low sensitivity but high specificity. This study introduces a deep learning system for automatic diagnosis of cervical cancer that incorporates multiple sources of data. First, a convolutional neural network (CNN) is used to transform the image database into a vector that can be combined with non-image datasets. Subsequently, the nonlinear connections between all image and non-image data are investigated jointly in a deep neural network. The proposed deep learning-based method creates a unified system that takes advantage of both image and non-image data. It achieves an impressive 89.32% sensitivity at 91.6% specificity when diagnosing cervical intraepithelial neoplasia on a wide-ranging dataset. This result is far superior to any single-source system or prior multimodal approaches. The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
Conference Paper

Multimodal Emotion Recognition in Human-Computer Interaction Using MFF-CNN

The rise of technology in the digital era has amplified the importance of understanding human emotions in enhancing human-computer interactions. Traditional interfaces, mainly focused on logical tasks, often miss the nuances of human emotion, creating a gap between human users and technology. Addressing this gap, the development of the Human-Computer Interface for emotional intelligence uses advanced algorithms and deep learning models to accurately recognize emotions from various cues like facial expressions, voice, and written text. This paper presents a significant approach for emotion detection in HCI and the challenges faced in capturing genuine emotional responses. Historically, the emphasis in HCI design was on operational tasks, neglecting emotional nuances. However, the tide is changing toward embedding emotional intelligence into these interfaces, leading to enhanced user experiences. This research introduces the MFF-CNN, a neural network model combining both textual and visual data for accurate emotion detection. Through sophisticated algorithms and the integration of advanced machine learning techniques, this paper presents a refined approach to emotion detection in HCI, supported by a comprehensive review of related works and a detailed methodology. The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
Conference Paper

Multimodal Emotion Recognition Using Deep Learning Techniques

Humans have the ability to perceive and depict a wide range of emotions. There are various models that can recognize seven primary emotions from facial expressions (joyful, gloomy, annoyed, dreadful, wonder, antipathy, and impartial). This can be accomplished by observing various activities such as facial muscle movements, speech, hand gestures, and so forth. Automatic emotion recognition is a significant issue that has been a hotly debated research topic in recent years. At present, several researchers have worked on combining two or more modalities for better understanding. This paper presents a method for emotion recognition that makes use of three modalities: facial images, audio signals, and text, drawn from the FER and CK+, RAVDESS, and Twitter tweets datasets, respectively. The CNN model achieved 66.67 percent accuracy on the FER-2013 dataset of labeled headshots, while on the CK+ dataset 98.4 percent accuracy was obtained. Finally, several fusion strategies were tried, and each of them gave different results. This project is a step towards the sense of interaction between human emotional aspects and the growing technology that is the future of development in today's world. 2022 IEEE.
Article

Multimodal emotional analysis through hierarchical video summarization and face tracking

The era of video data has drawn users into creating, processing, and manipulating videos for various applications. Voluminous video data requires greater computation power and processing time. In this work, a model is developed that can precisely acquire keyframes through hierarchical summarization and use the keyframes to detect faces and assess the emotional intent of the user. The keyframes are used to detect faces with a recursive Viola-Jones algorithm, and an emotional analysis of the extracted faces is conducted using an underlying architecture based on Deep Neural Networks (DNN). This work has contributed significantly to improving the accuracy of face detection and emotional analysis in non-redundant frames. The number of frames selected after summarization using local minima extraction was less than 30%. The recursive routine introduced for face detection reduced false positives across all video frames to less than 2%. The accuracy of emotion prediction on faces acquired from the summarized frames reached 90% on Indian faces. The computational requirement scaled down to 40% because hierarchical summarization removed redundant frames and recursive face detection removed false localization of faces. The proposed model intends to emphasize the importance of keyframe detection and the use of keyframes for facial emotion recognition. 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
Conference Paper

Multimodal Face and Ear Recognition Using Feature Level and Score Level Fusion Approach

Recent years have seen a significant increase in attention to multimodal biometric systems for personal identification, especially in unconstrained environments. This paper presents a multimodal recognition system that combines ear and profile face images through feature level fusion. Multimodal biometric systems combining face and ear can be used in an extensive range of applications because both biometrics can be captured in a non-intrusive manner. A local texture feature descriptor, BSIF, is used to extract discriminative features from biometric templates. Feature level and score level fusion are evaluated to improve the performance of the system. Experimental results on different public datasets like GTAV, FEI, etc., show that the proposed method gives better recognition performance than either individual modality. The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
Book Chapter

Multimodal learning for autonomous systems and robotics

The realm of autonomous systems and robotics is experiencing a paradigm shift driven by the integration of advanced artificial intelligence (AI) techniques and multimodal learning approaches. This abstract explores the latest advancements and research topics that are propelling the field toward more intelligent, efficient, and versatile autonomous systems. Multimodal learning leverages multiple sensory inputs to enhance the perception and decision-making capabilities of autonomous systems. This involves the integration of visual, auditory, tactile, and other sensory data to form a coherent understanding of the environment. Deep learning techniques, such as multimodal neural networks and crossmodal embeddings, play a pivotal role in this integration, enabling the system to learn joint representations and improve robustness in perception under varying conditions. Computer vision remains a cornerstone of autonomous systems, with advancements in techniques such as real-time object detection, tracking, and high-resolution image synthesis through generative adversarial networks. Vision-based reinforcement learning is also gaining traction, enabling systems to learn from visual inputs and improve their decision-making processes in dynamic environments. The integration of advanced sensors, including high-resolution light detection and ranging, radio detection and ranging, and event-based cameras, enhances the capability of autonomous systems to perceive their surroundings accurately. Multisensor data fusion, using methods like Kalman and particle filters, ensures robust perception even in adverse conditions, providing a comprehensive view of the environment. Innovations in actuation and control systems are fundamental for the development of responsive and adaptive robots. Soft robotics, inspired by biological systems, offers new possibilities in design, modeling, and control. 
Hybrid control systems facilitate the coordination of multimodal actuation, enhancing the robot's versatility and performance. The deployment of high-performance embedded systems, incorporating heterogeneous computing architectures (CPU-GPU-FPGA integration), is vital for real-time data processing and decision-making. Neuromorphic computing and AI hardware accelerators provide low-power solutions that are crucial for the efficiency of autonomous systems. Techniques for uncertainty estimation, outlier detection, and anomaly detection are essential for maintaining system reliability. Advanced robotic perception and cognition, combined with cognitive architectures for autonomous reasoning, enable systems to operate safely in complex and dynamic environments. The interface between humans and robots is evolving, with a focus on multimodal human-robot interaction. Learning from human demonstrations and ensuring safety and trust in human-robot teams are critical areas of research, promoting effective collaboration between humans and robots. Advanced simulation techniques, including high-fidelity physics-based simulations and domain randomization, are employed to test and validate autonomous systems. Virtual reality and augmented reality provide immersive environments for training and testing. Real-time simulation and hardware-in-the-loop testing ensure the robustness and reliability of autonomous systems before deployment. Ethical AI and autonomous decision-making frameworks are being developed to address these issues. Privacy-preserving machine learning techniques and cybersecurity measures are essential for protecting sensitive data and ensuring the security of autonomous systems. This comprehensive overview underscores the rapid advancements and multifaceted nature of multimodal learning and autonomous systems, heralding a new era of intelligent and adaptive robotics capable of transforming numerous industries and improving the quality of human life. 2026 Elsevier Inc. All rights reserved.
Book

Multimodal Learning Using Heterogeneous Data

Multimodal Learning Using Heterogeneous Data is a comprehensive guide to the emerging field of multimodal learning, which focuses on integrating diverse data types such as text, images, and audio within a unified framework. The book delves into the challenges and opportunities presented by multimodal data and offers insights into the foundations, techniques, and applications of this interdisciplinary approach. It is intended for researchers and practitioners interested in learning more about multimodal learning and is a valuable resource for those working on projects involving data analysis from multiple modalities. The book begins with a comprehensive introduction, focusing on multimodal learning's foundational principles and the intricacies of heterogeneous data. It then delves into feature extraction, fusion techniques, and deep learning architectures tailored for multimodal data. It also covers transfer learning, pre-processing challenges, and cross-modal information retrieval. The book highlights the application of multimodal learning in specialized contexts such as sentiment analysis, data generation, medical imaging, and ethical considerations. Real-world case studies are woven into the narrative, illuminating the applications of multimodal learning in diverse domains such as natural language processing, multimedia content analysis, autonomous systems, and cognitive computing. The book concludes with an insightful exploration of multimodal data analytics across social media, surveillance, user behavior, and a forward-looking examination of future trends and practical implementations. As a collective resource, Multimodal Learning Using Heterogeneous Data illuminates the powerful utility of multimodal learning to elevate machine learning tasks while also highlighting the need for innovative solutions and methodologies. 
The book acknowledges the challenges associated with deep learning and the growing importance of ethical considerations in the collection and analysis of multimodal data. Overall, Multimodal Learning Using Heterogeneous Data provides an expansive panorama of this rapidly evolving field, its potential for future research and application, and its vital role in shaping machine learning's evolution. 2026 Elsevier Inc. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
Book Chapter

Multimodal sentiment analysis: integrating text, image, and audio

Multimodal sentiment analysis aims to integrate text, images, and audio information to provide a more comprehensive understanding of human emotions and opinions. This chapter reviews key aspects of multimodal sentiment analysis, including feature extraction techniques, fusion methods, modeling approaches, and applications. For feature extraction the chapter discusses lexical, syntactic, and semantic features for text; visual attributes and facial expressions for images; and acoustic properties for audio. Three primary fusion techniques are examined: early fusion, which combines features before classification; late fusion, which integrates outputs from unimodal models; and model-based fusion, which learns joint representations across modalities. The chapter explores traditional machine learning and deep learning modeling approaches, highlighting the effectiveness of neural architectures like CNNs and RNNs. Key application areas discussed include social media analysis, emotion recognition, intelligent transportation, and education. The chapter also outlines future research directions, such as crossmodal learning, multimodal pretraining, and explainable AI. As multimodal data increases, sentiment analysis techniques that can effectively integrate information across modalities will become increasingly crucial for understanding human emotions and opinions in diverse contexts. This review provides a comprehensive overview of current approaches and emerging trends in this rapidly evolving field. 2026 Elsevier Inc. All rights reserved.
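As a rough illustration of the difference between the early and late fusion strategies described in this abstract, the sketch below uses plain Python. The function names, toy feature vectors, scores, and weights are all hypothetical and are not taken from the chapter; model-based fusion is left out because it would require a trained joint model.

```python
from typing import Sequence

Vector = Sequence[float]


def early_fusion(features_by_modality: Sequence[Vector]) -> list[float]:
    """Early fusion: concatenate per-modality features into one joint vector.

    A single classifier (not shown) would then be trained on the joint vector.
    """
    fused: list[float] = []
    for features in features_by_modality:
        fused.extend(features)
    return fused


def late_fusion(scores_by_modality: Sequence[float],
                weights: Sequence[float]) -> float:
    """Late fusion: weighted average of scores from separate unimodal models."""
    total_weight = sum(weights)
    weighted_sum = sum(s * w for s, w in zip(scores_by_modality, weights))
    return weighted_sum / total_weight


# Hypothetical toy features for one post (text, image, audio).
text_features = [0.2, 0.7]
image_features = [0.5]
audio_features = [0.1, 0.9, 0.4]

# Early fusion yields one 6-dimensional input for a single classifier.
joint = early_fusion([text_features, image_features, audio_features])

# Late fusion combines three hypothetical unimodal sentiment scores,
# here trusting the text model twice as much as the others.
combined = late_fusion([0.8, 0.6, 0.4], weights=[2.0, 1.0, 1.0])
```

The trade-off this is meant to show: early fusion lets one model see cross-modal interactions but needs aligned features, while late fusion keeps the unimodal models independent and only merges their outputs.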
Book Chapter

Multiobjective portfolio optimization using multilevel quantum inspired optimization algorithms: a comparative study

The study of portfolio optimization has been a significant focus for computer science and finance researchers, with frequent publication of innovative methods. Numerous works have illustrated that conventional approaches like quadratic programming struggle with nonlinear constraints. This chapter compares ant colony optimization and particle swarm intelligence optimization within classical and quantum inspired frameworks, utilizing qubits and qutrits. This study analyzes benchmark datasets from the NASDAQ, Dow Jones, and BSE spanning over a decade. A pioneering effort has been made to develop a multiobjective portfolio optimization technique through a multilevel quantum inspired optimization algorithm. The experimental results demonstrate that the quantum inspired metaheuristic technique that utilizes qutrits slightly outperforms classical and qubit based quantum inspired methods. 2026 Elsevier Inc. All rights reserved.
Conference Paper

Multiple Approaches in Retail Analytics to Augment Revenues

Knowledge is power. The retail sector has been revolutionized by the plentiful product knowledge available to customers around the clock. Today, customers can use the knowledge available online at any time to study, compare and purchase products from anywhere. Retail companies can stay ahead of shopper trends by using retail information analytics to discover and analyze online and in-store shopper patterns. A product recommender suggests products from a wide selection that would otherwise be very difficult for the customer to locate. The algorithm recommends various products and increases the sales of items that would otherwise be difficult to sell. Market basket analysis is a common use case for frequent pattern mining; it involves analyzing the transactional data of a retail store to decide which items are bought together. To this end, data from an online resource was collected and analyzed, and several conclusions were drawn. 2021, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
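The idea of market basket analysis mentioned in this abstract, finding items that are bought together, can be sketched in a few lines of Python. This is only a minimal pair-counting illustration, not the method used in the paper; the function name `frequent_pairs`, the toy baskets, and the support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs whose support (share of baskets) is >= min_support."""
    pair_counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each pair is counted once per basket
        # and always appears in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    n = len(transactions)
    return {
        pair: count / n
        for pair, count in pair_counts.items()
        if count / n >= min_support
    }


# Hypothetical toy transactions, one list of items per purchase.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]

# Keep only pairs that occur in at least half of the baskets.
result = frequent_pairs(baskets, min_support=0.5)
```

Full frequent-pattern algorithms such as Apriori extend this counting step to larger itemsets and derive association rules from the supports.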
