A Multi-Modal Approach to Digital Document Stream Segmentation for Title Insurance Domain
- Title
- A Multi-Modal Approach to Digital Document Stream Segmentation for Title Insurance Domain
- Creator
- Guha A.; Alahmadi A.; Samanta D.; Khan M.Z.; Alahmadi A.H.
- Description
- In the twenty-first century, storing and managing digital documents has become commonplace across corporate and public sectors worldwide. Physical documents are scanned in batches and stored in a digital archive as a heterogeneous document stream, referred to as a digital package. To facilitate Robotic Process Automation (RPA), the document stream must be automatically segmented into a set of independent, coherent multi-page documents by detecting the appropriate document boundaries. This is a common requirement of the Automated Document Management Systems (ADMS) of a Title Insurance (TI) company, where business operations are automated using RPA and the goal is to extract information from digital documents with minimal user intervention. The current study proposes and evaluates a multi-modal binary classification network that incorporates textual and visual features of digital document pages, and compares it with state-of-the-art baseline methods. Image and textual features are extracted simultaneously from the input document image by passing it through a Visual Geometry Group 16 Convolutional Neural Network (VGG16-CNN) and a pre-trained Bidirectional Encoder Representations from Transformers model (Legal-BERT$_{base}$), respectively, using transfer learning. The two feature sets are then fused and passed through a fully connected layer of a Multi-Layer Perceptron (MLP) to classify each page as either First Page (FP) or Other Page (OP). Real-time document image streams from a production business process archive were obtained from a reputed TI company for the study. The obtained $F_{1}$ scores of 97.37% and 97.15% are significantly higher than the accuracies of the two baseline models considered and well above the expected Straight Through Pass (STP) threshold defined by the process admin. © 2013 IEEE.
- Source
- IEEE Access, Vol-10, pp. 11341-11353.
- Date
- 2022-01-01
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Subject
- BERT; binary classification; multi modal training; Page stream segmentation; title insurance; VGG16
- Coverage
- Guha A., Department of Data Science, CHRIST (Deemed to be University), Karnataka, Bengaluru, 560029, India, First American India Private Ltd., Karnataka, Bengaluru, 560038, India; Alahmadi A., Department of Computer Science and Information, Taibah University, Medina, 42353, Saudi Arabia; Samanta D., Department of Computer Science, CHRIST (Deemed to be University), Karnataka, Bengaluru, 560029, India; Khan M.Z., Department of Computer Science and Information, Taibah University, Medina, 42353, Saudi Arabia; Alahmadi A.H., Department of Computer Science and Information, Taibah University, Medina, 42353, Saudi Arabia
- Rights
- All Open Access; Gold Open Access
- Relation
- ISSN: 2169-3536
- Format
- Online
- Language
- English
- Type
- Article
Citation
Guha A.; Alahmadi A.; Samanta D.; Khan M.Z.; Alahmadi A.H., “A Multi-Modal Approach to Digital Document Stream Segmentation for Title Insurance Domain,” CHRIST (Deemed To Be University) Institutional Repository, accessed April 17, 2025, https://archives.christuniversity.in/items/show/15490.
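
Illustrative sketch
The Description above frames page stream segmentation as a per-page binary classification into First Page (FP) and Other Page (OP). The following minimal Python sketch shows how such per-page labels could be turned into document boundaries. It is not taken from the article: the function name `segment_stream`, the string labels, and the handling of a stream that starts with an OP page are assumptions made for illustration, and the classifier that would produce the labels (VGG16 and Legal-BERT features fused in an MLP) is not modelled here.

```python
from typing import List, Sequence


def segment_stream(labels: Sequence[str]) -> List[List[int]]:
    """Group page indices into documents from per-page FP/OP labels.

    Hypothetical helper: each "FP" label opens a new document and each
    "OP" label extends the document currently being built.
    """
    documents: List[List[int]] = []
    for page_index, label in enumerate(labels):
        if label not in ("FP", "OP"):
            raise ValueError(f"unexpected label {label!r} at page {page_index}")
        # Assumption: an "OP" at the very start of the stream has no
        # preceding first page, so it also opens a document.
        if label == "FP" or not documents:
            documents.append([page_index])
        else:
            documents[-1].append(page_index)
    return documents


if __name__ == "__main__":
    # Made-up labels for a six-page stream, for illustration only.
    predicted = ["FP", "OP", "OP", "FP", "FP", "OP"]
    print(segment_stream(predicted))  # [[0, 1, 2], [3], [4, 5]]
```

Under this framing, a single misclassified FP either merges two documents or splits one, which is why the per-page classification quality reported in the Description translates directly into segmentation quality.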