Real-Time Application of Document Classification Based on Machine Learning
- Title
- Real-Time Application of Document Classification Based on Machine Learning
- Creator
- Guha A.; Samanta D.
- Description
- This research was carried out with a real-time application in mind: classifying documents that are multi-page, of varying length, and based on scanned images. The history of a property title is captured in various documents recorded against that property, in countries across the world. Information about the property, from ownership to conveyance, mortgage, refinance, etc., is buried in these documents. Managing these digitized documents is still largely a human-driven process. Categorizing the documents is the primary step toward automating their management and retrieving information intelligently with minimal or no human intervention. In this research, we examined a popular supervised machine learning technique, the support vector machine (SVM), on a heterogeneous data set of six categories of property-related documents. The model obtained an accuracy of 88.06% in classifying over 988 test documents. © 2020, Springer Nature Switzerland AG.
- Source
- Learning and Analytics in Intelligent Systems, Vol-9, pp. 366-379.
- Date
- 2020-01-01
- Publisher
- Springer Nature
- Subject
- Balanced accuracy; Document classification; OCR; SVM; t-SNE; Text analytics; tf-idf
- Coverage
- Guha A., Eagle Labs, First American India Private LTD., Bangalore, India; Samanta D., Department of Computer Science, Christ University (Deemed to be), Bangalore, India
- Rights
- Restricted Access
- Relation
- ISSN: 2662-3447
- Format
- Online
- Language
- English
- Type
- Conference paper
Collection
Citation
Guha A.; Samanta D., “Real-Time Application of Document Classification Based on Machine Learning,” CHRIST (Deemed To Be University) Institutional Repository, accessed February 25, 2025, https://archives.christuniversity.in/items/show/20718.