Information extraction and text mining of Ancient Vattezhuthu characters in historical documents using image zoning
- Title
- Information extraction and text mining of Ancient Vattezhuthu characters in historical documents using image zoning
- Creator
- Vellingiriraj E.K.; Balamurugan M.; Balasubramanie P.
- Description
- The aim of this paper is to develop a system that recognizes Brahmi, Grantha and Vattezhuthu characters in palm manuscripts of ancient historical Tamil documents, analyzes the text, and machine-translates it into present-day Tamil digital text format. Although many researchers have implemented various algorithms and techniques for character recognition in different languages, converting ancient characters still poses a big challenge. Image recognition technology has reached near-perfection in scanning English and other language text, but optical character recognition (OCR) software capable of digitizing printed Tamil text with high accuracy is still elusive. Only a few people are familiar with the ancient characters and attempt to convert them into written documents manually. The proposed system overcomes this situation by converting ancient historical documents from inscriptions and palm manuscripts into Tamil digital text format, encoded using Tamil Unicode. Our algorithm comprises four stages: i) image preprocessing, ii) feature extraction, iii) character recognition and iv) digital text conversion. In the first phase, the conversion accuracy of our algorithm for the Brahmi script is 91.57% using the neural network and image zoning method. The second phase, covering the Vattezhuthu character set, is to be implemented. Conversion accuracy of Vattezhuthu is 89.75%. © 2016 IEEE.
- Source
- Proceedings of the 2016 International Conference on Asian Language Processing, IALP 2016, pp. 37-40.
- Date
- 2017-01-01
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Subject
- Character Recognition; Image Zoning; Machine Translation; Segmentation; Vattezhuthu
- Coverage
- Vellingiriraj E.K., Department of Computer Science and Engineering, Kongu Engineering College, Perundurai, Tamil Nadu, India; Balamurugan M., Department of Computer Science and Engineering, Christ University, Bangalore, Karnataka, India; Balasubramanie P., Department of Computer Science and Engineering, Kongu Engineering College, Perundurai, Tamil Nadu, India
- Rights
- Restricted Access
- Relation
- ISBN: 978-150900921-3
- Format
- Online
- Language
- English
- Type
- Conference paper
Collection
Citation
Vellingiriraj E.K.; Balamurugan M.; Balasubramanie P., “Information extraction and text mining of Ancient Vattezhuthu characters in historical documents using image zoning,” CHRIST (Deemed To Be University) Institutional Repository, accessed February 24, 2025, https://archives.christuniversity.in/items/show/20947.
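
Note on image zoning: the Description names image zoning as the feature extraction method but gives no detail. The following is a minimal sketch of one common form of zoning, assuming a binarized glyph image (1 = ink, 0 = background) and per-zone ink density as the feature. The function name `zone_densities`, the grid size, and the sample glyph are hypothetical and are not taken from the paper.

```python
def zone_densities(image, rows, cols):
    """Split a binary image into rows x cols zones and return the
    fraction of foreground (1) pixels in each zone, in row-major order."""
    height = len(image)
    width = len(image[0])
    features = []
    for r in range(rows):
        # Integer boundaries let zones tile the image even when
        # the image size is not an exact multiple of the grid size.
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            area = (bottom - top) * (right - left)
            ink = sum(image[y][x]
                      for y in range(top, bottom)
                      for x in range(left, right))
            # Empty zones (grid finer than the image) get density 0.0.
            features.append(ink / area if area else 0.0)
    return features


# Hypothetical 4x4 binarized glyph, for illustration only.
glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
features = zone_densities(glyph, 2, 2)
print(features)
```

A feature vector of this kind could then be passed to a classifier such as the neural network mentioned in the Description; the record does not say how the authors combine the two.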