A Document Clustering Approach Using Shared Nearest Neighbour Affinity, TF-IDF and Angular Similarity
- Title
- A Document Clustering Approach Using Shared Nearest Neighbour Affinity, TF-IDF and Angular Similarity
- Creator
- Goswami M.
- Description
- The volume of data is growing exponentially. Clustering is a major task in many text mining applications, including automatic organization of text documents, topic extraction, information retrieval and information filtering. It reveals identical patterns in a collection of documents. Understanding, representing and categorizing documents require various techniques, and the text clustering process draws on both natural language processing and machine learning. An unsupervised spatial pattern identification approach is proposed for text data: a new algorithm, based on the shared nearest neighbour, for finding coherent patterns in a huge collection of text data. Implementation followed by validation confirms that the proposed algorithm can cluster text data to identify coherent patterns. The results are visualized using a graph and show that the methodology works well for different text datasets. © 2021, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
- Source
- Lecture Notes on Data Engineering and Communications Technologies, Vol-57, pp. 267-276.
- Date
- 2021-01-01
- Publisher
- Springer Science and Business Media Deutschland GmbH
- Subject
- Clustering; DBSCAN; Density-based; Document classification; Similarity; SNN; Unstructured data email documents
- Coverage
- Goswami M., CHRIST (Deemed to be University), Bengaluru, India
- Rights
- Restricted Access
- Relation
- ISSN: 2367-4512
- Format
- Online
- Language
- English
- Type
- Book chapter
Citation
Goswami M., “A Document Clustering Approach Using Shared Nearest Neighbour Affinity, TF-IDF and Angular Similarity,” CHRIST (Deemed To Be University) Institutional Repository, accessed February 24, 2025, https://archives.christuniversity.in/items/show/18778.