An exploration of the impact of Feature quality versus Feature quantity on the performance of a machine learning model
- Title
- An exploration of the impact of Feature quality versus Feature quantity on the performance of a machine learning model
- Creator
- Bhayani K.; Tanna D.; Maan V.; Dhiraj; Kumar S.
- Description
- About 0.62 trillion bytes of data are generated globally every hour, and the figure keeps rising with digitalization and social networks. Data ecosystems capture, store, and manage this big data so that it can be analyzed and its value extracted, which makes it a gold mine for the companies that research and use it. This shows how essential and valuable data has become. For any machine learning model, the selection of data is critical. In this paper, several experiments are performed to assess the importance of data quality versus data quantity for model performance. The experiments compare the richness of the data in terms of feature quality (e.g., features in images) against the amount of data available to a machine learning model. Images are divided into two sets based on their features, redundant features are removed, and a machine learning model is then trained. The model trained on non-redundant data achieves the highest accuracy (>80%) in all cases compared with the model trained on all features, demonstrating the importance of feature variability and not just feature count. © 2023 IEEE.
- Source
- Proceedings of IEEE InC4 2023 - 2023 IEEE International Conference on Contemporary Computing and Communications
- Date
- 2023-01-01
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Subject
- Deep Learning; Feature engineering; Machine Learning
- Coverage
- Bhayani K., Birla Institute of Technology and Science, Rajasthan, Pilani, India; Tanna D., Birla Institute of Technology and Science, Rajasthan, Pilani, India; Maan V., Mody University, Department of Computer Science, Rajasthan, Sikar, India; Dhiraj, CSIR-CEERI, Rajasthan, Pilani, India; Kumar S., CHRIST (Deemed to Be University), Department of Computer Science and Engineering, Bangalore, India
- Rights
- Restricted Access
- Relation
- ISBN: 979-835033577-4
- Format
- Online
- Language
- English
- Type
- Conference paper
Citation
Bhayani K.; Tanna D.; Maan V.; Dhiraj; Kumar S., “An exploration of the impact of Feature quality versus Feature quantity on the performance of a machine learning model,” CHRIST (Deemed To Be University) Institutional Repository, accessed February 25, 2025, https://archives.christuniversity.in/items/show/19868.