Comparison of Gradient Boosting and Extreme Boosting Ensemble Methods for Webpage Classification
- Title
- Comparison of Gradient Boosting and Extreme Boosting Ensemble Methods for Webpage Classification
- Creator
- Dutta J.; Kim Y.W.; Dominic D.
- Description
- Web page classification is an important task in areas such as web content filtering, contextual advertising, and maintaining or expanding web directories. Machine learning methods have been found to classify web pages well, and ensemble models have been used to improve on the results obtained from single classifiers. This work uses the Gradient Boosting and Extreme Boosting ensemble models for binary classification. The dataset, which contains URLs of web pages, was collected manually. The comparison between the two boosting algorithms validated the improvement in accuracy and speed obtained through Extreme Boosting, which was found to be around ten times faster than Gradient Boosting and also shows an improvement in accuracy. An evaluation of three preprocessing techniques (lemmatization, stop word removal and regular expressions) shows that they improve the accuracy of the results, but not significantly. © 2020 IEEE.
- Source
- Proceedings - 2020 5th International Conference on Research in Computational Intelligence and Communication Networks, ICRCICN 2020, pp. 77-82.
- Date
- 2020-01-01
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Subject
- Extreme Gradient Boosting; Gradient Tree Boosting; Web page classification; Web scraping
- Coverage
- Dutta J., CHRIST (Deemed to Be University), Centre for Digital Innovation, Bangalore, India; Kim Y.W., CHRIST (Deemed to Be University), Centre for Digital Innovation, Bangalore, India; Dominic D., CHRIST (Deemed to Be University), Centre for Digital Innovation, Bangalore, India
- Rights
- Restricted Access
- Relation
- ISBN: 978-172818818-8
- Format
- Online
- Language
- English
- Type
- Conference paper
Citation
Dutta J.; Kim Y.W.; Dominic D., “Comparison of Gradient Boosting and Extreme Boosting Ensemble Methods for Webpage Classification,” CHRIST (Deemed To Be University) Institutional Repository, accessed February 25, 2025, https://archives.christuniversity.in/items/show/20663.