Privacy Risk Prediction from Social Media Metadata using Feature Selection Approaches
- Title
- Privacy Risk Prediction from Social Media Metadata using Feature Selection Approaches
- Creator
- Pranav, R.; Rajoriya, Meenakshi Malviya; Rohini, V.
- Description
- Millions of new users join online social networks (OSNs) every year, which contributes to the growing spread of Personally Identifiable Information. This sharing often happens unconsciously, either because the stakes seem low or because the user does not understand, or underestimates, what can go wrong. This trend indicates the need for a trustworthy means of quantifying the privacy risk of sharing information online. The volume of OSN data is too large for meaningful manual review, given the time and man-hours this would entail. This research presents a two-step, unsupervised, and efficient method to estimate privacy risk at the post level. The first step uses the most advanced reasoning-based Large Language Model, Gemini 2.5 Pro, to generate a comprehensive 'vulnerability score', which serves as the reference for model training. The second step compares the two most used machine learning feature selection techniques, Recursive Feature Elimination (RFE) and Correlation-Based Selection, to select the best features for predicting this score from metadata alone. The results indicate that Correlation-Based Selection produces better results for both the regression and classification models, and the top-performing regression model achieves an R-squared of 0.86. The study thus presents a practical and scalable method for identifying privacy-sensitive content in large datasets. © 2025 IEEE.
- Source
- International Conference on NexGen Networks and Cybernetics, IC2NC 2025 - Proceedings;pp.61-66
- Date
- 01-01-2025
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Subject
- Feature Selection; Large Language Models; Online Social Networks; Privacy Risk; Privacy-Sensitive Content
- Coverage
- Pranav R., Christ University, Department of Computer Science, Bengaluru, India; Rajoriya M.M., Christ University, Department of Computer Science, Bengaluru, India; Rohini V., Christ University, Department of Computer Science, Bengaluru, India
- Rights
- Restricted Access; Hardcopy may be available in the library
- Relation
- ISBN: 979-833159484-8;
- Format
- online
- Language
- English
- Type
- Conference paper
Citation
Pranav, R.; Rajoriya, Meenakshi Malviya; Rohini, V., “Privacy Risk Prediction from Social Media Metadata using Feature Selection Approaches,” CHRIST (Deemed To Be University) Institutional Repository, accessed June 18, 2026, https://archives.christuniversity.in/items/show/25863.
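
The abstract's second step, ranking metadata features by how strongly they correlate with the reference score, can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names (`has_location`, `mention_ratio`, `hour_posted`, `post_length`), the synthetic data, and the helper `select_by_correlation` are all hypothetical, and the generated `score` is only a stand-in for the LLM-produced vulnerability score. Recursive Feature Elimination is not shown, since it additionally requires a fitted model to rank features.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_by_correlation(features, target, k):
    """Return the k feature names with the largest |correlation| to target."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Synthetic, hypothetical post metadata (NOT the paper's dataset).
rng = random.Random(0)
n_posts = 200
features = {
    "has_location": [float(rng.random() < 0.5) for _ in range(n_posts)],
    "mention_ratio": [rng.random() for _ in range(n_posts)],
    "hour_posted": [float(rng.randrange(24)) for _ in range(n_posts)],
    "post_length": [float(rng.randrange(1, 281)) for _ in range(n_posts)],
}

# Stand-in for the reference vulnerability score: by construction it depends
# only on the first two features plus a little Gaussian noise.
score = [
    2.0 * loc + 3.0 * men + rng.gauss(0.0, 0.2)
    for loc, men in zip(features["has_location"], features["mention_ratio"])
]

selected = select_by_correlation(features, score, k=2)
print(selected)
```

Because the synthetic score is built from `has_location` and `mention_ratio` alone, those two columns should be the ones retained; on real metadata the ranking would depend entirely on the data and on the chosen cutoff `k`.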
