
Spam Free Social Media User Feed Generation Using ML Technique

Document type: Conference paper

First author: Anubha Sharma

Authors: Anubha Sharma 1; Manoj Ramaiya 1

Author affiliations: 1. Institute of Advanced Computing, SAGE University, Indore, India

Keywords: Support vector machines; Social networking (online); Filtering; Reviews; Blogs; Text categorization; Artificial neural networks

Conference: International Conference on Ubiquitous Computing and Intelligent Information Systems

Organizer:

Pages: 260-269

Abstract: This paper discusses two types of spam filtering techniques. (1) The first is based on Machine Learning (ML). The model is built on three text feature selection techniques, namely Part of Speech (PoS), Information Gain (IG), and Term Frequency-Inverse Document Frequency (TF-IDF). Combinations of these features were also used with Support Vector Machine (SVM), Artificial Neural Network (ANN), and Naive Bayes classifiers. Two sets of experiments were then carried out on the Twitter spam (UtkML's) dataset. The results show that hybrid features give highly accurate results. The hybrid of PoS and TF-IDF features with SVM provides 94.32% accuracy, and the hybrid of PoS and IG features with SVM provides 94.97% accuracy. Next, both hybrid feature sets were used with the SVM, ANN, and Naive Bayes classifiers. ANN with the PoS and IG hybrid features provides the highest accuracy, 96.21%. (2) The second is a novel approach that deals with spam using a content recommendation model. The proposed model considers topic detection and user action-based topics to deliver personalized content to the user, so that content recommendation solves the spam filtering problem. The model uses three personalization variables: accounts followed, published content, and the user's responses. These variables define a user's behavior and identify topics for accurate content delivery. Finally, experiments were carried out in two groups. First, the Twitter Sentiment dataset and the Amazon product review dataset were used for sentiment-based classification. ANN with the PoS and IG hybrid features offers 98.4% accuracy on the Twitter dataset and 97.8% accuracy on the Amazon dataset for sentiment-based text classification. Second, for topic detection, Fuzzy C-means (FCM) performs better than k-means: FCM provides 76.3% accuracy on the UtkML's dataset and 74.7% accuracy on the Hate Speech dataset.
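
As an illustration of the TF-IDF weighting named in the abstract, the following is a minimal sketch in plain Python. The abstract does not state which TF-IDF variant the authors used, so the formula here (relative term frequency multiplied by ln(N / df)) is an assumption, and the function name `tf_idf` and the toy messages are hypothetical, not taken from the paper or its datasets.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (count / doc_length) * ln(n_docs / doc_freq).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy messages, not drawn from any dataset in the abstract.
messages = [
    "win free prize now",
    "free prize click now",
    "meeting notes for monday",
]
docs = [m.split() for m in messages]
weights = tf_idf(docs)
```

Under this variant, a term that occurs in fewer messages (such as "win") receives a larger weight than one shared across messages (such as "free"). Weight vectors of this kind could then be passed to a classifier such as an SVM, which is the general pipeline the abstract describes.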

Classification number: tp393-53
