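1. School of Computer Science, Fudan University, Shanghai 200438
2. Laboratory for Fundamental Theory and Key Technologies of Intelligent Complex Systems, Fudan University, Shanghai 200438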
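TANG Jiawei (1997- ), male, research assistant at the School of Computer Science, Fudan University. His main research interests are social networks, machine learning, and big data analysis of user behavior.
LIU Yushan (1999- ), female, master's student at the School of Computer Science, Fudan University. Her main research interest is user data mining in online social networks.
GAO Min (1997- ), female, PhD student at the School of Computer Science, Fudan University. Her main research interests include user behavior modeling in social networks, graph deep learning, and network science.
GONG Qingyuan (1991- ), female, PhD, young associate researcher at the Laboratory for Fundamental Theory and Key Technologies of Intelligent Complex Systems, Fudan University. Her main research interest is big data on user behavior in online social networks.
WANG Xin (1973- ), male, PhD, Party secretary, professor, and doctoral supervisor at the School of Computer Science, Fudan University. His main research interests are next-generation Internet architecture, wireless and mobile networks, data center networks, social networks, and applications of network coding.
CHEN Yang (1981- ), male, PhD, professor and doctoral supervisor at the School of Computer Science, Fudan University, and deputy director of the Shanghai Key Laboratory of Intelligent Information Processing. His main research interests include social computing, computer networks, and large-scale user behavior data mining.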
Received: 2024-08-07
Revised: 2024-09-29
Print publication date: 2024-12-15
TANG Jiawei, LIU Yushan, GAO Min, et al. Cerberus: cross-site social bot detection system based on deep learning[J]. Chinese Journal of Intelligent Science and Technology (智能科学与技术学报), 2024, 6(4): 482-494. DOI: 10.11959/j.issn.2096-6652.202436.
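Social networking sites have attracted billions of users and shape the way people live. Because these sites are open platforms with a low barrier to registration, social bots can register easily and carry out harmful activities for profit, such as steering public opinion and spreading false information. Bot detection systems built for a single social networking site usually depend on analyzing a user's historical behavior data, so a social bot has often already carried out its malicious attack by the time it is identified. To identify social bots as early as possible, this paper proposes Cerberus, a cross-site social bot detection system. Cerberus addresses the "cold-start" problem of user identification that arises when a user has too little data on a single social networking site at an early stage. Cerberus uses a user's profile information and historical activity on Medium to predict whether the Twitter account linked by that user is a social bot. The results show that the system's AUC reaches 0.7552, indicating good recognition accuracy.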
Social networking sites have attracted billions of users and influence people's lifestyles. However, because they are open platforms with low requirements for registering and joining, social bots can easily register and do harmful things for profit, such as manipulating public opinion and spreading inaccurate information. Single-site social bot detection systems often rely on historical behavioral data to identify bots, so detection typically occurs only after the social bots have carried out their attacks. To identify social bots as early as possible, this paper proposes Cerberus, a cross-site social bot detection system. Cerberus addresses the cold-start problem of user identification caused by insufficient user data on a single platform at an early stage. Cerberus uses a user's personal information and historical activity on Medium to predict whether that user's linked Twitter account is a social bot. Experimental results show that the AUC score of Cerberus can reach 0.7522, indicating good recognition accuracy.
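As a reading aid for the AUC figure quoted in the abstract, the following is a minimal sketch of how an AUC value can be computed from a classifier's scores. It is not the paper's evaluation code. The function name `auc_score` and the toy labels and scores are hypothetical and invented for illustration only. AUC is computed here by its pairwise definition: the fraction of (bot, human) pairs in which the bot account receives the higher score, with ties counted as one half.

```python
from itertools import product


def auc_score(labels, scores):
    """Pairwise AUC: probability that a random positive outranks a random negative.

    labels: iterable of 0/1 (1 = bot, 0 = human); scores: predicted bot scores.
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.8]
toy_auc = auc_score(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so a value near 0.75 means that roughly three out of four such pairs are ordered correctly. The quadratic pairwise loop is chosen for clarity; a rank-based formulation gives the same value more efficiently on large datasets.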