YANG Wang, JIANG Yong-han, ZHANG San-feng. A Web Spam Link Detection Method Based on Web Page Structure and Text Features[J]. Journal of Northeastern University Natural Science, 2020, 41(8): 1091-1096.