This paper describes a link-based technique for automating the detection of Web spam, that is, pages that use deceptive techniques to obtain an undeservedly high score in search engines. The problem of Web spam is widespread and difficult to solve, mostly because the large size of the Web makes many algorithms infeasible in practice. We propose spam detection techniques that consider only the link structure of the Web, regardless of page contents. In particular, we compute statistics of the links in the vicinity of every Web page by applying rank propagation and probabilistic counting over the Web graph. These statistical features are used to build a classifier with the C5.0 classification algorithm, which is tested over a large collection of Web link spam.
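As a rough illustration of what link-only features can look like, the sketch below computes a few per-page statistics on a toy graph. It is not the paper's implementation: it uses plain power-iteration PageRank as a stand-in for the rank-propagation features, and an exact breadth-first count of supporters (pages that reach a target within a few links) where the abstract mentions probabilistic counting, which would approximate the same quantity at Web scale. The function names, the toy graph, and the parameter values are all hypothetical, and the classification step is omitted.

```python
from collections import deque


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict {page: [pages it links to]}."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in graph[v]:
                new_rank[w] += damping * rank[v] / len(graph[v])
        rank = new_rank
    return rank


def supporters_within(graph, target, max_dist):
    """Exact count of pages that reach `target` in at most `max_dist` links.

    Assumption: this exact BFS replaces the probabilistic counting named
    in the abstract; it is only practical on small graphs.
    """
    reverse = {v: [] for v in graph}
    for v, outs in graph.items():
        for w in outs:
            reverse[w].append(v)
    seen = {target}
    queue = deque([(target, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_dist:
            continue
        for pred in reverse[node]:
            if pred not in seen:
                seen.add(pred)
                queue.append((pred, dist + 1))
    return len(seen) - 1  # exclude the target itself


def link_features(graph, max_dist=2):
    """Collect link-only features for every page (hypothetical feature set)."""
    ranks = pagerank(graph)
    in_degree = {v: 0 for v in graph}
    for outs in graph.values():
        for w in outs:
            in_degree[w] += 1
    return {
        v: {
            "in_degree": in_degree[v],
            "out_degree": len(graph[v]),
            "pagerank": ranks[v],
            "supporters": supporters_within(graph, v, max_dist),
        }
        for v in graph
    }


# Hypothetical toy graph: every page must appear as a key.
toy_graph = {"a": ["hub"], "b": ["hub"], "c": ["a", "b"], "hub": []}

for page, feats in link_features(toy_graph).items():
    print(page, feats)
```

Feature dictionaries of this kind could then be passed to any decision-tree learner as training rows; the choice of features and of the distance limit here is illustrative only.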
Effective Site Finding Using Qualified Link Information
Journal | International Conference on Advances in Information Technology and Mobile … |
---|---|
Publisher | Springer |
ISBN | 978-3-642-35863-0 |
Open Access | No |