| Date | Place | Author | Keyword(s) |
| 2005 | AIRWeb | Gyongyi & Garcia-Molina | web spam, spam farm |
Summary
The paper formalizes and categorizes web spam: techniques designed to circumvent the proper operation of search engine ranking algorithms.
Boosting
Boosting describes the process of improving the relevance or importance of a page (or set of pages) without actually improving the quality of the content.
Term Spamming
Location-based
- Body Spam
- Title Spam
- Meta Tag Spam
- Anchor Text Spam
- URL Spam
Content-based
- Repetition
- Dumping
- Weaving
- Phrase Stitching
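To illustrate why repetition can pay off, here is a minimal sketch that assumes relevance is scored by plain term frequency (occurrences of a term divided by the number of words). The function name and the sample strings are hypothetical and not taken from the paper; real ranking functions are considerably more involved.

```python
from collections import Counter


def term_frequency(text: str, term: str) -> float:
    """Fraction of the whitespace-separated words in `text` equal to `term`."""
    words = text.lower().split()
    if not words:
        return 0.0
    return Counter(words)[term.lower()] / len(words)


# Hypothetical page body, and the same body with one term repeated.
honest = "cheap flights to rome and paris this summer"
spammed = honest + " cheap cheap cheap cheap"

tf_honest = term_frequency(honest, "cheap")
tf_spammed = term_frequency(spammed, "cheap")

# Repetition raises the score for the repeated term without adding content.
print(tf_honest, tf_spammed)
```

Under this simplified scoring, the repeated term accounts for a larger share of the page, so the page looks more relevant for that term even though nothing useful was added.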
Link Spamming
Link spamming manipulates incoming and outgoing links to inflate a page's relevance or importance. From the spammer's point of view, pages fall into the following groups:
- inaccessible pages are pages that the spammer has no control over.
- accessible pages are pages that the spammer does not own but can exert some control over. There are m such pages.
- own pages are owned by the spammer. A group of owned pages is a spam farm. There are n owned pages.
- t represents the target page that the spammer would like to boost.
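The effect of a spam farm on t can be sketched with a small simulation. This sketch assumes importance is measured with a PageRank-style score computed by power iteration; the page names ("a", "b", "blog", "farm0", ...), the helper `build_web`, and the farm layout (every owned page links only to t) are hypothetical choices for illustration, not the construction from the paper.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict mapping page -> list of outlinks.

    Pages without outlinks spread their score uniformly over all pages.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    total = len(nodes)
    rank = {p: 1.0 / total for p in nodes}
    for _ in range(iterations):
        dangling = sum(rank[p] for p in nodes if not links.get(p))
        new = {p: (1 - damping) / total + damping * dangling / total
               for p in nodes}
        for p in nodes:
            outs = links.get(p, [])
            for q in outs:
                new[q] += damping * rank[p] / len(outs)
        rank = new
    return rank


def build_web(n_farm):
    """Tiny hypothetical web: two inaccessible pages, one accessible page,
    the target t, and n_farm owned pages that all link to t."""
    links = {
        "a": ["b"],           # inaccessible
        "b": ["a"],           # inaccessible
        "blog": ["a", "t"],   # accessible: spammer managed to add a link to t
        "t": [],              # target page
    }
    for i in range(n_farm):
        links[f"farm{i}"] = ["t"]  # owned pages forming the spam farm
    return links


baseline = pagerank(build_web(0))
boosted = pagerank(build_web(5))

# The target's score rises once the owned pages point at it.
print(baseline["t"], boosted["t"])
```

In this toy graph the target's score increases when n = 5 owned pages are added, although the content of t is unchanged. The exact numbers depend on the damping factor and on how dangling pages are handled, both of which are assumptions here.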
Outgoing Links
Example: directory cloning
Incoming Links
- Honey pots
- Infiltration
- Social network spamming
- Link exchanges
- Recovering expired domains
- Creating your own spam farm
Hiding
- Content Hiding
- Cloaking
- Redirection
Contribution
- Provides a structured framework and lexicon to facilitate the discussion on web spamming and countermeasure techniques.
Comment
- Nice, easy-to-read paper that gives a high-level overview of web spam.
Reference
-- MarcCartright - 15 Nov 2007