Scrapers stealing your content for SEO

December 15th, 2005


Content is king on the web. A site without content is doomed to lousy search engine rankings. Search engine spammers can’t be bothered writing good content. Especially when they can easily steal it from other web sites. How do they do it? They use “scrapers” — spiders that trawl web pages and/or RSS feeds and siphon off the content. They then stick your content on their own site and slap their own ads and affiliate links onto it.

The spammers especially want you to use relative links across your web site. A relative link resolves against whatever domain happens to serve the page, so they can lift your entire website without even having to rejig your internal links to make them point to their copy. Granted, as far as bandwidth conservation goes, relative links are slightly better than absolute links (sometimes loosely called “hard links”) because they are shorter. But let’s not make the spammer’s job any easier.

So use absolute links throughout your site.
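
If your pages already contain relative links, one way to convert them is a small script run at build or publish time. The following is only a minimal sketch in Python: the page address, the function name absolutize_links and the regular expression are illustrative assumptions, and it only handles double-quoted href and src attributes, so a real site would probably want a proper HTML parser.

```python
import re
from urllib.parse import urljoin

# Hypothetical canonical address of the page being processed (an assumption).
PAGE_URL = "https://www.example.com/blog/scrapers.html"

# Matches href="..." or src="..." with double-quoted values only.
LINK_ATTR = re.compile(r'\b(href|src)="([^"]*)"')


def absolutize_links(html, page_url=PAGE_URL):
    """Rewrite double-quoted href/src values as absolute URLs."""
    def fix(match):
        attr, value = match.group(1), match.group(2)
        # urljoin resolves relative paths against page_url and leaves
        # values that are already absolute unchanged.
        return '{}="{}"'.format(attr, urljoin(page_url, value))
    return LINK_ATTR.sub(fix, html)


sample = '<a href="../about.html">About</a> <img src="/img/logo.png">'
print(absolutize_links(sample))
```

After this rewrite, a scraped copy of the page still links back to your own domain, so the scraper has to do extra work to sever those links.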

As a side benefit, if your site responds to multiple domains and you use absolute links, you’ll also be helping the search engines reduce the potential for duplicate content by definitively identifying the full, canonical URL.
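
To make that concrete, here is a rough sketch of picking one canonical form for a link when the same site answers on several domains. The host names, the choice of scheme and the function name canonical_url are all assumptions made for illustration, and the sketch deliberately ignores ports and login details in the URL.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values: the one address you want search engines to index,
# and every domain the same site answers on.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.example.com"
ALIAS_HOSTS = {"example.com", "www.example.com", "example.net"}


def canonical_url(url):
    """Map a URL on any alias domain to the single canonical domain."""
    parts = urlsplit(url)
    if parts.hostname not in ALIAS_HOSTS:
        # Links to other sites are left untouched.
        return url
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST,
                       parts.path or "/", parts.query, parts.fragment))


print(canonical_url("http://example.net/post?id=3"))
```

Using the output of a function like this for every internal link means each page is only ever referred to by one full URL, whichever domain a visitor arrived on.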

Also, to check if your site has been scraped, use Copyscape.