How does Google Search handle duplicate content?
How does Google handle duplicate content, and what negative effects can it have on rankings from an SEO perspective? – Gary Taylor, Stratford-upon-Avon, UK.
It’s important to realize that if we look at content on the web, around 25 or 30 percent of it is duplicate content; there are millions of pages about Linux, for instance. Duplicate content will definitely happen, for example when people quote a paragraph from a blog and then link back to it, that sort of thing. So it’s not the case that every instance of duplicate content is spam. If Google made that assumption, the resulting changes would end up hurting search quality rather than helping it. What actually happens is that Google looks for duplicate content and groups it together, as though it were a single piece of content.
Suppose we’re trying to return a set of search results and we have two pages that are identical. Typically, rather than showing both duplicates, Google will show just one of the pages and filter out the other result. If you get to the bottom of the search results and you really want to extend them, you can change the filtering to show every single page, and you will then see the duplicate page as well.
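As a rough illustration (this is not Google’s actual pipeline, just a minimal sketch), grouping exact near-duplicates can be done by clustering pages on a fingerprint of their normalized text, then picking one page per cluster to show:

```python
import hashlib
import re
from collections import defaultdict

def fingerprint(html_text: str) -> str:
    """Crude content fingerprint: strip markup, normalize whitespace and case, hash."""
    text = re.sub(r"<[^>]+>", " ", html_text)          # drop tags
    text = re.sub(r"\s+", " ", text).strip().lower()   # collapse whitespace, lowercase
    return hashlib.sha256(text.encode()).hexdigest()

def cluster_duplicates(pages: dict) -> dict:
    """Group URLs whose normalized content produces the same fingerprint."""
    clusters = defaultdict(list)
    for url, body in pages.items():
        clusters[fingerprint(body)].append(url)
    return dict(clusters)

# Hypothetical example pages; the first two differ only in markup and spacing.
pages = {
    "https://example.com/linux-guide": "<p>Install Linux in ten steps.</p>",
    "https://mirror.example.org/guide": "<P>Install   Linux in ten steps.</P>",
    "https://example.com/other": "<p>Something else entirely.</p>",
}

for urls in cluster_duplicates(pages).values():
    # A real engine would then choose one canonical URL per cluster
    # to display, filtering the rest out of the results.
    print(urls)
```

Real systems use far more robust techniques (shingling, SimHash) to catch partial overlap, but the idea of collapsing a cluster down to one shown result is the same.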
Duplicate content is usually not treated as spam; it is treated as something Google needs to cluster appropriately and rank correctly. Duplicate content does still happen, though. That said, if a page contains nothing but duplicate content, assembled in an abusive, deceptive, malicious, or manipulative way, Google reserves the right to take action on it as spam.
Someone on Twitter was asking: How can I run an RSS autoblog on a blog site and not have it treated as spam?
The problem is that if a page is auto-generating content that comes from nothing but an RSS feed, it is not adding any value, and it will most likely be viewed as spam. But if you’re just running a regular website and you’re worried because the same content appears on your .com and your .co.uk site, or because you have two versions of your Terms and Conditions, that sort of duplication happens all the time on the web, and it is fine. As long as the page is not trying to copy content for every city and every state in the entire country, for the most part the page should be fine, and there is no need to worry about it.