How much of my content is duplicated on the internet? Anyone who asks this question improves the SEO quality of their own website and avoids losses in search engine rankings. When search engines index websites, duplicate content is a major hurdle and can lower your website's ranking. This article explains how duplicate content is created and which tips and tricks help you avoid it.
What does duplicate content mean?
This SEO term describes text or other content that appears more than once among the pages search engines can reach.
In plain language: identical content that can be reached via different web addresses is recognized as duplicate content, even if the URLs are only slightly modified versions of the original URL.
This also means that search engines treat the same content as duplicate content when it is spread across several domains.
The latter often happens when, for example, a blog relies too heavily on texts and materials from other websites without marking them as quotations.
Why is duplicate content relevant for SEO?
Search engines strive for the perfect customer journey.
They want to deliver the best results in the shortest possible time.
This follows from their business model: search engines try to provide their users with content that is as unique as possible, and duplicate content runs counter to that. If a crawler such as Googlebot cannot work out the context of identical content, the search engine decides for itself which version to display in its results. As a result, pages can disappear from the results entirely.
Even if both versions remain in the index, valuable ranking potential can be lost. Search engines rank pages they had to choose between themselves well behind SEO-optimized, properly indexed websites, because they cannot be sure that the non-original content really matches the query.
The real SEO damage occurs when a search engine interprets duplicate content as an attempt at manipulation and demotes the website in the search results.
Causes of internal duplicate content
A distinction is made between internal and external duplicate content. Internal duplicate content is content on a single domain that can be accessed via multiple URLs. This almost always happens when an extra path segment, such as a category name, is added to the web address.
In web shops, this arises, for example, from the different click paths that lead to a product.
When the product is opened directly, the URL ends in: .de/product1
When it is selected via a category, the same page is displayed under an extended URL, such as: .de/product1/category1/productpage1
Internal duplicate content is also created in many other ways:
- Printer-friendly pages and PDF downloads generate their own URLs.
- Variations of a product page, links and tags do the same.
- Session IDs appended to a URL create a separate web address for each session.
- Content management systems supplement URL endings with /index.htm/ or similar.
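The causes listed above can be made visible with a small normalization routine that collapses URL variants onto one comparable form. The following is a minimal Python sketch, not a definitive implementation: the parameter names in `SESSION_PARAMS`, the file names in `INDEX_FILES`, the `/print` suffix for printer-friendly pages and the example domain are all assumptions chosen for illustration, and your own site may use different conventions.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical names: adjust these to what your own site actually uses.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}
INDEX_FILES = {"index.htm", "index.html", "index.php"}


def normalize_url(url: str) -> str:
    """Collapse common URL variants onto one comparable form."""
    parts = urlsplit(url)

    # Split the path into segments; this also drops trailing slashes.
    segments = [s for s in parts.path.split("/") if s]
    # Drop an index file that a CMS may have appended.
    if segments and segments[-1].lower() in INDEX_FILES:
        segments.pop()
    # Assumed convention: printer-friendly pages end in /print.
    if segments and segments[-1].lower() == "print":
        segments.pop()
    path = "/" + "/".join(segments)

    # Remove session parameters and sort the rest for a stable order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    )
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def group_duplicates(urls):
    """Map each normalized URL to the raw URLs that collapse onto it."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize_url(url), []).append(url)
    # Keep only groups in which several raw URLs share one normalized form.
    return {key: raw for key, raw in groups.items() if len(raw) > 1}
```

Feeding a crawl or sitemap export into `group_duplicates` shows which addresses probably serve the same page. Click-path variants such as the category URL in the shop example cannot be detected by a generic rule like this, because the path itself differs; those cases need a site-specific mapping.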
Causes of external duplicate content
External duplicate content means the same content appears on different domains. When you cause it yourself, it is almost always an oversight, for example when a company runs its projects on different domains or publishes the same announcement on both sites.
BUT: external duplicate content can also be the fault of third parties! In that case it is caused by plagiarism or content theft.
- Your own content suddenly and unexpectedly appears on other websites.
- The original and the copied content are indexed.
- Search algorithms do not recognize the original.
- Only an active approach helps against external duplicate content.
SEO tips to find and avoid duplicate content
Duplicate content is a nuisance and should be avoided.
Strategic SEO therefore begins with your own website and the options you have for presenting your products and strengths.
Set the best possible stage for what you offer. A strategic overview of your website often also reveals sources of duplicated content.
Find duplicate content:
Tip 1: Google Search Console can be a good way to track down duplicate content. Excluded pages are shown under Coverage in the Index menu. The details show how Google has classified each page and why.
Tip 2: Even without this tool, Google can help. Enter a longer passage of your text as the search query. When the results are displayed, click the blue link in the introductory text (source: Google search).
This also reveals links that may hide duplicate content, which makes it a manual check-up for your content.
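Once the manual check has turned up a suspicious page, the two texts can be compared directly. The following is a minimal Python sketch that assumes word shingles and Jaccard similarity as the measure. This is one common near-duplicate heuristic chosen here for illustration, not a description of how Google detects duplicates. The function names and the shingle size of 5 are assumptions.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Short texts are treated as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a: str, text_b: str, size: int = 5) -> float:
    """Share of shingles common to both texts, between 0.0 and 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    own_text = "Duplicate content is a major hurdle for search engine rankings"
    found_text = "duplicate content is a major hurdle for search engine rankings!"
    print(jaccard_similarity(own_text, found_text))
```

A value close to 1.0 suggests a copy, a value close to 0.0 suggests unrelated texts. Where to draw the line in between is a judgment call that depends on your content.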
Avoid duplicate content – this is how it works:
- When moving a website, make sure the redirects are set up correctly.
- A canonical tag points to the version of the URL that should appear on SERPs. This is useful, for example, in online shops that display the same content on many pages.
- Internal links should always follow a uniform format, for example always with the same prefix, such as http://www., in the URL.
- In many cases, a 301 redirect from the duplicate to the original page helps.
- Google recommends using the correct ccTLDs (country code top-level domains). With webmaster tools, you can also tell Google how to index your domains.
- Especially for shops, individual product descriptions are useful, even for products that are very similar.
- The “noindex, follow” robots meta tag can be used to exclude unavoidable duplicate pages from the Google search index.
- If it is not possible to set a canonical tag, linking to the original source of the content is usually an alternative.
- Definitely recommendable: Get rid of placeholder pages!
- In some cases, a content management system automatically publishes the same content in several places, e.g. on the homepage, in the archive and in the content generator.
A thorough understanding of how your own CMS works helps to avoid this.
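Two of the measures above, the canonical tag and “noindex, follow”, are ordinary HTML elements in a page's head, so their presence can be checked automatically. The following is a minimal sketch using only `html.parser` from Python's standard library. The class name `HeadTagAudit`, the helper `audit_html`, the example URL and the HTML sample are hypothetical and invented for illustration.

```python
from html.parser import HTMLParser


class HeadTagAudit(HTMLParser):
    """Collect the canonical URL and the robots directives of a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = []

    def handle_starttag(self, tag, attrs):
        attr = {name.lower(): (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            # <link rel="canonical" href="..."> names the preferred URL.
            self.canonical = attr.get("href") or None
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            # <meta name="robots" content="noindex, follow"> and similar.
            content = attr.get("content", "")
            self.robots = [d.strip().lower() for d in content.split(",") if d.strip()]


def audit_html(html: str) -> dict:
    """Summarize the duplicate-content signals found in an HTML document."""
    parser = HeadTagAudit()
    parser.feed(html)
    return {
        "canonical": parser.canonical,
        "noindex": "noindex" in parser.robots,
        "robots": parser.robots,
    }


# Invented sample page: a printer-friendly duplicate of a product page.
SAMPLE = """
<html><head>
<link rel="canonical" href="https://www.example.de/product1">
<meta name="robots" content="noindex, follow">
</head><body></body></html>
"""

if __name__ == "__main__":
    print(audit_html(SAMPLE))
```

Run over the pages of a duplicate group, a check like this shows which pages lack a canonical tag or point to different preferred URLs. It only reads the markup and does not tell you how a search engine will ultimately treat the page.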
Important to know: effective SEO goes hand in hand with strategic content marketing. Search engines appreciate content that is presented in a clear structure, where the core (or pillar) of a subject area is supported by logically related sub-topics. This is how logically structured knowledge bases are created.
Duplicate content, meaning identical content that appears more than once on the web, poses a challenge for search engines. They can neither recognize the original nor clearly attribute the content. This harms the ranking and SEO of a domain.
You can use Google to search for duplicate content manually. The Search Console helps to check the cause of incorrect or missing indexing, and webmaster tools let you tell Google how to index your domains. In addition, many duplications can be avoided if you know how they occur unintentionally.
Duplicate content is no reason to panic. Although websites are unique in theory, in practice that uniqueness is often compromised, mainly by external duplicate content.
Avoid blunders that create unnecessary duplicate content, and try to consistently create original and unique content. That way, search engines have little reason to doubt the importance of your content.