Duplicate Content

refers to content that is fully or largely identical across several URLs. For users it is the same page, but search engines may treat each URL as a separate page.

A distinction is made between internal duplicate content (within a site, e.g. through parameters, www/non-www, http/https, filters or print versions) and external duplicate content (the same text across several domains, such as manufacturer descriptions).
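
How internal variants collapse into one page can be sketched in a few lines of Python. This is a minimal illustration, not a description of how any search engine works: the function names `normalize` and `group_duplicates` and the `IGNORED_PARAMS` list are assumptions made for the example, and a real site needs its own list of parameters that do not change the content.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that do not change the page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "print"}


def normalize(url: str) -> str:
    """Reduce a URL to one comparable form (illustrative rules only)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Drop ignored parameters and sort the rest so their order does not matter.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    )
    # Treat "/shoes/" and "/shoes" as the same path, but keep the root "/".
    path = parts.path.rstrip("/") or "/"
    # Force https and discard the fragment.
    return urlunsplit(("https", host, path, urlencode(query), ""))


def group_duplicates(urls):
    """Return {normalized URL: [original URLs]} for groups with more than one member."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Called with `http://www.example.com/shoes/?utm_source=x&color=red` and `https://example.com/shoes?color=red`, `group_duplicates` would report both under `https://example.com/shoes?color=red`. Whether a filter parameter such as `color` really produces different content is a per-site decision, which is why it is kept here.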

As a rule there is no automatic penalty. The real problem is loss of control: Google picks a canonical version itself, signals are split across multiple URLs, and the pages look interchangeable. Common symptoms are duplicate title tags and meta descriptions. Depending on the case, duplicate content is resolved via canonical tags, 301 redirects, noindex or unique content. The robots.txt file is usually unsuitable for this, because a URL that is blocked from crawling cannot show its canonical tag or noindex directive to the crawler.
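
The symptom mentioned above, duplicate title tags and meta descriptions, can be checked with a short script. The following is a rough sketch using only the Python standard library. The names `HeadExtractor` and `find_duplicate_heads` are made up for this example, and it assumes the HTML of each page has already been fetched into a dictionary.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadExtractor(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "description":
                self.description = (attributes.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def find_duplicate_heads(pages):
    """pages maps URL -> HTML; returns {(title, description): [urls]} for repeats."""
    groups = defaultdict(list)
    for url, html in pages.items():
        parser = HeadExtractor()
        parser.feed(html)
        parser.close()
        groups[(parser.title.strip(), parser.description)].append(url)
    return {key: urls for key, urls in groups.items() if len(urls) > 1}
```

Each group returned is only a candidate list. Whether a canonical tag, a 301 redirect, noindex or rewritten content is the right fix still depends on the case.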