Is there a HTACCESS string that stops duplicate pages?
-
simonmlewis
- DevNet Master
- Posts: 4435
- Joined: Wed Oct 08, 2008 3:39 pm
- Location: United Kingdom
- Contact:
We have URLs that load the category pages whether there is a / on the end or not.
We are told this is bad for SEO, as it is seen as a duplicate page.
This is our HTACCESS for the categories.
RewriteRule ^categ/([^/]+) /index.php?page=categ&cname=$1 [QSA]
Is there something simple I can add to this to only allow URLs with a / on the end?
If I add a /, then the page loads but doesn't load up the information from the URL.
Just wondering if the rewriterule should be different to stop those with a / being the same as those without.
Love PHP. Love CSS. Love learning new tricks too.
All the best from the United Kingdom.
Re: Is there a HTACCESS string that stops duplicate pages?
simonmlewis wrote:We have URLs that load the category pages whether there is a / on the end or not.
We are told this is bad for SEO, as it is seen as a duplicate page.
Flip side of that is that /category/t-shirts and /category/t-shirts/ going to different pages is bad for users. You could redirect one to the other, couldn't you?
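A minimal sketch of the redirect idea, not something posted in the thread. It assumes the .htaccess sits in the document root and that the trailing-slash form is chosen as the canonical URL. The pattern reuses the categ rule from the first post; the extra redirect rule and the [L] flag are additions for illustration.

```apache
RewriteEngine On

# Assumption: the URL ending in / is the canonical one.
# 1) /categ/t-shirts is sent to /categ/t-shirts/ with a permanent redirect.
#    The query string, if any, is carried over automatically.
RewriteRule ^categ/([^/]+)$ /categ/$1/ [R=301,L]

# 2) Only the slash-terminated form is mapped to the script,
#    so each category has exactly one URL that returns content.
RewriteRule ^categ/([^/]+)/$ /index.php?page=categ&cname=$1 [QSA,L]
```

With this in place a crawler that follows a link to the non-slash URL is told where the page lives, rather than seeing the same content twice.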
-
simonmlewis
Re: Is there a HTACCESS string that stops duplicate pages?
Both pages are the same in the content and layout. It's just SEO wise.
But if someone posts a link to /category/t-shirts/, and it loads the same page as the one without the slash, Google will cache both pages... which is bad as it's a duplicate.
So I don't know if there is a global overriding way in our HTACCESS of controlling it.
Re: Is there a HTACCESS string that stops duplicate pages?
Celauran wrote:You could redirect one to the other, couldn't you?
-
simonmlewis
Re: Is there a HTACCESS string that stops duplicate pages?
http://stackoverflow.com/questions/1708 ... y-htaccess
The other issue we have is with double slashes.
If the URL has // after uk, and then just a single / through the rest, it loads and doesn't remove the bad //s.
But if I put it in with // at the start after uk, and // through the rest of the URL, it does rewrite it.
Code: Select all
#remove double/more slashes in url
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{REQUEST_URI} ^(.*?)(/{2,})(.*)$
RewriteRule . %1/%3 [R=301,L]
So it's not quite right.
Re: Is there a HTACCESS string that stops duplicate pages?
. matches any character, including /. You probably want to modify that first rule to exclude /
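Another way to approach the double-slash problem is sketched below. It is an alternative to the rule under discussion, not a correction endorsed by anyone in the thread. It rests on an assumption worth testing on your own server: in .htaccess (per-directory) context, the path that RewriteRule matches has usually already had repeated slashes collapsed, while %{THE_REQUEST} still holds the raw request line.

```apache
RewriteEngine On

# Leave POST requests alone, as in the rule posted above.
RewriteCond %{REQUEST_METHOD} !=POST
# Look for // in the path part of the raw request line (before any ?).
RewriteCond %{THE_REQUEST} ^[A-Z]+\s[^?\s]*//
# Assumption: $1 here is the path with slashes already collapsed,
# so redirecting to /$1 yields the cleaned URL in a single hop.
RewriteRule ^(.*)$ /$1 [R=301,L]
```

If the server does not collapse slashes before matching, this would redirect to the same URL repeatedly, so check it with a request such as //categ//t-shirts/ before relying on it.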
-
simonmlewis
Re: Is there a HTACCESS string that stops duplicate pages?
Code: Select all
RewriteEngine On
RewriteCond %{THE_REQUEST} \s/+(.*?)/+(/\S*) [NC]
RewriteRule ^ %1%2 [R=302,L,NE]
RewriteCond %{REQUEST_URI} ^(.*)/{2,}(.*)$
RewriteRule . %1/%2 [R=301,L]
If I use only the second one, then the internal pages rewrite, but the first one doesn't.
-
simonmlewis
Re: Is there a HTACCESS string that stops duplicate pages?
Code: Select all
# Remove multiple slashes after domain
RewriteCond %{HTTP_HOST} !=""
RewriteCond %{THE_REQUEST} ^[A-Z]+\s//+(.*)\sHTTP/[0-9.]+$ [OR]
RewriteCond %{THE_REQUEST} ^[A-Z]+\s(.*/)/+\sHTTP/[0-9.]+$
RewriteRule .* http://%{HTTP_HOST}/%1 [R=301,L]
# Remove multiple slashes anywhere in URL
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]
Is the answer to code it so it works ONLY with a / on the end, and, if Google has cached pages without the slash, add manual 301s? Or is there a htaccess rule that could spot them for each of our rules?
-
simonmlewis
Re: Is there a HTACCESS string that stops duplicate pages?
We also have an issue where if at the end of a product url, you enter /hello, the page still loads. This is the product htaccess:
RewriteRule ^product/([^/]+)/([^/]+)/([0-9]+)/([^/]+) /index.php?page=product&cname=$1&sname=$2&product=$3&h=$4 [L]
Re: Is there a HTACCESS string that stops duplicate pages?
simonmlewis wrote:We also have an issue where if at the end of a product url, you enter /hello, the page still loads. This is the product htaccess:
RewriteRule ^product/([^/]+)/([^/]+)/([0-9]+)/([^/]+) /index.php?page=product&cname=$1&sname=$2&product=$3&h=$4 [L]
Toss a $ on the end of the rule, causing the extra /hello to not match the rule and fall through to other rules.
-
simonmlewis
Re: Is there a HTACCESS string that stops duplicate pages?
Where exactly do you mean?
If I do it like this:
Code: Select all
RewriteRule ^product/([^/]+)/([^/]+)/([0-9]+)/([^/]+)$ /index.php?page=product&cname=$1&sname=$2&product=$3&h=$4 [L]
Or like this:
Code: Select all
RewriteRule ^product/([^/]+)/([^/]+)/([0-9]+)/([^/]+)/$ /index.php?page=product&cname=$1&sname=$2&product=$3&h=$4 [L]
And put in a word after the final /, it still loads the page.
If I put it like this right at the end:
Code: Select all
RewriteRule ^categ/([^/]+)/([0-9]+) /index.php?page=categ&cname=$1&pagenum=$2 [QSA] $
It still all loads.
Any ideas?
Re: Is there a HTACCESS string that stops duplicate pages?
simonmlewis wrote:Where exactly do you mean?
If I do it like this:
Code: Select all
RewriteRule ^product/([^/]+)/([^/]+)/([0-9]+)/([^/]+)$ /index.php?page=product&cname=$1&sname=$2&product=$3&h=$4 [L]
Or like this:
Code: Select all
RewriteRule ^product/([^/]+)/([^/]+)/([0-9]+)/([^/]+)/$ /index.php?page=product&cname=$1&sname=$2&product=$3&h=$4 [L]
And put in a word after the final /, it still loads the page.
Must be another rule catching it, then.
.htaccess
[text]RewriteEngine on
RewriteRule ^product/([^/]+)/([^/]+)/([0-9]+)/([^/]+)$ /index.php?page=product&cname=$1&sname=$2&product=$3&h=$4 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^ index.php [QSA,L][/text]
index.php
Code: Select all
<?php var_dump($_GET);
URI: /product/foo/bar/123/baz
[text]array (size=5)
'page' => string 'product' (length=7)
'cname' => string 'foo' (length=3)
'sname' => string 'bar' (length=3)
'product' => string '123' (length=3)
'h' => string 'baz' (length=3)[/text]
URI: /product/foo/bar/123/baz/hello
[text]array (size=0)
empty
[/text]
-
simonmlewis
Re: Is there a HTACCESS string that stops duplicate pages?
This is our current live HTACCESS
There is a lot at the top to do with the double slashes.
Look in the NEW URL section.
Code: Select all
DirectoryIndex index.php index.html index.htm
order allow,deny
allow from all
Options +FollowSymLinks
Options +Indexes
RewriteEngine On
# Remove multiple slashes after domain
RewriteCond %{HTTP_HOST} !=""
RewriteCond %{THE_REQUEST} ^[A-Z]+\s//+(.*)\sHTTP/[0-9.]+$ [OR]
RewriteCond %{THE_REQUEST} ^[A-Z]+\s(.*/)/+\sHTTP/[0-9.]+$
RewriteRule .* http://%{HTTP_HOST}/%1 [R=301,L]
# Remove multiple slashes anywhere in URL
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*[^/])$ /$1/ [L,R=301]
RewriteRule ^(blog)($|/) - [QSA]
# old url rewrite
RewriteRule ^categ/([0-9]+)/([^/]+) /index.php?page=categ&c=$1&cname=$2 [QSA]
RewriteRule ^categ/page/([0-9]+)/([^/]+)/([0-9]+) /index.php?page=categ&c=$1&cname=$2&pagenum=$3 [QSA]
RewriteRule ^subcateg/([0-9]+)/([^/]+)/([0-9]+)/([^/]+) /index.php?page=subcateg&c=$1&cname=$2&s=$3&sname=$4&menu=sub [QSA]
RewriteRule ^subcateg/page/([0-9]+)/([^/]+)/([0-9]+)/([^/]+)/([0-9]+) /index.php?page=subcateg&c=$1&cname=$2&s=$3&sname=$4&pagenum=$5 [QSA]
RewriteRule ^product/([0-9]+)/([^/]+)/([0-9]+)/([^/]+)/([0-9]+)/([^/]+) /index.php?page=product&c=$1&cname=$2&s=$3&sname=$4&product=$5&h=$6 [QSA]
# end of old url rewrite
# NEW URLS
RewriteRule ^categ/([^/]+)/([0-9]+) /index.php?page=categ&cname=$1&pagenum=$2 [QSA]
RewriteRule ^categ/([^/]+) /index.php?page=categ&cname=$1 [QSA]
RewriteRule ^subcateg/([^/]+)/([^/]+)/([0-9]+) /index.php?page=subcateg&cname=$1&sname=$2&pagenum=$3 [QSA]
RewriteRule ^subcateg/([^/]+)/([^/]+) /index.php?page=subcateg&cname=$1&sname=$2 [QSA]
RewriteRule ^product/([^/]+)/([^/]+)/([0-9]+)/([^/]+) /index.php?page=product&cname=$1&sname=$2&product=$3&h=$4 [L]
# END OF NEW URLS
RewriteRule ^knowledge/([0-9]+) /index.php?page=knowledge&id=$1 [QSA]
RewriteRule ^knowledge/answer/([0-9]+)/([0-9]+) /index.php?page=knowledge&id=$1&id_link=$2 [QSA]
RewriteRule ^pricedrop/page/([0-9]+)/ /index.php?page=pricedrop&pagenum=$1 [QSA]
RewriteRule ^productsnew/page/([0-9]+)/ /index.php?page=productsnew&pagenum=$1 [QSA]
RewriteRule ^productzoom/([0-9]+)/([^/]+)/([0-9]+)/([^/]+)/([0-9]+)/([^/]+) /index.php?page=productzoom&c=$1&cname=$2&s=$3&sname=$4&product=$5&h=$6 [QSA]
RewriteRule ^loadout/([0-9]+)/([^/]+)/([0-9]+)/([^/]+)/([0-9]+)/([^/]+) /index.php?page=loadout&c=$1&cname=$2&s=$3&sname=$4&product=$5&h=$6 [QSA]
RewriteRule ^pricematch/([0-9]+) /index.php?page=pricematch&id=$1 [QSA]
RewriteRule ^type/([^/]+) /index.php?page=type&type=$1 [QSA]
RewriteRule ^use/([^/]+) /index.php?page=use&use=$1 [QSA]
RewriteRule ^back-in-stock/page/([0-9]+)/([^/]+) /index.php?page=back-in-stock&pagenum=$1&power=$2 [L,QSA]
RewriteRule ^productsall/([0-9]+)/ /index.php?page=productsall/ [QSA]
RewriteRule ^productsall/page/([0-9]+)/ /index.php?page=productsall&pagenum=$1/ [QSA]
RewriteRule ^manufacturers/([^/]+) /index.php?page=manufacturers&manufacturer=$1 [QSA]
RewriteRule ^accessories-manufacturers/([^/]+) /index.php?page=accessories-manufacturers&manufacturer=$1 [QSA]
RewriteRule ^product-tags/page/([0-9]+)/([^/]+) /index.php?page=product-tags&pagenum=$1&producttag=$2 [QSA]
RewriteRule ^product-tags/([^/]+) /index.php?page=product-tags&producttag=$1 [QSA]
RewriteRule ^products-wrapped/page/([0-9]+)/ /index.php?page=products-wrapped&pagenum=$1/ [QSA]
RewriteRule ^videos/([0-9]+) /index.php?page=videos&catid=$1 [QSA]
RewriteRule ^videos/product/([0-9]+)/([0-9]+) /index.php?page=videos&catid=$1&id=$2 [QSA]
RewriteRule ^videos/product-search/([0-9]+)/([0-9]+)/([^/]+)/([^/]+) /index.php?page=videos&catid=$1&id=$2&search=$3&searchvideo=$4 [QSA]
RewriteRule ^([^/\.]+)/?$ index.php?page=$1 [QSA]
RewriteRule ^$ index.php?page=home [QSA]
RewriteRule ^robots.txt$ robots.php [QSA]
Re: Is there a HTACCESS string that stops duplicate pages?
Adding /$ to the product rewrite rule results in a 404 if I add /hello to the URI. Shouldn't it?
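For reference, here is a sketch of what anchoring with /$ could look like across the whole NEW URLS block of the live file. It assumes the trailing-slash redirect near the top of that file stays in place, so every request reaches these rules with a final /. The [L] flags on the categ and subcateg rules are an addition for illustration; the originals only carry [QSA].

```apache
# NEW URLS, anchored so that extra path segments such as /hello no longer match.
# Assumption: the earlier redirect has already added the trailing slash.
RewriteRule ^categ/([^/]+)/([0-9]+)/$ /index.php?page=categ&cname=$1&pagenum=$2 [QSA,L]
RewriteRule ^categ/([^/]+)/$ /index.php?page=categ&cname=$1 [QSA,L]
RewriteRule ^subcateg/([^/]+)/([^/]+)/([0-9]+)/$ /index.php?page=subcateg&cname=$1&sname=$2&pagenum=$3 [QSA,L]
RewriteRule ^subcateg/([^/]+)/([^/]+)/$ /index.php?page=subcateg&cname=$1&sname=$2 [QSA,L]
RewriteRule ^product/([^/]+)/([^/]+)/([0-9]+)/([^/]+)/$ /index.php?page=product&cname=$1&sname=$2&product=$3&h=$4 [L]
# END OF NEW URLS
```

The unanchored old-URL rules that sit above this block can still catch longer paths, so they would likely need the same treatment for the 404 to show up consistently.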
Re: Is there a HTACCESS string that stops duplicate pages?
Also, this is a really long and convoluted .htaccess. You should really consider implementing some routing in PHP.
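If routing does move into PHP, the .htaccess could shrink to a front-controller fragment much like the one used in the test earlier in the thread. This is a sketch only: it assumes index.php reads the request path itself and decides what to serve, and the -d check is an addition that was not in the posted test file.

```apache
RewriteEngine On

# Serve real files and directories (images, CSS, /blog, ...) directly.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Hand everything else to index.php, which does the routing.
RewriteRule ^ index.php [QSA,L]
```

The trade-off is that the trailing-slash and unknown-segment checks then live in one place in PHP, where they are easier to test than a long list of rewrite rules.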