"Clean URLs" and Search Engine Optimization
If you type "clean URLs" into Google, you get tons of information about how they supposedly improve search engine ranking. A lot of sites that discuss it even go so far as to say that search engines do not understand anything past the ? in the URL (this, I know, is not true). My question is, what do you guys know about "clean URLs" and their effect on search engine optimization?
For instance, does something like this:
http://www.website.com/page/100/relevant_keywords.html
Fare better in SEO than something like this:
http://www.website.com/page.php?id=100
Will a search engine see that your page is "named" relevant_keywords.html and use those keywords to add to your relevance?
I'm really looking for HARD facts on how and why you should use clean URLs for search engine optimization. I know that clean urls help in the case of usability and human readability, but that's not what I'm looking for. I'm just looking for information relating to search engine optimization. Thanks!
Check out this article (I'm currently in the middle of reading it):
http://www.practicalecommerce.com/artic ... plex-URLs/
The article is from January 2006.
EDIT: Here is a little more useful info:
http://blog.outer-court.com/archive/200 ... 3917011691
"Matt Cutts, Senior Engineer for Google, said at Search Engine Strategies in San Jose this past August that you are safe if the number of variables in your URL is one or two, unless one of those two variables is named id (or something else resembling a session ID), in which case all bets are off."
If that is true, then that eliminates a lot of URLs. What is more common than http://www.website.com/articles.php?id=123?
Last edited by Luke on Tue Feb 27, 2007 3:51 pm, edited 1 time in total.
Having keywords in your URL is an absolute must if you care a lot about SEO. I'm a web developer who's seen the improvement happen to many sites. As for my own sites, I had one based strictly on IDs in the URL (no keywords at all). I changed the URLs to be "clean" with keywords and a few months later the pages ranked much much higher with no new inbound links.
Here's a specific example:
http://www.google.com/search?q=link%3Aw ... tnG=Search
shows no external links to http://www.msversus.org/microsoft-marke ... -base.html
Yet a search for "microsoft market share" at http://www.google.com/search?hl=en&q=mi ... gle+Search
returns the page as the 6th link on the front page with 30+ million results. Of course I use other basic techniques (no trickery), but it basically shows how relevant the URL is.
Let's take another example. I have a wiki (see my sig) which would have URLs by default of /w/index.php?title=xyz. But "index" and "title" would basically be useless keywords before every title. So I use mod_rewrite to have /wiki/xyz.
It's similar to your domain name. If you have xyz.com you are far more likely to appear in searches for "xyz". The same is true of the rest of the URL, just to a lesser extent.
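For anyone who wants to see the idea behind the mod_rewrite setup without an Apache config in front of them, here is a minimal Python sketch of the same mapping. It is not the actual rule from the wiki above; the pattern, the `WIKI_RULE` name and the `rewrite` function are assumptions made up for illustration. It only shows how a clean path like /wiki/xyz can be translated back to /w/index.php?title=xyz internally.

```python
import re
from typing import Optional

# Hypothetical rule mirroring the wiki example:
# /wiki/<title>  ->  /w/index.php?title=<title>
WIKI_RULE = re.compile(r"^/wiki/(?P<title>[^/?#]+)$")


def rewrite(path: str) -> Optional[str]:
    """Return the internal script URL for a clean path, or None if no rule matches."""
    match = WIKI_RULE.match(path)
    if match is None:
        return None
    return "/w/index.php?title=" + match.group("title")


print(rewrite("/wiki/xyz"))  # prints /w/index.php?title=xyz
```

The visitor (and the crawler) only ever sees /wiki/xyz; the "index" and "title" words stay on the server side.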
- RobertGonzalez
- Site Administrator
- Posts: 14293
- Joined: Tue Sep 09, 2003 6:04 pm
- Location: Fremont, CA, USA
- seodevhead
- Forum Regular
- Posts: 705
- Joined: Sat Oct 08, 2005 8:18 pm
- Location: Windermere, FL
Let me be the one to give the true "dish" on static vs. dynamic URLs.
Myth:
Search engines will not index dynamic content with multiple parameters.
As long as you don't have more than three or four parameters in your URL, and don't use 'id' in any of your params (Google might think it is a session ID, which then will not get indexed), Google has no problem indexing or RANKING your pages, compared to a static version.
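To make that rule of thumb concrete, here is a rough Python sketch that checks a URL against it. This is only the heuristic as described in this post, not anything Google has published; the `MAX_PARAMS` limit, the `url_warnings` name and the crude "ends with id" test are all assumptions for illustration (the test will also flag harmless names such as "valid").

```python
from typing import List
from urllib.parse import parse_qsl, urlsplit

MAX_PARAMS = 4  # assumption: the "three or four" limit mentioned above


def url_warnings(url: str, max_params: int = MAX_PARAMS) -> List[str]:
    """List the ways a URL breaks the rule of thumb (empty list means none)."""
    query = urlsplit(url).query
    names = [name.lower() for name, _ in parse_qsl(query, keep_blank_values=True)]
    warnings = []
    if len(names) > max_params:
        warnings.append(f"{len(names)} parameters (more than {max_params})")
    for name in names:
        # Crude check: 'id', 'sid', 'sessionid', 'phpsessid' all end in 'id'.
        if name.endswith("id"):
            warnings.append(f"parameter '{name}' may look like a session ID")
    return warnings


print(url_warnings("http://www.website.com/page.php?id=100"))
print(url_warnings("http://www.example.com/blogs.php?entry=759"))
```

The first URL gets one warning about 'id'; the second comes back clean.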
Truth:
Search engines are more inclined to crawl deeper into static content than dynamically generated content.
This is the big kicker that most people overlook. The problem with dynamically generated URLs (URLs with params, that is) is that Google realizes there may be an unlimited number of values that could send their bots into a never-ending crawl.
In other words, say you have a php application that pulls content based on a parameter called 'entry', which corresponds to say, a blog entry.
Example: http://www.example.com/blogs.php?entry=759
The problem is, Google comes to your site, sees blogs.php?entry=1 -> blogs.php?entry={potentially infinite} and will thus instruct their bots to limit their crawl depth. Reason being, Google doesn't know how many param/values there are to index and doesn't want to waste time crawling a seemingly endless supply of content. Google might say to itself... crawl the first 400 links, then just come back later and get maybe 10 or so more every X number of weeks.
Now with rewritten URLs... Google doesn't know the data is never-ending. I'm sure there is a way to tell it is dynamically generated by referencing some header info, but the bottom line is Google does not make the assumption that there "may" be a never-ending linear path of indexing. Thus... you get a much deeper crawl and the bots return at much higher frequencies.
And that is why rewritten URLs are superior to dynamically generated (parameter) URLs. NOT because they are ranked better in the SERPs... but rather because they are crawled deeper and in a more frequent manner.
Not to mention URL cleanliness is always a plus.
- RobertGonzalez
- Site Administrator
- Posts: 14293
- Joined: Tue Sep 09, 2003 6:04 pm
- Location: Fremont, CA, USA
http://www.site.com/page.php?page=pag ... eninjagoat
or
http://www.site.com/page/some-title-to- ... ninjagoat/
Which one would you rather see, let alone try to remember?
Indeed. And I don't know if I'm the only one, but when I come to a site with /articles/coolsubject/some-article I often check out the URL /articles/coolsubject/, just to see what else is on the site. Doesn't always work, but sometimes does.
And what about no URLs? I've seen and worked on a couple of sites where they removed the URLs completely by inserting a frame! Not because of the functionality of the frame, but just to remove all URLs... they thought it looked better.
Luckily I could convince them to remove the frame.
I am not looking for reasons not to use clean URLs, I'm just trying to find out the facts because I'm tired of reading 100 different "facts" about the issue that all seem to contradict each other. I agree that they are 100% more user-friendly, and that is why I use them, but I want HARD FACTS about short URLs and how and why they do or do not help in search engine optimization.
Here's a class I wrote (and use) to produce keyword rich, human readable urls from strings users submit.
E.g.: /blog/182/all-about-my-day-today
viewtopic.php?t=56724
I use it a lot. I also use statcounter.com which shows me which search engine and what queries brought users to my pages from search engines. Anytime I check out the query, the link listed that is mine has the keywords that were searched for in bold in the URL on google.
Whether this is just for appearance, or factored in the ranking, I do not know. However, I have a strong inclination to think it does factor in.
[edit] Sorry, no hard facts in my post either.
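For readers who can't follow the link, here is a minimal Python sketch of the general technique of turning a user-submitted title into a keyword URL. It is not the class linked above; `slugify` and `blog_url` are hypothetical names, and the rules (ASCII only, lowercase, hyphens between words) are assumptions about what a reasonable slug looks like.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    """Lowercase the text, drop accents, and join the words with hyphens."""
    # Split accented characters into base letter + mark, then drop non-ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def blog_url(entry_id: int, title: str) -> str:
    """Build a /blog/<id>/<slug> path; the ID still does the actual lookup."""
    return f"/blog/{entry_id}/{slugify(title)}"


print(blog_url(182, "All About My Day Today!"))  # prints /blog/182/all-about-my-day-today
```

Keeping the numeric ID in the path means the slug is only decoration for people and search engines, so a title can change without breaking old links.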
- seodevhead
- Forum Regular
- Posts: 705
- Joined: Sat Oct 08, 2005 8:18 pm
- Location: Windermere, FL
scottayy brings up another VERY important point regarding clean URLs that contain keywords. Google bolds the keywords in the URLs that match your query... and this has been proven to increase click rates as it denotes to the searcher an added mark of relevancy.
Ninja, unfortunately I rarely ever remember where I first learned about various SEO theories, strategies or techniques. I spend so much time reading about SEO/SEM that it could have come from any of a thousand places. But rest assured that I am not easy to persuade when it comes to SEO. Just remember, what may be fact one day may be fiction tomorrow!