Get domain without www
Posted: Thu Mar 22, 2007 4:13 pm
by jabbaonthedais
I'm trying to extract the domain from a long URL, not including the www.
So for http://www.whatever.com it would result in "whatever.com".
But it also needs to work for other domain extensions, such as .co.uk, etc.
So http://whatever.co.uk would result in "whatever.co.uk".
I came up with this so far:
Code: Select all
$domain = parse_url($referer);
// take out the www dot
$trimmed = trim($domain[host], "www.");
echo $trimmed;
But if the first letter of the domain is a w, it erases that too. Any ideas?
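A short sketch of why this happens: trim() treats its second argument as a set of individual characters to strip from both ends, not as a literal prefix. The host below is only an example value.

```php
<?php
// trim() strips any leading/trailing character found in the mask "www.",
// i.e. every 'w' and every '.', until it hits a character not in the mask.
$host = 'www.whatever.com';
$trimmed = trim($host, 'www.');
echo $trimmed, "\n"; // hatever.com
```

So the leading "w" of "whatever" is removed along with the "www." prefix, because it is also in the character mask.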
Posted: Thu Mar 22, 2007 4:23 pm
by feyd
A regular expression or strpos() could be of use.
Posted: Thu Mar 22, 2007 5:15 pm
by Kieran Huggins
Posted: Thu Mar 22, 2007 9:43 pm
by jabbaonthedais
edit: Ok, this seems to be working fine:
Code: Select all
$domain = parse_url($referer);
// take out the www dot
$string = ereg_replace('www.', '', $domain[host]);
echo $string;
Do you see any problems down the road with that? I put in quite a few URLs and they all seem to work.
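Two things could bite later, sketched below under some assumptions. In a regular expression an unescaped "." matches any character, and without a "^" anchor the pattern can match in the middle of the host. Since ereg_replace() was removed in PHP 7, preg_replace() stands in for it here; the host names are made-up examples.

```php
<?php
// Unanchored pattern with an unescaped dot, as in the ereg_replace() call above.
$loose = preg_replace('/www./', '', 'mywww.com');

// Anchored at the start, with a literal dot and case-insensitive matching.
$strict = preg_replace('/^www\./i', '', 'mywww.com');
$plain  = preg_replace('/^www\./i', '', 'www.whatever.com');

echo $loose, "\n";  // mycom
echo $strict, "\n"; // mywww.com
echo $plain, "\n";  // whatever.com
```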
Posted: Thu Mar 22, 2007 10:19 pm
by Kieran Huggins
PCRE is faster than the ereg functions, and you can be a whole lot safer!
Code: Select all
$domain = preg_replace('#^(?:https?://)?(?:www\.)?(.*?)(?:/.*)?$','$1',$referer);
Posted: Thu Mar 22, 2007 10:39 pm
by feyd
What's wrong with strpos()? It's even faster still.
Posted: Fri Mar 23, 2007 7:39 am
by aaronhall
Code: Select all
if(stripos($domain['host'], 'www.')) {
$referer = str_ireplace('www.', '', $referer, 1);
}
Posted: Fri Mar 23, 2007 10:30 am
by jabbaonthedais
Ok, this is what I've got now:
Code: Select all
$domain = parse_url($referer);
$newurl = $domain['host'];
// take out the www dot
$found = stripos($newurl, 'www.');
if ($found !== false) {
$newurl = str_ireplace('www.', '', $newurl);
}
aaronhall, I couldn't ever get that if statement to go off. No clue why. I took the ", 1" out of the end, made the host variable a real string, and still no luck.
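A likely explanation, shown as a small sketch: when "www." sits at the very start of the host, stripos() returns the integer 0, and 0 is treated as false in a plain if (...) test. The host value is only an example.

```php
<?php
$host = 'www.whatever.com';
$pos = stripos($host, 'www.');

var_dump($pos);            // int(0): found at offset 0
var_dump((bool) $pos);     // bool(false): so "if ($pos)" is skipped
var_dump($pos !== false);  // bool(true): found somewhere in the string
var_dump($pos === 0);      // bool(true): found at the very start
```

Separately, the fourth parameter of str_ireplace() is a by-reference variable that receives the number of replacements, not a limit, so passing the literal 1 there is rejected by PHP.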
Posted: Fri Mar 23, 2007 10:43 am
by Kieran Huggins
Does my code not work? It might be marginally slower, but it's safer...
Posted: Fri Mar 23, 2007 11:20 am
by stereofrog
jabbaonthedais wrote:Ok, this is what I've got now:
Code: Select all
$domain = parse_url($referer);
$newurl = $domain['host'];
// take out the www dot
$found = stripos($newurl, 'www.');
if ($found !== false) {
$newurl = str_ireplace('www.', '', $newurl);
}
There's no reason to use str_replace when the position of the substring is known exactly. Just strip the first 4 characters off, that's all:
Code: Select all
$host = "www.xyz.com";
if(stripos($host, "www.") === 0) // note three =
$host = substr($host, 4);
echo $host;
Posted: Fri Mar 23, 2007 11:27 am
by RobertGonzalez
That doesn't account for http://www.something.com. Why can't you just do a straight str_replace on 'www.'? If it is there, it is removed. If it is not there, it is not removed because it can't be.
Posted: Fri Mar 23, 2007 11:33 am
by stereofrog
My code is for hostnames only. For parsing full urls, parse_url() should be used first, as OP showed.
RobertGonzalez wrote:Why can't you just do a straight str_replace on 'www.'? If it is there, it is removed. If it is not there, it is not removed because it can't be.
This won't work for, e.g., "mywww.com".
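Putting the two steps together, a minimal sketch might look like the following. The function name host_without_www() is hypothetical (it is not from this thread), and the type declarations assume PHP 7.1 or later.

```php
<?php
// host_without_www() is a hypothetical helper name used for illustration.
function host_without_www(string $url): ?string
{
    // parse_url() gives null when there is no host and false on a malformed URL.
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    // Only strip "www." when it is the first four characters (case-insensitive).
    if (strncasecmp($host, 'www.', 4) === 0) {
        $host = substr($host, 4);
    }
    return $host;
}

echo host_without_www('http://www.whatever.com/page'), "\n"; // whatever.com
echo host_without_www('http://whatever.co.uk'), "\n";        // whatever.co.uk
echo host_without_www('http://mywww.com'), "\n";             // mywww.com
```

Note that parse_url() only reports a host when the URL has a scheme (or starts with "//"), so a bare "www.whatever.com/page" would return null here.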
Posted: Fri Mar 23, 2007 11:36 am
by RobertGonzalez
I think I like Kieran's the best.
Posted: Fri Mar 23, 2007 11:39 am
by jabbaonthedais
Kieran Huggins wrote:does my code not work? It might be marginally slower, but it's safer...
It's not working for me when I use it exactly as you put it... Where can I find a reference on those characters you use in searching? Like below:
Code: Select all
#^(?:https?:)?(?:www\.)?(.*?)(?:/.*)?$'
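One possible reason it fails as posted: PHP's preg functions need the pattern to end with the same delimiter it starts with, and the pattern in the earlier post opens with "#" but never closes it. A sketch with only the closing "#" added, using made-up URLs:

```php
<?php
// Same pattern as the earlier post, with the closing '#' delimiter added.
$pattern = '#^(?:https?://)?(?:www\.)?(.*?)(?:/.*)?$#';

$a = preg_replace($pattern, '$1', 'http://www.whatever.com/some/page.html');
$b = preg_replace($pattern, '$1', 'http://whatever.co.uk');
$c = preg_replace($pattern, '$1', 'https://mywww.com/');

echo $a, "\n"; // whatever.com
echo $b, "\n"; // whatever.co.uk
echo $c, "\n"; // mywww.com
```

The PCRE pattern syntax section of the PHP manual covers the individual characters: "(?: )" is a non-capturing group, "?" makes the preceding item optional, and ".*?" is a lazy match.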
Posted: Fri Mar 23, 2007 11:42 am
by RobertGonzalez
http://www.devguru.com/Technologies/ecm ... cters.html
It is primarily for JavaScript, but actually works pretty well as an explanation of regular expressions. I think d11wtq also wrote a tutorial on regex in the regular expressions forum.