jabbaonthedais
Forum Contributor
Posts: 127 Joined: Wed Aug 18, 2004 12:08 pm
by jabbaonthedais » Thu Mar 22, 2007 4:13 pm
I'm trying to extract the domain from a long URL, not including the www.
So for http://www.whatever.com the result would be "whatever.com".
But it also needs to work for other domain extensions, such as .co.uk.
So http://whatever.co.uk would result in "whatever.co.uk".
I came up with this so far:
Code: Select all
$domain = parse_url($referer);
// take out the www dot
$trimmed = trim($domain['host'], "www.");
echo $trimmed;
But, if my first letter in the domain is a W, it erases it also. Any ideas?
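For what it's worth, the behavior described above follows from how trim() reads its second argument: as a set of characters to strip from both ends, not as a prefix. A minimal sketch, with hostnames made up for illustration:

```php
<?php
// trim()'s second argument is a character list, so "www." means
// "strip any leading or trailing 'w' or '.' characters".
$hosts = ['www.example.com', 'www.whatever.com', 'web.example.com'];

$results = [];
foreach ($hosts as $host) {
    $results[$host] = trim($host, 'www.');
    echo $host, ' => ', $results[$host], "\n";
}
```

The second and third hostnames lose their own leading "w" because that letter is in the character list too.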
feyd
Neighborhood Spidermoddy
Posts: 31559 Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA
by feyd » Thu Mar 22, 2007 4:23 pm
A regular expression or strpos() could be of use.
by jabbaonthedais » Thu Mar 22, 2007 9:43 pm
edit: Ok, this seems to be working fine:
Code: Select all
$domain = parse_url($referer);
// take out the www dot
$string = ereg_replace('www.', '', $domain['host']);
echo $string;
Do you see any negative results down the road with that? I put in quite a few urls and all seem to work.
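One possible answer to the question above, sketched with preg_replace() because the ereg functions were removed in PHP 7: the pattern 'www.' is not anchored and its dot matches any character, so it can also remove text from the middle of a hostname. The hostnames and the stricter pattern are assumptions for illustration, not something from the original post.

```php
<?php
// Same pattern as above, written for PCRE: unanchored, unescaped dot.
$loose = '/www./';
// Anchored to the start of the string, with a literal dot.
$strict = '/^www\./i';

$hosts = ['www.example.com', 'awwwe.com', 'mywww.com'];
$out = [];
foreach ($hosts as $host) {
    $out[$host] = [
        'loose'  => preg_replace($loose, '', $host),
        'strict' => preg_replace($strict, '', $host),
    ];
}
print_r($out);
```

With the loose pattern, 'awwwe.com' and 'mywww.com' are both damaged, while the anchored pattern leaves them alone.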
Kieran Huggins
DevNet Master
Posts: 3635 Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada
by Kieran Huggins » Thu Mar 22, 2007 10:19 pm
PCRE is faster than the ereg functions, and you can be a whole lot safer!
Code: Select all
$domain = preg_replace('#^(?:https?://)?(?:www\.)?(.*?)(?:/.*)?$#', '$1', $referer);
by feyd » Thu Mar 22, 2007 10:39 pm
What's wrong with strpos()? It's even faster still.
aaronhall
DevNet Resident
Posts: 1040 Joined: Tue Aug 13, 2002 5:10 pm
Location: Back in Phoenix, missing the microbrews
by aaronhall » Fri Mar 23, 2007 7:39 am
Code: Select all
if (stripos($domain['host'], 'www.')) {
    $referer = str_ireplace('www.', '', $referer, 1);
}
by jabbaonthedais » Fri Mar 23, 2007 10:30 am
Ok, this is what I've got now:
Code: Select all
$domain = parse_url($referer);
$newurl = $domain['host'];
// take out the www dot
$found = stripos($newurl, 'www.');
if ($found !== false) {
    $newurl = str_ireplace('www.', '', $newurl);
}
aaronhall, I couldn't ever get that if statement to fire. No clue why. I took the ", 1" off the end and made the host variable a plain string, and still no luck.
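A likely explanation for the if statement never firing, sketched under the assumption that the host starts with "www.": stripos() returns the offset of the match, and an offset of 0 counts as false in a plain if. Separately, the fourth parameter of str_ireplace() is a by-reference variable that receives the number of replacements, not a limit, so a literal 1 cannot be passed there. The example hostname is made up.

```php
<?php
$host = 'www.example.com';

// stripos() returns the match offset, and an offset of 0 is falsy.
$pos = stripos($host, 'www.');
$looseCheck  = (bool) $pos;   // false, although the prefix is there
$strictCheck = ($pos === 0);  // true only for a match at the very start

// The 4th parameter of str_ireplace() is an output variable that
// receives the number of replacements made.
$stripped = str_ireplace('www.', '', $host, $count);

var_dump($pos, $looseCheck, $strictCheck, $stripped, $count);
```

Comparing with `!== false` (any position) or `=== 0` (start only) avoids the falsy-zero trap.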
by Kieran Huggins » Fri Mar 23, 2007 10:43 am
Does my code not work? It might be marginally slower, but it's safer...
stereofrog
Forum Contributor
Posts: 386 Joined: Mon Dec 04, 2006 6:10 am
by stereofrog » Fri Mar 23, 2007 11:20 am
jabbaonthedais wrote: Ok, this is what I've got now:
Code: Select all
$domain = parse_url($referer);
$newurl = $domain['host'];
// take out the www dot
$found = stripos($newurl, 'www.');
if ($found !== false) {
    $newurl = str_ireplace('www.', '', $newurl);
}
There's no reason to use str_replace when the position of the match is known exactly. Just strip the first 4 characters off, that's all:
Code: Select all
$host = "www.xyz.com";
if (stripos($host, "www.") === 0) { // note three =
    $host = substr($host, 4);
}
echo $host;
RobertGonzalez
Site Administrator
Posts: 14293 Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA
by RobertGonzalez » Fri Mar 23, 2007 11:27 am
That doesn't account for http://www.something.com. Why can't you just do a straight str_replace on 'www.'? If it is there, it is removed. If it is not there, it is not removed because it can't be.
by stereofrog » Fri Mar 23, 2007 11:33 am
My code is for hostnames only. For parsing full URLs, parse_url() should be used first, as the OP showed.
RobertGonzalez wrote: Why can't you just do a straight str_replace on 'www.'? If it is there, it is removed. If it is not there, it is not removed because it can't be.
This won't work for, e.g., "mywww.com".
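To make the "mywww.com" point concrete, here is a rough sketch that combines the two ideas from this thread: parse_url() to get the host, then a prefix check before stripping. The helper name strip_www_from_url and the test URLs are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper name; not from the thread.
function strip_www_from_url(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null; // no host component, or an unparseable URL
    }
    // Remove "www." only when it is at the very start of the host.
    if (stripos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }
    return $host;
}

$plain = str_replace('www.', '', 'mywww.com'); // removes text mid-host
$a = strip_www_from_url('http://mywww.com/page');
$b = strip_www_from_url('http://www.whatever.co.uk/a/b?c=d');
$c = strip_www_from_url('http://WWW.Example.com');
echo $plain, "\n", $a, "\n", $b, "\n", $c, "\n";
```

Note that parse_url() only reports a host when the input has a scheme (or starts with "//"), so a bare "whatever.com" would come back as null in this sketch.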
by RobertGonzalez » Fri Mar 23, 2007 11:36 am
I think I like Kieran's the best.
by jabbaonthedais » Fri Mar 23, 2007 11:39 am
Kieran Huggins wrote: does my code not work? It might be marginally slower, but it's safer...
It's not working for me when I use it exactly as you put it... Where can I find a reference on those characters you use in searching? Like below:
Code: Select all
'#^(?:https?://)?(?:www\.)?(.*?)(?:/.*)?$'
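Those characters are PCRE (Perl-compatible regular expression) syntax, which is covered in the PCRE chapter of the PHP manual. One thing that may be worth checking: the preg functions expect the pattern to be wrapped in matching delimiters, and the pattern as quoted above opens with "#" but has no closing "#". Below is an annotated sketch with the closing delimiter added, which is an assumption about what was intended. The sample URLs are made up.

```php
<?php
// Same pattern as quoted above, plus a closing "#" delimiter
// (an assumption about what was intended).
$pattern = '#^(?:https?://)?(?:www\.)?(.*?)(?:/.*)?$#';
//  #...#        pattern delimiters
//  ^ and $      start and end of the string
//  (?:...)      group without capturing
//  https?://    "http://" or "https://" (the "?" makes the "s" optional)
//  www\.        literal "www." ("\." is a real dot, not "any character")
//  (.*?)        capture group 1: as few characters as possible
//  (?:/.*)?     optionally, a "/" and everything after it

$urls = [
    'http://www.whatever.com/some/page.php?id=1',
    'https://whatever.co.uk',
    'www.whatever.com',
];
$domains = [];
foreach ($urls as $url) {
    // '$1' substitutes the text captured by group 1 for the whole match.
    $domains[] = preg_replace($pattern, '$1', $url);
}
print_r($domains);
```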