Posted: Fri Mar 23, 2007 1:12 pm
by jabbaonthedais
Kieran Huggins wrote:PCRE is faster than the ereg functions, and you can be a whole lot safer!
Code: Select all
$domain = preg_replace('#^(?:https?://)?(?:www\.)?(.*?)(?:/.*)?$','$1',$referer);
I couldn't get it to work, maybe because I'm not very knowledgeable about regex stuff. I tried breaking it down but couldn't figure it out. This works though... see any problems?
Code: Select all
preg_replace("/^(https?:\/\/www.)?([^\/\?]+)([\/|\?])?(.+)?$/", "\$2\n", $referer);
Posted: Fri Mar 23, 2007 1:24 pm
by feyd
Code: Select all
$p = parse_url($url);
$host = (array_key_exists('host', $p) ? $p['host'] : '');
preg_match('#^(?:www\.)(.*?)$#i', $host, $match);
return $match[1];
potentially, but untested.
Posted: Fri Mar 23, 2007 1:51 pm
by jabbaonthedais
feyd wrote:Code: Select all
$p = parse_url($url);
$host = (array_key_exists('host', $p) ? $p['host'] : '');
preg_match('#^(?:www\.)(.*?)$#i', $host, $match);
return $match[1];
potentially, but untested.
That works great! I can't foresee any problems with it either, because it's just matching the www.
Thanks everyone!

Posted: Fri Mar 23, 2007 2:34 pm
by Kieran Huggins
oops - I left out the trailing #...
Code: Select all
// returns 'domain.com'
$domain = preg_replace('#^(https?://)?(?:www\.)?(.*?)(/.*)?$#','$2',$referer);
// returns http://domain.com/whatever/people.php?bob=cool
$newURL = preg_replace('#^(https?://)?(?:www\.)?(.*?)(/.*)?$#','$1$2$3',$referer);
Incidentally, you don't need to run mine through parse_url() first. In fact, I've modified it so that $1$2$3 will reconstruct the complete URL without the www., and just $2 will give you the domain portion.
I can't say enough good things about http://www.cuneytyilmaz.com/prog/jrx/ - a real-time tester, AND there's a handy mini-ref on the side. It's primarily for JavaScript, but it'll save you lots of old-school debugging.
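To make the $2 versus $1$2$3 behaviour concrete, here is a minimal sketch that runs both replacements on one input. The sample URL is only an illustration, built from the comment in the post above.

```php
<?php
// Illustrative input; the thread only shows the expected results.
$referer = 'http://www.domain.com/whatever/people.php?bob=cool';

// Group 1: optional scheme, group 2: host without "www.", group 3: optional path.
$pattern = '#^(https?://)?(?:www\.)?(.*?)(/.*)?$#';

$domain = preg_replace($pattern, '$2', $referer);     // host only
$newURL = preg_replace($pattern, '$1$2$3', $referer); // full URL minus "www."

echo $domain, "\n"; // domain.com
echo $newURL, "\n"; // http://domain.com/whatever/people.php?bob=cool
```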

Posted: Fri Mar 23, 2007 4:13 pm
by jabbaonthedais
Kieran, I put an "i" at the end to make it case insensitive:
Code: Select all
$domain = preg_replace('#^(https?://)?(?:www\.)?(.*?)(/.*)?$#i','$2',$url);
How would I add a question mark to the end of the pattern? For URLs like:
http://www.whatever.com?something=123
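One possible way to handle that case, offered as an untested-in-the-thread sketch: let the trailing group start with "/", "?" or "#" instead of only "/". The helper name domain_from_url() is hypothetical, and the "~" delimiter is a choice made here so that "#" does not need escaping.

```php
<?php
// domain_from_url() is a hypothetical helper name, not from the thread.
function domain_from_url($url) {
    // "~" is the delimiter so "#" can sit unescaped inside the character class.
    // The last group may now begin with "/", "?" or "#".
    $pattern = '~^(https?://)?(?:www\.)?(.*?)([/?#].*)?$~i';
    return preg_replace($pattern, '$2', $url);
}

echo domain_from_url('http://www.whatever.com?something=123'), "\n"; // whatever.com
echo domain_from_url('http://www.whatever.com/page.php?a=1'), "\n";  // whatever.com
```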
Posted: Sat Mar 24, 2007 5:09 am
by stereofrog
jabbaonthedais, don’t let anyone confuse you. Your initial approach was 100% correct. To parse a URL, you should use parse_url() and simple string functions. You do NOT need any regular expressions here.
Posted: Sat Mar 24, 2007 2:53 pm
by jabbaonthedais
stereofrog wrote:jabbaonthedais, don’t let anyone confuse you. Your initial approach was 100% correct. To parse a URL, you should use parse_url() and simple string functions. You do NOT need any regular expressions here.
But I still have to take out the "www." (if it exists) from the host section. I don't think trim() is meant to be used for this. I don't know how to "take it out" other than using something like preg_replace().
Posted: Sat Mar 24, 2007 3:51 pm
by nickvd
Wouldn't a simple str_replace('www.','',$host); do the trick?
(I didn't read the whole thread)
Posted: Sat Mar 24, 2007 4:09 pm
by RobertGonzalez
No. I had suggested that too. But someone brought up the point about what if the domain is somedomainwww.com? That would actually make it totally different.
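A small sketch of that pitfall, using the host from the post above. The anchored preg_replace() call is just one assumed alternative for comparison, not something proposed at this point in the thread.

```php
<?php
$host = 'somedomainwww.com'; // the example host mentioned above

// str_replace() removes every occurrence, wherever it appears in the string.
$naive = str_replace('www.', '', $host);

// Anchoring with ^ only strips a "www." that starts the host.
$anchored = preg_replace('#^www\.#i', '', $host);

echo $naive, "\n";    // somedomaincom
echo $anchored, "\n"; // somedomainwww.com
```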
Posted: Sat Mar 24, 2007 4:14 pm
by nickvd
Everah wrote:No. I had suggested that too. But someone brought up the point about what if the domain is somedomainwww.com? That would actually make it totally different.
Ick... didn't think of that... Please continue to ignore me!

Posted: Sat Mar 24, 2007 4:18 pm
by John Cartwright
nickvd wrote:Everah wrote:No. I had suggested that too. But someone brought up the point about what if the domain is somedomainwww.com? That would actually make it totally different.
Ick... didn't think of that... Please continue to ignore me!

Which is why you should read the whole thread

Posted: Sun Mar 25, 2007 1:17 am
by jabbaonthedais
Well, parsing the URL takes care of the "www.whatever.com?stuff" problem. It may not be fast, but this is what's working right now:
Code: Select all
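$parsed = parse_url($referer);
$host = (array_key_exists('host', $parsed) ? $parsed['host'] : '');
// Capture everything after a leading "www."; no match means there was no "www."
preg_match('#^(?:www\.)(.*?)$#i', $host, $match);
if (!isset($match[1])) {
    // Fall back to the full host ($host is '' when the URL had no host part)
    $match[1] = $host;
}
echo $match[1];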
Posted: Sun Mar 25, 2007 4:52 am
by stereofrog
jabbaonthedais wrote:
But I still have to take out the "www." (if it exists) from the host section. I don't think trim() is meant to be used for this. I don't know how to "take it out" other than using something like preg_replace().
I thought I already answered that above. Here's the whole function yet again
Code: Select all
function get_host_name_without_www($url) {
    $p = parse_url($url);
    if(!isset($p['host']))
        return '';
    if(stripos($p['host'], "www.") === 0)
        return substr($p['host'], 4);
    return $p['host'];
}
echo get_host_name_without_www("http://www.blah.com/xyz");
Hope this helps.
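For anyone who wants to try the function on the URLs raised earlier in the thread, here is a rough check. The function is copied so the snippet runs on its own, and the scheme-less input is an added assumption to show that parse_url() reports no host in that case.

```php
<?php
// Copy of the function posted above, so this snippet is self-contained.
function get_host_name_without_www($url) {
    $p = parse_url($url);
    if (!isset($p['host']))
        return '';
    if (stripos($p['host'], "www.") === 0)
        return substr($p['host'], 4);
    return $p['host'];
}

// Sample inputs; the last one (no scheme) is an assumption added here.
$samples = array(
    "http://www.blah.com/xyz",
    "http://somedomainwww.com/",
    "http://www.whatever.com?something=123",
    "www.blah.com/xyz", // parsed as a path only, so the result is ''
);
foreach ($samples as $url) {
    echo $url, ' => "', get_host_name_without_www($url), "\"\n";
}
```

Since a referer normally includes the scheme, the empty result for the last input is usually not a concern, but it is worth knowing about when the URL comes from user input.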
Posted: Sun Mar 25, 2007 11:37 am
by jabbaonthedais
stereofrog wrote:I thought I already answered that above. Here's the whole function yet again
Sorry! Yes, that function works great. Tried it out on different URLs with no problems. Thanks!
And thanks everyone!