Get domain without www

jabbaonthedais
Forum Contributor
Posts: 127
Joined: Wed Aug 18, 2004 12:08 pm

Post by jabbaonthedais »

Kieran Huggins wrote:PCRE is faster than the ereg functions, and you can be a whole lot safer!

Code: Select all

$domain = preg_replace('#^(?:https?://)?(?:www\.)?(.*?)(?:/.*)?$','$1',$referer);
I couldn't get it to work, maybe because I'm not very knowledgeable about regex stuff. I tried breaking it down but couldn't figure it out. This works though... see any problems?

Code: Select all

preg_replace("/^(https?:\/\/www.)?([^\/\?]+)([\/|\?])?(.+)?$/", "\$2\n", $referer);
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Code: Select all

$p = parse_url($url);
$host = (array_key_exists('host', $p) ? $p['host'] : '');
preg_match('#^(?:www\.)(.*?)$#i', $host, $match);
return $match[1];
potentially, but untested.
jabbaonthedais
Forum Contributor
Posts: 127
Joined: Wed Aug 18, 2004 12:08 pm

Post by jabbaonthedais »

feyd wrote:

Code: Select all

$p = parse_url($url);
$host = (array_key_exists('host', $p) ? $p['host'] : '');
preg_match('#^(?:www\.)(.*?)$#i', $host, $match);
return $match[1];
potentially, but untested.
That works great! I can't foresee any problems with it either, because it's just matching the www.

Thanks everyone! :)
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

oops - I left out the trailing #...

Code: Select all

// returns 'domain.com'
$domain = preg_replace('#^(https?://)?(?:www\.)?(.*?)(/.*)?$#','$2',$referer);

// returns http://domain.com/whatever/people.php?bob=cool
$newURL = preg_replace('#^(https?://)?(?:www\.)?(.*?)(/.*)?$#','$1$2$3',$referer);
Incidentally, you don't need to run mine through parse_url() first. In fact, I've modified it so that $1$2$3 reconstructs the complete URL without the www., and $2 alone gives you just the domain portion.

I can't say enough good things about http://www.cuneytyilmaz.com/prog/jrx/ - a real time tester AND there's a handy mini-ref on the side.
It's primarily for JavaScript, but it'll save you lots of old-school debugging ;-)
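
For anyone following along, a quick check of what the two replacements above produce might look like this. It is only a sketch: the sample referer string is made up for illustration, and the pattern is the one from the post.

```php
<?php
// Hypothetical referer value, used only for illustration.
$referer = 'http://www.domain.com/whatever/people.php?bob=cool';

// Pattern from the post: $1 = scheme, $2 = host without "www.", $3 = path and query.
$pattern = '#^(https?://)?(?:www\.)?(.*?)(/.*)?$#';

$domain = preg_replace($pattern, '$2', $referer);
$newURL = preg_replace($pattern, '$1$2$3', $referer);

echo $domain, "\n"; // domain.com
echo $newURL, "\n"; // http://domain.com/whatever/people.php?bob=cool
```

The `www.` group is non-capturing, so it never appears in any of the numbered references.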
jabbaonthedais
Forum Contributor
Posts: 127
Joined: Wed Aug 18, 2004 12:08 pm

Post by jabbaonthedais »

Kieran, I put an "i" at the end to make it case insensitive:

Code: Select all

$domain = preg_replace('#^(https?://)?(?:www\.)?(.*?)(/.*)?$#i','$2',$url);
How would I add a question mark to the end of the pattern? For URLs like:
http://www.whatever.com?something=123
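
One possible tweak, sketched here and not taken from anyone's post in the thread: let the trailing group start with either a slash or a question mark by using the character class `[/?]`. The pattern below is an assumption built on the one above.

```php
<?php
// Sample URL from the question above.
$url = 'http://www.whatever.com?something=123';

// Assumption: the trailing group may begin with "/" or "?" ([/?] is a character class).
$pattern = '#^(https?://)?(?:www\.)?(.*?)([/?].*)?$#i';

$domain = preg_replace($pattern, '$2', $url);
echo $domain, "\n"; // whatever.com

// URLs with a normal path should still work the same way.
$withPath = preg_replace($pattern, '$2', 'http://www.whatever.com/page?something=123');
echo $withPath, "\n"; // whatever.com
```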
stereofrog
Forum Contributor
Posts: 386
Joined: Mon Dec 04, 2006 6:10 am

Post by stereofrog »

jabbaonthedais, don’t let anyone confuse you. Your initial approach was 100% correct. To parse a URL, you should use parse_url() and simple string functions. You do NOT need any regular expressions here.
jabbaonthedais
Forum Contributor
Posts: 127
Joined: Wed Aug 18, 2004 12:08 pm

Post by jabbaonthedais »

stereofrog wrote:jabbaonthedais, don’t let anyone confuse you. Your initial approach was 100% correct. To parse a URL, you should use parse_url() and simple string functions. You do NOT need any regular expressions here.
But I still have to take out the "www." (if it exists) from the host section. I don't think trim() is meant to be used for this. I don't know how to "take it out" other than using something like preg_replace().
nickvd
DevNet Resident
Posts: 1027
Joined: Thu Mar 10, 2005 5:27 pm
Location: Southern Ontario

Post by nickvd »

Wouldn't a simple str_replace('www.','',$host); do the trick?


(I didn't read the whole thread)
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

No. I had suggested that too. But someone brought up the point about what if the domain is somedomainwww.com? That would actually make it totally different.
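
A small illustration of that problem, using made-up host names: str_replace() removes every occurrence of the search string, not only one at the start.

```php
<?php
// Works as hoped when "www." only appears at the front.
$ok = str_replace('www.', '', 'www.example.com');
echo $ok, "\n"; // example.com

// But it also strips "www." from the middle of a host name.
$broken = str_replace('www.', '', 'somedomainwww.com');
echo $broken, "\n"; // somedomaincom
```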
nickvd
DevNet Resident
Posts: 1027
Joined: Thu Mar 10, 2005 5:27 pm
Location: Southern Ontario

Post by nickvd »

Everah wrote:No. I had suggested that too. But someone brought up the point about what if the domain is somedomainwww.com? That would actually make it totally different.
Ick... didn't think of that... Please continue to ignore me! :D
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Post by John Cartwright »

nickvd wrote:
Everah wrote:No. I had suggested that too. But someone brought up the point about what if the domain is somedomainwww.com? That would actually make it totally different.
Ick... didn't think of that... Please continue to ignore me! :D
Which is why you should read the whole thread :x
jabbaonthedais
Forum Contributor
Posts: 127
Joined: Wed Aug 18, 2004 12:08 pm

Post by jabbaonthedais »

Well, parsing the URL takes care of the "www.whatever.com?stuff" problem. It may not be fast, but this is what's working right now:

Code: Select all

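$parsed = parse_url($referer);
// Fall back to an empty string when the URL has no host component.
$host = (array_key_exists('host', $parsed) ? $parsed['host'] : '');
// Capture everything after a leading "www." (case-insensitive).
preg_match('#^(?:www\.)(.*?)$#i', $host, $match);
if (!isset($match[1])) {
	// No leading "www.": use the host as-is.
	$match[1] = $host;
}
echo $match[1];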
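
As a rough illustration of why parsing first helps with the question-mark case, this is what parse_url() returns for the example URL from earlier in the thread. Note the URL includes the scheme here; that is an assumption about what the referer looks like.

```php
<?php
// parse_url() splits the host from the query even when there is no "/" between them.
$parsed = parse_url('http://www.whatever.com?something=123');

echo $parsed['host'], "\n";  // www.whatever.com
echo $parsed['query'], "\n"; // something=123
```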
stereofrog
Forum Contributor
Posts: 386
Joined: Mon Dec 04, 2006 6:10 am

Post by stereofrog »

jabbaonthedais wrote: But I still have to take out the "www." (if it exists) from the host section. I don't think trim() is meant to be used for this. I don't know how to "take it out" other than using something like preg_replace()
I thought I already answered that above. Here's the whole function yet again

Code: Select all

function get_host_name_without_www($url) {
	$p = parse_url($url);
	if(!isset($p['host']))
		return '';
	if(stripos($p['host'], "www.") === 0)
		return substr($p['host'], 4);
	return $p['host'];
}

echo get_host_name_without_www("http://www.blah.com/xyz");
Hope this helps.
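
A few more inputs, as a quick sanity check of the function above. The function is repeated so the snippet runs on its own, and the sample URLs are made up for illustration.

```php
<?php
// Copied from the post above so this snippet is self-contained.
function get_host_name_without_www($url) {
	$p = parse_url($url);
	if (!isset($p['host']))
		return '';
	// Only strip "www." when it is at the very start of the host (case-insensitive).
	if (stripos($p['host'], "www.") === 0)
		return substr($p['host'], 4);
	return $p['host'];
}

echo get_host_name_without_www("http://www.blah.com/xyz"), "\n";    // blah.com
echo get_host_name_without_www("http://somedomainwww.com/"), "\n";  // somedomainwww.com
echo get_host_name_without_www("http://WWW.Example.com?x=1"), "\n"; // Example.com
echo get_host_name_without_www("/just/a/path"), "\n";               // (empty string, no host)
```

Because only the first four characters are ever removed, a "www." in the middle of the host is left alone.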
jabbaonthedais
Forum Contributor
Posts: 127
Joined: Wed Aug 18, 2004 12:08 pm

Post by jabbaonthedais »

stereofrog wrote:I thought I already answered that above. Here's the whole function yet again
Sorry! Yes, that function works great. Tried it out on different urls with no problems. Thanks!

And thanks everyone!