extract domains in string

yacahuma
Forum Regular
Posts: 870
Joined: Sun Jul 01, 2007 7:11 am

extract domains in string

Post by yacahuma »

I have a string that looks like this:
domain1.netdomain2.orgdomain3.infodomain4.com

I want to get a nice array like this:
domain1.net
domain2.org
domain3.info
domain4.com

So far this is what I've got:

 $domains = preg_split("/\.org|\.com|\.net|\.info/", strtolower($string));
The problem is that I just get the domains with no extension:
domain1
domain2
domain3
domain4


My regex powers are pathetic.
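
A note on why the extensions disappear: preg_split() discards whatever text the pattern matches, and here the pattern matches the extensions themselves. One possible workaround, sketched under the assumption that only these four extensions occur, is to split on a zero-width lookbehind so that nothing is consumed. The sample string is the one from the post; the variable names are only illustrative.

```php
<?php
// Sample input copied from the post above.
$string = 'domain1.netdomain2.orgdomain3.infodomain4.com';

// The lookbehind (?<=...) matches the empty position right after an
// extension, so the split removes no characters from the pieces.
// PREG_SPLIT_NO_EMPTY drops the empty piece produced at the end of the string.
$domains = preg_split(
    '/(?<=\.org|\.com|\.net|\.info)/',
    strtolower($string),
    -1,
    PREG_SPLIT_NO_EMPTY
);

print_r($domains);
```

This should give domain1.net, domain2.org, domain3.info and domain4.com as four separate elements. It shares the limits of the original pattern: any extension outside the list is not treated as a boundary.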
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Re: extract domains in string

Post by John Cartwright »

You might have a bit of fun supporting subdomains, at least without some kind of delimiter (why you're forced into parsing such an inflexible format is beyond me).

preg_match_all("~[a-z0-9]+\.(org|com|net|info)~i", strtolower($string), $matches);
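
To show how the result can be read back out, here is a minimal, self-contained sketch using the sample string from the question. It swaps the capturing group for a non-capturing one, which is my own variation and not part of the line above, and the variable names are illustrative.

```php
<?php
// Sample input copied from the question.
$string = 'domain1.netdomain2.orgdomain3.infodomain4.com';

// (?:...) is a non-capturing group, so only the full matches are collected.
// The i flag already makes the match case-insensitive.
preg_match_all('~[a-z0-9]+\.(?:org|com|net|info)~i', $string, $matches);

// $matches[0] holds the full-pattern matches in order of appearance;
// lowercase them afterwards so the result is normalised.
$domains = array_map('strtolower', $matches[0]);

print_r($domains);
```

With the capturing group kept as posted, $matches[0] still holds the full domains and $matches[1] holds just the extensions. Hyphenated names or subdomains would need a wider character class, which this sketch does not attempt.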
yacahuma

Re: extract domains in string

Post by yacahuma »

Works great, thank you. I was doing some page scraping, and that's what I got from XPath.