PHP Developers Network
http://forums.devnetwork.net/

Regex To Extract 2nd Level Domains From All TLDs ?
http://forums.devnetwork.net/viewtopic.php?f=38&t=143675
Page 1 of 1

Author:  UniqueIdeaMan [ Fri May 26, 2017 6:56 pm ]
Post subject:  Regex To Extract 2nd Level Domains From All TLDs ?

Good Day Folks!

1. Is the following regex OK for extracting top-level domains and 2nd-level domains?
[^.]*\.[^.]{2,3}(?:\.[^.]{2,3})?$

2. How do I write PHP code that uses that regex?
Any sample code is welcome.
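
For reference, here is a minimal sketch of how that regex could be run from PHP. It assumes the pattern is applied to the host name only (obtained with parse_url()), because on a full URL `[^.]` also matches characters such as `/` and `:`. The helper name matchDomain() is hypothetical and used only for illustration. The sample calls show where the pattern works and where it does not.

```php
<?php
// The regex from the question, wrapped in delimiters for preg_match().
$pattern = '/[^.]*\.[^.]{2,3}(?:\.[^.]{2,3})?$/';

// Hypothetical helper: returns the matched text, or null when nothing matches.
function matchDomain(string $pattern, string $url): ?string
{
    // Only test the host part; this assumes the URL carries a scheme.
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    return preg_match($pattern, $host, $m) === 1 ? $m[0] : null;
}

var_dump(matchDomain($pattern, 'http://subdomain.domain.com/dir')); // "domain.com"
var_dump(matchDomain($pattern, 'http://domain.co.uk'));             // "domain.co.uk"
var_dump(matchDomain($pattern, 'http://www.foo.com'));              // "www.foo.com" (3-letter name read as part of the TLD)
var_dump(matchDomain($pattern, 'http://example.info'));             // NULL (TLD longer than 3 characters)
```

So the pattern handles the common cases but misfires on short domain names and on TLDs longer than three characters.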

Author:  UniqueIdeaMan [ Sat May 27, 2017 9:25 am ]
Post subject:  Re: Regex To Extract 2nd Level Domains From All TLDs ?

Guys,

I'm a complete beginner with regex, so any tutorial suggestions suitable for complete beginners are welcome too!

Anyway, as you know, different webpages have different internal and external links all over their pages. No matter what a link looks like, the domain should be extracted. Imagine I'm running a web crawler: it would encounter an unlimited variety of links, some with just a domain, some with a subdomain, and so on.
E.g.:

http://domain.com
http://subdomain.domain.com

www.domain.com
http://www.domain.com

domain.com/dir
subdomain.domain.com/dir

domain.com/dir/sub-dir
subdomain.domain.com/dir/sub-dir

Note: No matter how many subdomains or domain levels (3rd level, 4th level, etc.) or how many dirs and sub-dirs a link contains, the 2nd-level domain should be extracted along with its TLD.
For every link in the examples above, the script should extract "domain.com".
I need an example of the PHP code too, alongside the regex.
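
One possible approach, as a rough sketch rather than a definitive answer: let parse_url() isolate the host (adding a scheme first when the link has none, since parse_url() only reports a host when one is present), then keep the last two dot-separated labels. The function name extractDomain() is hypothetical. This assumes a single-label TLD such as .com; multi-part suffixes like .co.uk would need a lookup against something like the Public Suffix List, and IP-address hosts are not handled.

```php
<?php
// Hypothetical helper: returns the last two labels of the host, or null.
function extractDomain(string $link): ?string
{
    $link = trim($link);
    if ($link === '') {
        return null;
    }
    // Links such as "www.domain.com" or "domain.com/dir" have no scheme,
    // so prepend one to make parse_url() treat the first part as the host.
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $link)) {
        $link = 'http://' . $link;
    }
    $host = parse_url($link, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }
    $host = strtolower(rtrim($host, '.'));
    // Keep only the final "label.label" pair, e.g. "domain.com".
    return preg_match('/[^.]+\.[^.]+$/', $host, $m) === 1 ? $m[0] : null;
}

$links = [
    'http://domain.com',
    'http://subdomain.domain.com',
    'www.domain.com',
    'domain.com/dir',
    'subdomain.domain.com/dir/sub-dir',
];
foreach ($links as $link) {
    echo extractDomain($link), "\n"; // prints "domain.com" for each link
}
```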

All times are UTC - 5 hours