regular expression
Posted: Thu Feb 06, 2003 2:24 pm
by kendall
hey ppl,
'^[a-z0-9\-]+\.(com)|(net)|(org)|(biz)|(info)$'
The above is an expression that validates the format of a domain name, so 'anydomain.com' (or .biz etc.) is correct.
Now the thing is, when I checked
http://www.apex-solu.com and kendall.co.tt, it debugged OK, but
http://www.kendall.biz did not.
The theory behind the expression is:
^[a-z0-9\-]+ \\ at the beginning of the domain name, characters from a-z, 0-9 and '-', repeated one or more times
\. \\ then a dot
(com)|(net)|(org)|(biz)|(info)$ \\ any one of the extensions at the end
Is my theory correct?
If not,
can you advise me accordingly?
Kendall
ok
Posted: Thu Feb 06, 2003 3:33 pm
by AVATAr
that's ok!

Posted: Thu Feb 06, 2003 4:33 pm
by lazy_yogi
even these don't work
http://www.apex-solu.com, kendall.co.tt
I have no idea how they worked for you
What your regexp does is check for one or more of these characters: [a-z0-9\-]
and then a dot, and then any of these: (com)|(net)|(org)|(biz)|(info)
and then the end of the line.
this would be ok for
somedomain.com
but not for
http://www.somedomain.com
and not for
http://www.forums.somedomain.com
because they have extra full stops.
You'd need to remove the beginning-of-line anchor (^), because subdomains can add any number of full stops, e.g.:
http://www.forums.domainname.co.uk
And you'd also need to remove the com/net/org/biz/info list,
since the domain could come from any one of the hundred-plus countries,
eg australia :
http://www.stuff.com.au
uk :
http://www.stuff.co.uk
This would work, but it is very loose in what it checks, so it doesn't really validate effectively. Pointless, I think:
if (preg_match('/[a-z0-9\-_]+\.(com|net|org|biz|info)$/', $dom)) {
    print "yes";
} else {
    print "no";
}
theory
Posted: Thu Feb 06, 2003 6:04 pm
by AVATAr
Haha, I was answering your question:
The theory behind the expression is:
^[a-z0-9\-]+ \\ at the beginning of the domain name, characters from a-z, 0-9 and '-', repeated one or more times
\. \\ then a dot
(com)|(net)|(org)|(biz)|(info)$ \\ any one of the extensions at the end
Is my theory correct?
Think about how to add www. to the end... the ^ represents "it starts with"...

Posted: Thu Feb 06, 2003 10:05 pm
by Stoker
I think you want something like
^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$
instead, perhaps? And use strtolower on your string before comparing (that is more efficient than asking the Perl regex engine to do a case-insensitive search).
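A rough sketch of how that suggestion might be wired up in PHP. The function name is_valid_domain is made up for illustration and is not from the thread. The pattern is the one above, wrapped in / delimiters for preg_match, and the input is lowercased first as suggested.

```php
<?php
// Hypothetical helper; the name is an assumption for illustration only.
function is_valid_domain(string $domain): bool
{
    // Lowercase first so the pattern only needs to list a-z.
    $domain = strtolower($domain);

    // Start of string, one letter or digit, one or more letters/digits/dashes,
    // a literal dot, one of the listed TLDs, end of string.
    $pattern = '/^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$/';

    // preg_match returns 1 on a match, 0 on no match.
    return preg_match($pattern, $domain) === 1;
}

var_dump(is_valid_domain('Apex-Solutions.COM'));     // bool(true)
var_dump(is_valid_domain('www.apex-solutions.com')); // bool(false)
```

The second call fails because the part before the TLD may not contain a dot, so a www. prefix is rejected without any extra rule.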
well
Posted: Fri Feb 07, 2003 6:34 am
by AVATAr
Here is the solution for
http://www.something.com
'^www\.[a-z0-9]+\.(com|net|org|biz|info)$'
Let's explain this a bit:
^www\. -> the string starts with www. (you have to escape the "."), then
[a-z0-9]+ -> any alphabetical character or number, 1 or more times, then
(com|net|org|biz|info)$ -> com or net or org or biz or info at the end.

regular expression
Posted: Fri Feb 07, 2003 7:12 am
by kendall
Uh,
you guys,
I think there's a misconception here.
Firstly, I only want to search for the TLD extensions that I listed.
Secondly, I don't want them to put the www. in front.
Thus
http://www.apex-solutions.com is supposed to be wrong,
but apex-solutions.com would be right.
I think I'll try putting ^[^www.] in front, which means (correct me if I'm wrong here) not starting with www.
OK?
Kendall
yep
Posted: Fri Feb 07, 2003 7:38 am
by AVATAr
You're right... use the ^[^www.]
but be aware of the use of the ".", because you may have to escape it with "\."
good luck
Posted: Fri Feb 07, 2003 7:56 am
by Stoker
as I posted earlier,
^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$
should work just fine, that will do this:
-start of string
-1 character a-z 0-9
-1 or more character a-z 0-9 or dash
-literal dot
-com or net or org or biz or info
-end of string
Also, [ ] is a character class, and ^ as the first character inside it means "not". So with [^www.] you are telling it to reject a string whose first character is w, or w, or w, or a literal dot...
Your problem to begin with was the wrong use of parentheses; it should be (a|b|c). I also added the requirement that there be at least 2 characters before the dot, and that the first character may not be a dash.
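To make the parentheses point concrete, here is a small sketch comparing the original pattern with the grouped one. The variable names and test strings are picked for illustration. In the ungrouped form, | splits the whole expression, so ^ only applies to the first alternative and $ only to the last.

```php
<?php
// Original pattern: reads as
//   ^[a-z0-9\-]+\.(com)   OR   (net)   OR   (org)   OR   (biz)   OR   (info)$
// so "net", "org" or "biz" anywhere in the string is enough to match.
$ungrouped = '/^[a-z0-9\-]+\.(com)|(net)|(org)|(biz)|(info)$/';

// Grouped pattern: the alternatives are confined to the parentheses,
// so ^ and $ anchor the whole thing.
$grouped = '/^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$/';

foreach (['kendall.biz', 'www.kendall.biz', 'www.apex-solu.com'] as $dom) {
    printf(
        "%-18s ungrouped=%d grouped=%d\n",
        $dom,
        preg_match($ungrouped, $dom),
        preg_match($grouped, $dom)
    );
}
```

With these inputs, www.kendall.biz matches the ungrouped pattern (it contains "biz") but not the grouped one, while www.apex-solu.com matches neither.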
yep
Posted: Fri Feb 07, 2003 8:04 am
by AVATAr
You're right. If you want to use www, you have to use [w]{3}
Oops
Regular Expressions
Posted: Fri Feb 07, 2003 8:56 am
by kendall
Ah,
Guys
Well, I got this to work
^[^(www)\.][a-z0-9\-]+\.(com|net|org|biz|info)$
in not accepting www,
but now it's not even accepting wwf.com, lol.
Even [w]{3} didn't work. If I wanted a strict www on it, what should I use?
Kendall
P.S. I think yours is wrong there, as it doesn't relate to what I'm trying to do.

Posted: Fri Feb 07, 2003 9:02 am
by AVATAr
Use Stoker's solution!
ereg('^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$','web.com' );
Posted: Fri Feb 07, 2003 9:07 am
by Stoker
did you try the one that I posted twice now?
As I tried to say, [] is what is called a character class, so what you made there does not make much sense:
[^(www)\.]
NOT ( nor w nor w nor w nor ) nor dot
And by the way, inside character classes very few characters need escaping: ] and - (plus the backslash itself, and ^ if it comes first). So ( ) and . are plain literals there.
edit/add: Hadn't seen AVATAr's post before I posted. I just want to add that you should never use ereg unless there is a very specific reason for it; use preg instead, it is a lot more efficient.
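A quick sketch of why the character class version also throws out wwf.com. The pattern is the one kendall posted, wrapped in / delimiters for preg_match; the test strings are just examples.

```php
<?php
// [^(www)\.] matches exactly ONE character that is not "(", "w", ")" or ".".
// It does not mean "not the string www.", so every name that starts
// with w is rejected.
$pattern = '/^[^(www)\.][a-z0-9\-]+\.(com|net|org|biz|info)$/';

var_dump(preg_match($pattern, 'wwf.com')); // int(0): first character is w
var_dump(preg_match($pattern, 'web.com')); // int(0): first character is w
var_dump(preg_match($pattern, 'abc.com')); // int(1)
```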
Regular Expression
Posted: Fri Feb 07, 2003 9:18 am
by kendall
OHHHH
Forgive me, Stoker. I was blind to what you were really trying to say, as I thought you were trying to give me the expression that includes the www. But then I even blinded myself,
as ^[a-z0-9\-]+\.(com|net|org|biz|info)$ works as well as
^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$
which is what I originally had.
Stoker, I really do apologise.

quote
Posted: Fri Feb 07, 2003 9:31 am
by AVATAr
^[a-z0-9\-]+\.(com|net|org|biz|info)$
will recognize "-web.com". With Stoker's solution the first character has to be a letter or a number...
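The difference can be checked with a short sketch like this one. The variable names $loose and $strict are chosen here for illustration; the two patterns are the ones compared above.

```php
<?php
// Without the leading [a-z0-9], a dash may come first and a single
// character before the dot is enough.
$loose = '/^[a-z0-9\-]+\.(com|net|org|biz|info)$/';

// Stoker's version: first character must be a letter or digit, followed by
// at least one more letter, digit or dash.
$strict = '/^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$/';

foreach (['web.com', '-web.com', 'a.com'] as $dom) {
    printf(
        "%-9s loose=%d strict=%d\n",
        $dom,
        preg_match($loose, $dom),
        preg_match($strict, $dom)
    );
}
```

Both patterns accept web.com; only the loose one accepts -web.com and the one-letter a.com.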