regular expression

kendall
Forum Regular
Posts: 852
Joined: Tue Jul 30, 2002 10:21 am
Location: Trinidad, West Indies

regular expression

Post by kendall »

hey ppl,

'^[a-z0-9\-]+\.(com)|(net)|(org)|(biz)|(info)$'

The above is an expression that validates the format of a domain name, so 'anydomain.com', 'anydomain.biz' etc. should be accepted.

Now the thing is, when I checked http://www.apex-solu.com and kendall.co.tt they debugged OK, but http://www.kendall.biz did not.

the theory behind the expression is

^[a-z0-9\-]+ \\ at the begin of the domain name from a-z0-9 and '-' repeated many times
\. \\ then a dot
(com)|(net)|(org)|(biz)|(info)$ \\ any 1 of the ext at the end

Is my theory correct?

If not, can you advise me accordingly?

Kendall
AVATAr
Forum Regular
Posts: 524
Joined: Tue Jul 16, 2002 4:19 pm
Location: Uruguay -- Montevideo

ok

Post by AVATAr »

that's ok! :wink:
lazy_yogi
Forum Contributor
Posts: 243
Joined: Fri Jan 24, 2003 3:27 am

Post by lazy_yogi »

Even these don't work:
http://www.apex-solu.com, kendall.co.tt
I have no idea how they worked for you.

What your regexp is meant to do is check for one or more of these chars: [a-z0-9\-]
then a dot, then any of these: (com)|(net)|(org)|(biz)|(info)
and then the end of the line.

That would be OK for
somedomain.com
but not for
http://www.somedomain.com
and not for
http://www.forums.somedomain.com
because they have extra full stops.


You'd need to remove the beginning-of-line character (^), because subdomains can have any number of full stops, e.g.:
http://www.forums.domainname.co.uk
and you'd also need to remove the com/net/org/biz/info list,
since a domain could come from any one of the hundred-plus countries,
e.g. Australia: http://www.stuff.com.au
UK: http://www.stuff.co.uk

This would work, but it is very loose in what it checks (pointless, I think),
so it really doesn't check effectively anyway:

// TLDs grouped as (a|b|c) so that the dot and the $ apply to every alternative
if (preg_match('/[a-z0-9\-_]+\.(com|net|org|biz|info)$/', $dom)) {
    print "yes";
} else {
    print "no";
}
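
To see how loose an unanchored check of this kind is, here is a small sketch. The function name looksLikeDomainLoose and the sample inputs are made up for illustration, and the TLD alternatives are grouped as (com|net|org|biz|info) so that the $ applies to all of them.

```php
<?php
// Hypothetical helper: there is no ^ anchor, so only the tail of the string is checked.
function looksLikeDomainLoose(string $dom): bool
{
    return preg_match('/[a-z0-9\-_]+\.(com|net|org|biz|info)$/', $dom) === 1;
}

var_dump(looksLikeDomainLoose('www.forums.somedomain.com')); // bool(true): subdomains pass
var_dump(looksLikeDomainLoose('not a domain!! x.com'));      // bool(true): junk before the last label passes too
var_dump(looksLikeDomainLoose('www.stuff.co.uk'));           // bool(false): .uk is not in the TLD list
```

Anything may precede the last label, which is why the check says very little about the string as a whole.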

theory

Post by AVATAr »

Haha, I was answering your question:
the theory behind the expression is

^[a-z0-9\-]+ \\ at the begin of the domain name from a-z0-9 and '-' repeated many times
\. \\ then a dot
(com)|(net)|(org)|(biz)|(info)$ \\ any 1 of the ext at the end

is my theory correct?
Think about how to add www. to the front... the ^ means "it starts with"...
:idea:
Stoker
Forum Regular
Posts: 782
Joined: Thu Jan 23, 2003 9:45 pm
Location: SWNY

Post by Stoker »

I think you want something like
^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$
instead, perhaps? And use strtolower on your string before comparing (that is more efficient than asking the Perl-compatible regex engine to do a case-insensitive search).
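
A minimal sketch of that suggestion, assuming preg_match is used and the input is lower-cased first. The helper name isAllowedDomain and the sample inputs are illustrative only; the pattern is the one from the post.

```php
<?php
// Hypothetical helper name; the pattern itself is the one suggested in the post.
function isAllowedDomain(string $dom): bool
{
    $dom = strtolower($dom); // lower-case first instead of using the /i modifier
    return preg_match('/^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$/', $dom) === 1;
}

foreach (['Apex-Solutions.com', 'kendall.biz', 'www.kendall.biz', 'kendall.co.tt'] as $dom) {
    echo $dom, ' => ', isAllowedDomain($dom) ? 'ok' : 'rejected', "\n";
}
```

The first two inputs are accepted. The last two are rejected, because the character class before the dot does not allow a second dot and .tt is not in the list.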

well

Post by AVATAr »

Here is the solution for http://www.something.com

'^www\.[a-z0-9]+\.(com|net|org|biz|info)$'

Let's explain this a bit:

^www\. -> the string starts with www. (you have to escape the "."), then
[a-z0-9]+ -> any alphabetical character or number, 1 or more times, then
(com|net|org|biz|info)$ -> com or net or org or biz or info at the end.

:P
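
As a quick check of what that www. pattern accepts, here is a sketch with preg_match (so / delimiters are added). The helper name hasWwwPrefix and the test strings are assumptions for illustration.

```php
<?php
// Hypothetical helper wrapping the pattern from the post in / delimiters.
function hasWwwPrefix(string $dom): bool
{
    return preg_match('/^www\.[a-z0-9]+\.(com|net|org|biz|info)$/', $dom) === 1;
}

var_dump(hasWwwPrefix('www.something.com')); // bool(true)
var_dump(hasWwwPrefix('something.com'));     // bool(false): no leading www.
var_dump(hasWwwPrefix('www.apex-solu.com')); // bool(false): "-" is not in [a-z0-9]
```

Note that hyphenated names fail here because the middle class is [a-z0-9] rather than [a-z0-9\-].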

regular expression

Post by kendall »

Uh, you guys,

I think there's a misconception here.

Firstly, I only want to match the TLDs that I listed.
Secondly, I don't want them to put the www. in front,
thus
http://www.apex-solutions.com is supposed to be wrong
but apex-solutions.com would be right.

I think I'll try putting ^[^www.] in front, which means (correct me if I'm wrong here) not starting with www.

OK?

Kendall

yep

Post by AVATAr »

You're right... use the ^[^www.]

but be careful with the ".", because you may have to escape it with "\."

good luck

Post by Stoker »

As I posted earlier,
^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$
should work just fine. It will do this:

-start of string
-1 character a-z or 0-9
-1 or more characters a-z, 0-9 or dash
-literal dot
-com or net or org or biz or info
-end of string


Also, [ ] is a character class, and ^ at the start of one means "not", so by using [^www.] you are telling it to reject a string whose first character is w, or w, or w, or a literal dot...
Your problem to begin with was the wrong use of parentheses: the alternatives should be grouped as (a|b|c). I also added the requirement that there must be at least 2 characters before the dot, and that the first character may not be a dash.
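
The two points above (grouping and character classes) can be sketched as follows, assuming preg_match with / delimiters. The variable names are made up for illustration.

```php
<?php
// Original pattern: each | splits the WHOLE expression into separate alternatives,
// so "(biz)" on its own can match anywhere in the string.
$ungrouped = '/^[a-z0-9\-]+\.(com)|(net)|(org)|(biz)|(info)$/';
// Grouped pattern: the anchors and the dot apply to every TLD.
$grouped   = '/^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$/';

var_dump(preg_match($ungrouped, 'www.kendall.biz')); // int(1): "biz" matches by itself
var_dump(preg_match($grouped,   'www.kendall.biz')); // int(0)
var_dump(preg_match($grouped,   'kendall.biz'));     // int(1)

// [^www.] is one character class: "a single character that is not w and not a dot".
var_dump(preg_match('/^[^www.]/', 'wwf.com'));  // int(0): first character is w
var_dump(preg_match('/^[^www.]/', 'apex.com')); // int(1)
```

This would explain why www.kendall.biz slipped through the original expression while the other test strings did not.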

yep

Post by AVATAr »

You're right. If you want to match www you have to use [w]{3}

Oops

Regular Expressions

Post by kendall »

Ah, guys,

well, I got this to work

^[^(www)\.][a-z0-9\-]+\.(com|net|org|biz|info)$

in not accepting www,

but now it's not even accepting wwf.com lol :lol:

Even [w]{3} didn't work. If I wanted to require a strict www on it, what should I use?

Kendall


P.S. I think yours is wrong there, as it doesn't relate to what I'm trying to do :wink:
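
For comparison, here is a sketch that runs three patterns over the same inputs: the [^(www)\.] attempt, Stoker's pattern, and a variant with an explicit negative lookahead, (?!www\.). The lookahead variant is an assumption added here for illustration and was not suggested in the thread; the variable names are also made up.

```php
<?php
$attempt  = '/^[^(www)\.][a-z0-9\-]+\.(com|net|org|biz|info)$/';
$stoker   = '/^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$/';
// Assumption: (?!www\.) asserts that the string does not start with the literal text "www.".
$explicit = '/^(?!www\.)[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$/';

foreach (['wwf.com', 'www.apex-solutions.com', 'apex-solutions.com'] as $dom) {
    printf(
        "%-24s attempt=%d stoker=%d explicit=%d\n",
        $dom,
        preg_match($attempt, $dom),
        preg_match($stoker, $dom),
        preg_match($explicit, $dom)
    );
}
```

The attempt rejects wwf.com because its first character is a w. Stoker's pattern accepts wwf.com and still rejects www.apex-solutions.com, since no dot is allowed before the TLD, so the lookahead only makes the "no www." rule explicit.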

Post by AVATAr »

Use Stoker's solution!

ereg('^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$','web.com' );

Post by Stoker »

Did you try the one that I have posted twice now?

As I tried to say, [] is what is called a character class, so what you made there does not make much sense:

[^(www)\.]
NOT ( nor w nor w nor w nor ) nor dot
And btw, inside character classes the only characters that need escaping are ], \, ^ (when it comes first) and -

Edit/add: Hadn't seen AVATAr's post before I posted. I just want to add that you should never use ereg unless there is a very specific reason for it; use preg instead, it is a lot more efficient.
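
For anyone reading this on a current PHP version: ereg() was deprecated in PHP 5.3 and removed in PHP 7.0, so the ereg call above has to be moved to preg_match. A minimal sketch of the conversion, where the only change to the pattern is the added / delimiters:

```php
<?php
// ereg('^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$', 'web.com') rewritten for preg:
// the pattern gains / delimiters, everything else stays the same.
$result = preg_match('/^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$/', 'web.com');

// preg_match returns 1 on a match, 0 on no match, and false on error.
var_dump($result === 1); // bool(true)
```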

Regular Expression

Post by kendall »

OHHHH :oops:

Forgive me, Stoker. I was blind to what you were really trying to say, as I thought you were trying to give me the expression that includes the www., but then I had even blinded myself,

as ^[a-z0-9\-]+\.(com|net|org|biz|info)$ works as well as

^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$

which is what I originally had.

Stoker, I really do apologise :lol:

quote

Post by AVATAr »

^[a-z0-9\-]+\.(com|net|org|biz|info)$

will accept "-web.com"; with Stoker's solution the first character has to be a letter or a number...
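
The difference between the two expressions can be checked with a short sketch (preg_match with / delimiters; the variable names and the a.com input are illustrative):

```php
<?php
$withoutFirstChar = '/^[a-z0-9\-]+\.(com|net|org|biz|info)$/';
$withFirstChar    = '/^[a-z0-9][a-z0-9\-]+\.(com|net|org|biz|info)$/';

var_dump(preg_match($withoutFirstChar, '-web.com')); // int(1): leading dash accepted
var_dump(preg_match($withFirstChar,    '-web.com')); // int(0): first char must be a-z or 0-9
var_dump(preg_match($withoutFirstChar, 'a.com'));    // int(1)
var_dump(preg_match($withFirstChar,    'a.com'));    // int(0): needs at least 2 chars before the dot
```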