Reg Expression Challenge - non-greedy type expression
Posted: Fri Feb 27, 2009 3:49 am
by hotflation
I am ready to bang my head against the wall trying to figure out a solution for my needs. Any help from a regex expert would be greatly appreciated.
What I'm trying to do is match a specific piece of content inside some tags. In the example below, I want to match "content3", the value between <href=" and ">STUFF_3.
#
<href="content1">STUFF_1<href="content2">STUFF_2<href="content3">STUFF_3
#
I cannot for the life of me figure out a way to structure a regex that'll capture that. Anything I do goes back to the longest match and non-greedy doesn't work.
Any help please?
Re: Reg Expression Challenge - non-greedy type expression
Posted: Fri Feb 27, 2009 3:57 am
by prometheuzz
I get the impression you are over-simplifying your problem in your post. Could you give a bit more detail about what exactly it is you're trying to match? It helps if you give (a part of) the actual input you're working with and clearly indicate what it is you're trying to find.
You might also want to post what you yourself have tried so far; I (or someone else) may be able to point out the error in your logic.
Re: Reg Expression Challenge - non-greedy type expression
Posted: Fri Feb 27, 2009 4:09 am
by hotflation
So what I'm matching is an encrypted string...
ex: 2USvF8CPsICBAZOTKTkvNT4tMn4J%2Fq%2BTsrUYm8cvr1%2FlaJiYl5%2BbmKkXloTf45m2jDSwZ3wNezQN7BoLJmr71YbY&oq=06oENya4ZGJbLUXW6oAQdBSLMEu2jGhZKLEO1eGqlLHuzkerb_nbSf1ybhBi1rrwx-8h0z7qng1tcFzvFmdPyrARy9tl51ZE49Lh-ItDFK230DtUl1E_so0_fPH7B7PKtkEAwKOzPzolCjM5WqTB6HLDUm2aIp7sS8s__esUaQ,YT0z
This content is variable and bound to change.
What I've tried so far:
/((href=\")(.*))(\">$keyword<\/a>)/
where $keyword is STUFF_1, STUFF_2, etc...
I've tried non-greedy (.+?) and a bunch of other random things, but nothing has worked. The expression above matches the whole string
<href="content1">STUFF_1<href="content2">STUFF_2<href="content3">STUFF_3
if $keyword is STUFF_3. It'll match <href="content1">STUFF_1<href="content2">STUFF_2 if $keyword is STUFF_2, etc. I need to match only the closest content.
i hope that helps.
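Why non-greedy does not help here: a regex engine always prefers the leftmost possible starting point, and laziness only affects how far the match extends to the right. The sketch below is a Python port for illustration (the thread's own code is not Python), using the simplified sample line without a closing </a>. It compares the greedy, lazy, and negated-character-class variants.

```python
import re

# The simplified sample line from the post above.
text = '<href="content1">STUFF_1<href="content2">STUFF_2<href="content3">STUFF_3'
keyword = "STUFF_3"
tail = '">' + re.escape(keyword)

# Greedy: starts at the first href=" and backtracks from the end of the line.
greedy = re.search(r'href="(.*)' + tail, text)

# Lazy: still starts at the FIRST href=" and simply grows until the tail fits,
# so it ends up capturing exactly the same span.
lazy = re.search(r'href="(.+?)' + tail, text)

# Negated class: the capture can never cross a double quote, so the first two
# href=" positions fail and only the third one can match.
negated = re.search(r'href="([^"]+)' + tail, text)

print(greedy.group(1))
print(lazy.group(1))
print(negated.group(1))
```

Both the greedy and the lazy version capture everything from content1 through content3, while the negated class captures only content3.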
Re: Reg Expression Challenge - non-greedy type expression
Posted: Fri Feb 27, 2009 4:14 am
by hotflation
Here's a sample text:
<tr><td><h3><a id="keyword" href="
http://hotflation/index.php?Query=2USvF ... YT0z">Demo Girl 1</a></h3></td></tr><tr><td><h3><a id="keyword" href="
http://hotflation/index.php?Query=2USvF ... YT0z">Demo Dude 2</a></h3>
I want to capture the href content (the query URL) of the 2nd entry, "Demo Dude 2", into a variable.
Re: Reg Expression Challenge - non-greedy type expression
Posted: Fri Feb 27, 2009 4:14 am
by prometheuzz
Try this:
"#href=\"([^\"]+)\"\s*>\s*$keyword\s*</a>#i"
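For readers not working in the thread's language, here is a rough Python port of the same idea. The helper name find_href, the example.com URLs, and the Query values AAA/BBB are hypothetical stand-ins, since the real URLs in the sample are elided. Two additions are not in the original pattern: re.escape guards against keywords that contain regex metacharacters, and strip() drops the line break that follows href=" in the sample.

```python
import re

def find_href(html, keyword):
    """Return the href value of the link whose text is `keyword`, or None."""
    pattern = r'href="([^"]+)"\s*>\s*' + re.escape(keyword) + r'\s*</a>'
    match = re.search(pattern, html, re.IGNORECASE)  # mirrors the 'i' flag
    return match.group(1).strip() if match else None

# Hypothetical sample shaped like the one posted above (URLs are placeholders).
sample = (
    '<tr><td><h3><a id="keyword" href="\n'
    'http://example.com/index.php?Query=AAA">Demo Girl 1</a></h3></td></tr>'
    '<tr><td><h3><a id="keyword" href="\n'
    'http://example.com/index.php?Query=BBB">Demo Dude 2</a></h3>'
)

print(find_href(sample, "Demo Dude 2"))
```

Because [^"]+ cannot run past the closing quote of an href value, the first link is rejected as soon as its text fails to match the keyword, and the search moves on to the next href.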
Re: Reg Expression Challenge - non-greedy type expression
Posted: Fri Feb 27, 2009 1:50 pm
by hotflation
Thanks!! you are the man! it works perfectly...could you enlighten me and explain the expression?
I appreciate the help very much.
Re: Reg Expression Challenge - non-greedy type expression
Posted: Fri Feb 27, 2009 2:41 pm
by prometheuzz
No problem.
href=      // match 'href='
\"         // match a double quote
(          // start group 1
  [^\"]+   //   match one or more characters other than a double quote
)          // end group 1
\"         // match a double quote
\s*        // match zero or more white space characters (also new line chars!)
>          // match '>'
\s*        // match zero or more white space characters
$keyword   // match the contents of your keyword
\s*        // match zero or more white space characters
</a>       // match '</a>'
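The same breakdown can be kept inside the pattern itself with a verbose-mode regex. This is a sketch in Python rather than the thread's language, and build_pattern is a hypothetical helper name. Verbose mode ignores unescaped whitespace in the pattern, so the keyword goes through re.escape, which also escapes spaces and '#'.

```python
import re

def build_pattern(keyword):
    """Compile the commented pattern for one keyword (illustrative helper)."""
    return re.compile(
        r'''
        href=        # match 'href='
        "            # match a double quote
        ( [^"]+ )    # group 1: one or more characters other than a double quote
        "            # match a double quote
        \s* > \s*    # match '>' with optional white space on either side
        ''' + re.escape(keyword) + r'''
        \s*          # match zero or more white space characters
        </a>         # match the closing tag
        ''',
        re.VERBOSE | re.IGNORECASE,
    )

# Made-up input with two links; only the second one has the wanted text.
html = '<a href="x.php?q=1">Demo Girl 1</a><a href="x.php?q=2"> Demo Dude 2 </a>'
match = build_pattern("Demo Dude 2").search(html)
print(match.group(1) if match else None)
```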