
Regex to deny sentence

Posted: Thu Sep 06, 2012 10:12 am
by magaupe
Hi guys,

I need to find any URL (in red) inside this sentence:

<link rel="image_src" href="any URL"

But under no circumstances should it match this one:

<link rel="image_src" href="http://s1.trrsf.com.br/atm/3/core/_img/terra-logo-white-bg-v2.jpg"

I'm using this regex but it finds everything:

<link rel="image_src" href="([^\"]+)"

Any advice?

Re: Regex to deny sentence

Posted: Thu Sep 06, 2012 12:23 pm
by requinix
So... if the href you matched was that one you don't want, skip it.

Re: Regex to deny sentence

Posted: Thu Sep 06, 2012 12:32 pm
by magaupe
requinix wrote:So... if the href you matched was that one you don't want, skip it.
exactly.

Re: Regex to deny sentence

Posted: Thu Sep 06, 2012 12:41 pm
by requinix
So what's the question? Or what code do you have?

Re: Regex to deny sentence

Posted: Thu Sep 06, 2012 12:47 pm
by magaupe
requinix wrote:So what's the question? Or what code do you have?
I have this code, but it matches any href without distinction.
<link rel="image_src" href="([^\"]+)"

It should not match the URL: http://s1.trrsf.com.br/atm/3/core/_img/terra-logo-white-bg-v2.jpg.

Re: Regex to deny sentence

Posted: Thu Sep 06, 2012 2:15 pm
by requinix
And I'm telling you the easiest option: let it match whatever it wants to match and make the rest of the code skip the match if you don't want it.

Code:

for each $href in all the hrefs the regex matched {
    if $href is something you don't want to include {
        continue looking at the next href
    } otherwise {
        do whatever
    }
}
There may be a perfectly legitimate reason why that won't work for your circumstance but I haven't heard it yet.
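
For illustration only, here is a minimal Python sketch of that match-then-skip approach. The thread never says which language the application uses, so Python is an assumption, and `find_image_urls`, `UNWANTED` and the sample HTML are hypothetical names made up for the example.

```python
import re

# Same pattern as in the first post; the backslash before the quote
# inside the character class is not needed, so it is dropped here.
LINK_RE = re.compile(r'<link rel="image_src" href="([^"]+)"')

# The one URL that must be skipped (taken from the first post).
UNWANTED = "http://s1.trrsf.com.br/atm/3/core/_img/terra-logo-white-bg-v2.jpg"


def find_image_urls(html):
    """Return every image_src href found in html, except the unwanted one."""
    return [href for href in LINK_RE.findall(html) if href != UNWANTED]


# Hypothetical sample input: one wanted link and the unwanted logo.
sample = (
    '<link rel="image_src" href="http://example.com/photo.jpg" />\n'
    '<link rel="image_src" href="' + UNWANTED + '" />\n'
)
print(find_image_urls(sample))
```

The regex stays simple and the exclusion rule lives in ordinary code, so adding a second unwanted URL later only means changing the comparison.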

Re: Regex to deny sentence

Posted: Thu Sep 06, 2012 2:17 pm
by magaupe
requinix wrote:And I'm telling you the easiest option: let it match whatever it wants to match and make the rest of the code skip the match if you don't want it.

Code:

for each $href in all the hrefs the regex matched {
    if $href is something you don't want to include {
        continue looking at the next href
    } otherwise {
        do whatever
    }
}
There may be a perfectly legitimate reason why that won't work for your circumstance but I haven't heard it yet.
I'd like to do this in a single regex line. Not using any kind of iteration.
Is that possible?

Re: Regex to deny sentence

Posted: Thu Sep 06, 2012 3:03 pm
by requinix
Yeah, but aren't you going to need iteration somewhere? What are you doing with these hrefs?

Code:

<link rel="image_src" href="(?!url you don't want)([^\"]+)"
Remember to escape characters like . and /.
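
As a hedged sketch of how that could look once the URL is escaped, here is the same idea in Python. Python is an assumption (its `re` module supports negative lookahead; other engines may not), and `first_image_url` and `UNWANTED` are hypothetical names. The closing quote inside the lookahead is an addition to the pattern above: it makes the lookahead reject only the exact URL rather than every URL that starts with it.

```python
import re

# The URL to exclude (taken from the first post).
UNWANTED = "http://s1.trrsf.com.br/atm/3/core/_img/terra-logo-white-bg-v2.jpg"

# re.escape() backslash-escapes the dots and any other special characters,
# so the URL is matched literally inside the negative lookahead (?!...).
PATTERN = re.compile(
    r'<link rel="image_src" href="(?!' + re.escape(UNWANTED) + r'")([^"]+)"'
)


def first_image_url(html):
    """Return the first image_src href that is not the unwanted URL, or None."""
    match = PATTERN.search(html)
    return match.group(1) if match else None


# Hypothetical sample input: the unwanted logo first, then a wanted image.
sample = (
    '<link rel="image_src" href="' + UNWANTED + '" />\n'
    '<link rel="image_src" href="http://example.com/photo.jpg" />\n'
)
print(first_image_url(sample))
```

When the lookahead fails on the first tag, the search simply moves on to the next `<link`, so no explicit loop is needed.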

Re: Regex to deny sentence

Posted: Mon Sep 10, 2012 7:13 am
by magaupe
requinix wrote:Yeah, but aren't you going to need iteration somewhere? What are you doing with these hrefs?

Code:

<link rel="image_src" href="(?!url you don't want)([^\"]+)"
Remember to escape characters like . and /.
Not really, I just need this to extract an image from a web page.
Is this right? It won't recognize some of the characters.

Code:

<link rel="image_src" href="(?!http://s1.trrsf.com.br/atm/3/core/_img/terra-logo-white-bg-v2.jpg)([^\"]+)"
I'm using TestRExp to test it. This is the error: "TRegExpr(comp): Urecongnized Modifier (pos 96)"

Re: Regex to deny sentence

Posted: Mon Sep 10, 2012 1:54 pm
by requinix
Where is offset 96?

Also,
requinix wrote:Remember to escape characters like . and /.

Re: Regex to deny sentence

Posted: Mon Sep 10, 2012 3:18 pm
by magaupe
Actually, it must be compatible, because the application I'm using is based on TestRExp (http://regexpstudio.com/RegExpStudio.html).
Thanks anyway.