
take anything at the end of string with some rules

Posted: Sun Jan 06, 2008 10:31 pm
by blackout
Hi,

I'm creating a script that allows the user to pass a parameter string:

anystring_here%var%
or
here_is_the_string%var%_

I will replace %var% with (.*), so the strings above become these regexes:

anystring_here(.*)
or
here_is_the_string(.*)_

Thus, when I preg_match those regexes against some text, they capture the %var% portion.

text example:
anystring_hereXYZABC -> will take 'XYZABC' with first regex

here_is_the_stringABCDE_more -> will take 'ABCDE' with 2nd regex

The problem is with the 2nd regex, if the text looks like this:

here_is_the_stringABCDE_more_and_more -> it will take 'ABCDE_more_and'

I know that I need to use (.*?) to stop the greedy matching, but if I apply it to the 1st regex, it won't capture anything.

anystring_here(.*?) won't take 'XYZABC' from anystring_hereXYZABC
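
The greedy and lazy behaviour described above can be reproduced with a short sketch. The variable names are only illustrative, and the `~` delimiter is just one possible choice:

```php
<?php
$text = 'here_is_the_stringABCDE_more_and_more';

// Greedy: (.*) runs to the end of the subject, then backtracks to the last '_'.
preg_match('~here_is_the_string(.*)_~', $text, $greedy);

// Lazy: (.*?) stops at the first '_' that lets the rest of the pattern match.
preg_match('~here_is_the_string(.*?)_~', $text, $lazy);

// Lazy with nothing after it: the shortest possible match is the empty string.
preg_match('~anystring_here(.*?)~', 'anystring_hereXYZABC', $lazyAtEnd);

var_dump($greedy[1], $lazy[1], $lazyAtEnd[1]);
```

The lazy group only grows when something after it forces it to, which is why it captures nothing when it is the last element of the pattern.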

So, is there any solution that makes both regexes work?

Thanks.

Posted: Mon Jan 07, 2008 1:15 pm
by vapoorize
I'm confused by your examples.
Read http://regexadvice.com/blogs/mash/default.aspx
and http://regexadvice.com/forums/thread/37499.aspx
to write better regex questions.


give haystacks
give the expected matches in each haystack

Posted: Tue Jan 08, 2008 3:56 am
by blackout
Okay, I thought I gave a complete problem there.

Let's start from this:

Source:

here_is_the_string_XYZABC

or

here_is_the_string_XYZABC_blabla

or

here_is_the_string_XYZABC_bla_bla

I want to get 'XYZABC' only. What should the regex be?

thanks.

Posted: Tue Jan 08, 2008 7:42 pm
by vapoorize


<?php
$hay = 'here_is_the_string_XYZABC

or

here_is_the_string_XYZABC_blabla

or

here_is_the_string_XYZABC_bla_bla';
$pat = '~here_is_the_string_(XYZABC)~';
preg_match_all($pat, $hay, $out); // search $hay, the variable defined above
echo '<pre>';
print_r($out[1]);
echo '</pre>';
?>

Posted: Tue Jan 08, 2008 9:03 pm
by feyd
~vapoorize, you have waiting messages from a moderator in your private message box. You need to read them.

Posted: Tue Jan 08, 2008 11:38 pm
by blackout
Thanks for your reply, vapoorize. I changed your code to suit my needs and to explain my original problem:


<?php
$hays[] = 'here_is_the_string_ABCDEF';
$hays[] = 'here_is_the_string_DEFABC_blabla';
$hays[] = 'here_is_the_string_XYZABC_bla_bla';

$pat = '~here_is_the_string_(.*)~';

foreach ($hays as $hay) {
  preg_match_all($pat, $hay, $out);
  echo $out[1][0], "\n"; // print the captured part of each haystack
}
?>
The code above outputs:

ABCDEF
DEFABC_blabla
XYZABC_bla_bla

But the result I want is this:

ABCDEF
DEFABC
XYZABC

what needs to be changed in the pattern?

thanks.

Posted: Wed Jan 09, 2008 12:02 am
by vapoorize
Regexes are SPECIFIC to the data, did you read the links I posted earlier? I doubt it. You're lucky I'm not completely ignoring your post, but I figure I would repeat myself again 1 more time and maybe you'll get the hint. Post some real data.

See, a better question would specify some rules or criteria for what should be matched; you didn't. So I'll just look at your data and come up with my own criteria. It looks like you want the capital letters in the haystack to be matched, so:

~[A-Z]+~

Your attempt:
~here_is_the_string_(.*)~

The * quantifier is greedy: it consumes as much as it can and only gives characters back when the rest of the pattern fails to match. You could make it lazy, or add some kind of rule to determine where it stops, but then again that criterion is unknown.
Read: http://www.regular-expressions.info/repeat.html

It looks like you want to match this:
~here_is_the_string_[^_]*~
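
As a minimal sketch of that last pattern, assuming the wanted part never contains an underscore, a capture group around `[^_]*` returns just that part. The haystacks are the ones posted above; `$results` and `$m` are illustrative names:

```php
<?php
$hays = [
    'here_is_the_string_ABCDEF',
    'here_is_the_string_DEFABC_blabla',
    'here_is_the_string_XYZABC_bla_bla',
];

// [^_]* matches any run of characters that are not underscores,
// so the capture stops at the next '_' or at the end of the string.
$pat = '~here_is_the_string_([^_]*)~';

$results = [];
foreach ($hays as $hay) {
    if (preg_match($pat, $hay, $m)) {
        $results[] = $m[1];
    }
}

echo implode("\n", $results), "\n";
```

Because the character class itself defines where the match ends, the same pattern works whether or not anything follows the captured part.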


Posted: Fri Jan 11, 2008 8:44 am
by blackout
Look, I was trying to explain my problem in the first post. It's all there, but you said you were confused, so I tried to break it down in the hope that we could end up solving my real problem (again, as I said in my 1st post).

You keep saying "give the real data, give the haystacks". They were there:
$hays[] = 'here_is_the_string_ABCDEF';
$hays[] = 'here_is_the_string_DEFABC_blabla';
$hays[] = 'here_is_the_string_XYZABC_bla_bla';
They represent my real data, and yes, I can say it's my real data. Or do I need to create lots more examples? I'm asking about regex, so my examples above represent the pattern of my real data. When I asked "how to get 'ABCDEF'?", I meant how to get that part from strings of that pattern, because I'm asking about regex here. If I weren't asking about regex, maybe somebody would answer "use substr!", case closed.

I know regex is specific to the data, but please read my 1st post:
I'm creating a script that allows user to pass a parameter string:

anystring_here%var%
or
here_is_the_string%var%_
Lastly, you're telling me that (.*) is greedy. I stated that in my 1st post; it seems it's you who didn't read my question and then pushed me to ask about something else instead of my original question.

Hey, sorry to say this. I AM thankful for your willingness to reply here, but you seem to dictate too much.
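
For the question in the first post (turning a `%var%` template into a pattern), one possible approach is to always use the lazy group and to anchor it with `$` when the template ends in `%var%`. This is only a sketch under that assumption, not a solution confirmed in the thread; `templateToRegex()` is a hypothetical helper name, and it relies on `preg_quote()` to escape the literal parts:

```php
<?php
// Hypothetical helper: turns a user template containing %var% into a regex.
function templateToRegex($template)
{
    // Split on the placeholder and escape each literal piece,
    // including the '~' used as the pattern delimiter.
    $parts = array_map(function ($part) {
        return preg_quote($part, '~');
    }, explode('%var%', $template));

    $pattern = implode('(.*?)', $parts);

    // A lazy group with nothing after it would match the empty string,
    // so anchor it to the end of the subject when %var% comes last.
    if (substr($template, -5) === '%var%') {
        $pattern .= '$';
    }

    return '~' . $pattern . '~';
}

preg_match(templateToRegex('anystring_here%var%'), 'anystring_hereXYZABC', $a);
preg_match(templateToRegex('here_is_the_string%var%_'), 'here_is_the_stringABCDE_more_and_more', $b);

echo $a[1], "\n", $b[1], "\n";
```

With a trailing `%var%` this still captures everything up to the end of the text, so for haystacks like 'here_is_the_string_XYZABC_blabla' the template would need to state the delimiter explicitly, or the group would need to be something stricter than `.*?`.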