Looking for matches relegated to first n words

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."


cspackler
Forum Newbie
Posts: 2
Joined: Tue Mar 04, 2008 10:56 am

Looking for matches relegated to first n words

Post by cspackler »

I am attempting to match a series of "words", with the caveat that they must appear among the first (or, in other cases, the last) n words of the string. I am working with grammar tags, so an example string would be "PN VB SPN VB DT GN NNS IN RB . SPN VBP IT FVBP SPN ."

An example of a desired match would be PN or NN appearing in the first 8 words of the string.

I have tried "^(\b\w+\b\s*){0,7}\b(CC|IN|TO)\b", and it works fine for identifying whether "CC", "IN", or "TO" appears in the first 8 words. What I also need to do is cut the match when it is found, such as...

If I find a (CC |IN |TO) in the first nine words of a sentence, I would then like to cut (CC |IN | TO ) + (FVBP | VBD | VBN | VBZ) + IN if it is found. Is this possible in one regex statement?

Any help would be appreciated greatly.
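
For reference, here is a minimal sketch of how the pattern quoted above could be run from PHP. The helper name tagInFirstEightWords is made up for illustration and is not from the thread. One thing to be aware of: \w+ does not match a "." token, so a tag that comes after a period within the first 8 tokens would not be found by this pattern.

```php
<?php
// Hypothetical helper (not from the thread): returns the tag that matched,
// or null when none of CC/IN/TO occurs within the first 8 words.
function tagInFirstEightWords(string $tags): ?string
{
    // Pattern copied from the post: up to 7 leading words, then the target tag.
    $pattern = '/^(\b\w+\b\s*){0,7}\b(CC|IN|TO)\b/';
    return preg_match($pattern, $tags, $m) === 1 ? $m[2] : null;
}

$tags = 'PN VB SPN VB DT GN NNS IN RB . SPN VBP IT FVBP SPN .';
var_dump(tagInFirstEightWords($tags)); // IN is the 8th word of this string
```

Because the {0,7} group is followed directly by the tag alternation, the tag can be at most the 8th word; raising the upper bound to 8 would extend the window to nine words.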
hawkenterprises
Forum Commoner
Posts: 54
Joined: Thu Feb 28, 2008 9:56 pm
Location: gresham,oregon

Re: Looking for matches relegated to first n words

Post by hawkenterprises »

I'm not sure what size of text strings you're working with, but it might be easier to just explode on spaces and check the first 7 words for matches with a for loop.

Code:

 
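$var = 'PN VB SPN VB DT GN NNS IN RB';
$patterns = array('NN', 'CC', 'TO');
$explodedvar = explode(' ', $var);
// Only look at the first 7 words (or fewer if the string is shorter).
$limit = min(7, count($explodedvar));
$found = false;
for ($i = 0; $i < $limit; $i++) {
    if (in_array($explodedvar[$i], $patterns)) {
        $found = true;
        break; // stop at the first match
    }
}
// Report once, after the loop, instead of once per word.
echo $found ? 'FOUND' : 'NOT FOUND';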
 
cspackler
Forum Newbie
Posts: 2
Joined: Tue Mar 04, 2008 10:56 am

Re: Looking for matches relegated to first n words

Post by cspackler »

These are grammar tags from user-entered text, so the strings will not be huge. Due to the way the system works, I need to find out if I can accomplish it in one regex statement. Basically, I need to match and pull a desired string, but only if it occurs in the first 9 words.
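
One way this might be done in a single statement is sketched below. It rests on two assumptions that the thread does not confirm: that "cut" means removing the matched tags from the string, and that the target is the three-tag sequence (CC|IN|TO), then (FVBP|VBD|VBN|VBZ), then IN, starting somewhere in the first nine words. The function name cutTagSequence is hypothetical. Tokens are matched with \S+ rather than \w+ so that "." tokens count as words.

```php
<?php
// Hypothetical helper (not from the thread). Removes the first occurrence of
// (CC|IN|TO) (FVBP|VBD|VBN|VBZ) IN, but only when the sequence starts within
// the first nine space-separated tokens of the string.
function cutTagSequence(string $tags): string
{
    $pattern = '/^((?:\S+\s+){0,8}?)'                 // group 1: up to 8 leading tokens (lazy)
             . '(?:CC|IN|TO)\s+'                      // first tag of the sequence
             . '(?:FVBP|VBD|VBN|VBZ)\s+'              // second tag
             . 'IN(?=\s|$)\s*/';                      // closing IN as a whole token
    // Keep the leading tokens ($1) and drop the matched sequence.
    // preg_replace returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '$1', $tags, 1) ?? $tags;
}

echo cutTagSequence('PN VB IN VBD IN DT NN .'), "\n";
```

If the goal is to pull the sequence out rather than delete it, the same pattern could be passed to preg_match with a capturing group placed around the three tags. The lazy {0,8}? quantifier makes the engine try the shortest prefix first, so the earliest qualifying sequence is the one that gets cut.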