Looking for matches restricted to the first n words
Posted: Tue Mar 04, 2008 11:01 am
I am attempting to match a series of "words", with the caveat that they must appear within the first n words of the string (or, in other cases, the last n words). I am working with grammar tags, so an example string would be "PN VB SPN VB DT GN NNS IN RB . SPN VBP IT FVBP SPN ."
An example of a desired match would be PN or NN appearing in the first 8 words of the string.
I have tried "^(\b\w+\b\s*){0,7}\b(CC|IN|TO)\b", and it works fine for identifying whether "CC", "IN" or "TO" appears in the first 8 words. What I also need to do is cut text when that match is found, such as...
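For reference, here is a minimal sketch of that check, assuming Python's `re` module (I have not said which regex engine I am using, so that part is an assumption). The helper name `tag_in_first_n` is hypothetical; it just builds the same pattern for any tag list and any n.

```python
import re

def tag_in_first_n(tags, wanted, n):
    """Return True if any tag in `wanted` is among the first n words of `tags`.

    Hypothetical helper: it builds the same pattern as in the post,
    ^(\\b\\w+\\b\\s*){0,n-1}\\b(TAG1|TAG2)\\b, for arbitrary tags and n.
    """
    alternatives = "|".join(re.escape(t) for t in wanted)
    # Up to n-1 leading words, then one of the wanted tags as a whole word.
    pattern = rf"^(?:\b\w+\b\s*){{0,{n - 1}}}\b(?:{alternatives})\b"
    return re.search(pattern, tags) is not None

sample = "PN VB SPN VB DT GN NNS IN RB . SPN VBP IT FVBP SPN ."
print(tag_in_first_n(sample, ["CC", "IN", "TO"], 8))  # IN is the 8th word
print(tag_in_first_n(sample, ["PN", "NN"], 8))        # PN is the 1st word
```

The `\b` on both sides of the alternation keeps NN from matching inside NNS and PN from matching inside SPN. Note that a "." token is not matched by `\w+`, so the word count stops at the first punctuation tag.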
If, in the first nine words of a sentence, I find (CC|IN|TO), I would then like to cut (CC|IN|TO) + (FVBP|VBD|VBN|VBZ) + IN if that sequence is found. Is this possible in one regex statement?
Any help would be appreciated greatly.
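One possible single-regex approach, sketched in Python under a few assumptions: the engine is Python's `re`; "cut" means deleting the three-tag sequence from the string; and the sequence counts only when its first tag (CC, IN or TO) is among the first nine words. The function name `cut_sequence` is hypothetical. The leading words are captured in a group and written back, so only the sequence itself is removed.

```python
import re

# Group 1: up to 8 leading words (lazy, so the earliest sequence wins).
# Then: CC|IN|TO, a verb tag, and IN, each separated by whitespace.
CUT = re.compile(
    r"^((?:\b\w+\b\s*){0,8}?)"
    r"\b(?:CC|IN|TO)\s+(?:FVBP|VBD|VBN|VBZ)\s+IN\b\s*"
)

def cut_sequence(tags):
    """Remove the first 'CC|IN|TO + verb tag + IN' run that starts
    within the first nine words; return the string unchanged otherwise."""
    return CUT.sub(r"\1", tags, count=1)

print(cut_sequence("PN VB TO VBD IN DT NNS ."))
# The sample string from the post has IN as its 8th word, but it is
# followed by RB rather than a verb tag, so nothing is cut.
print(cut_sequence("PN VB SPN VB DT GN NNS IN RB . SPN VBP IT FVBP SPN ."))
```

The quantifier is `{0,8}` rather than `{0,7}` because the limit here is nine words, not eight; changing the limit only means changing that number.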