Need help with regex for all caps followed by lower case
Posted: Tue Sep 22, 2009 1:45 pm
Regular expression needed. I'm working in PHP.
I'm trying to parse some text that consists of all-caps headlines, each followed by one or more words of regular-case text. Each match should grab an all-caps headline plus the text that follows it, up to (but not including) the next all-caps headline, and so forth. An all-caps headline must contain at least one word made of two or more capital letters, so the single capital letter that starts a sentence should not count as a headline. But the regex should also be smart enough to handle a headline that starts with or contains a single-letter capitalized word: THIS IS A HEADLINE should match. The headline and the text that follows may also contain whitespace characters.
Example:
THIS IS AN ALL CAPS HEADLINE followed by
some text like this. Just some more regular text here. Just some more regular text here. Just some more regular text here. Just some more regular text here. Just some more regular text here. Just some more regular text here. THEN ANOTHER ALL CAPS HEADLINE GOES HERE followed by even more text.
I want two matches:
(1) THIS IS AN ALL CAPS HEADLINE followed by
some text like this. Just some more regular text here. Just some more regular text here. Just some more regular text here. Just some more regular text here. Just some more regular text here. Just some more regular text here.
(2) THEN ANOTHER ALL CAPS HEADLINE GOES HERE followed by even more text.
Thanks for any help!
Tim
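One possible approach, sketched here under a few assumptions rather than offered as a definitive answer: treat a headline as a run of capital-only words that includes at least one word of two or more capitals, then lazily capture everything up to the next such run or the end of the string. The function name split_by_caps_headlines and the returned array keys are made up for illustration. The pattern assumes ASCII capitals only and headlines made of letters and whitespace (no digits or punctuation).

```php
<?php
// Hypothetical helper: returns one entry per "headline + following text" section.
function split_by_caps_headlines($text)
{
    // Start of a headline: optional single-capital words (e.g. "A"),
    // then a word of two or more capitals.
    $start = '\b(?:[A-Z]\s+)*[A-Z]{2,}';

    $pattern = '/(' . $start . '(?:\s+[A-Z]+)*)\b'   // group 1: the whole headline
             . '(.*?)'                               // group 2: body, as short as possible
             . '(?=\s*(?:' . $start . '\b|\z))/s';   // stop before next headline or at end

    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $sections = array();
    foreach ($matches as $m) {
        $sections[] = array(
            'section'  => $m[0],        // headline plus body, as asked for in the post
            'headline' => $m[1],
            'body'     => trim($m[2]),
        );
    }
    return $sections;
}

// Usage with a shortened version of the example text.
$text = "THIS IS AN ALL CAPS HEADLINE followed by\n"
      . "some text like this. Just some more regular text here. "
      . "THEN ANOTHER ALL CAPS HEADLINE GOES HERE followed by even more text.";

foreach (split_by_caps_headlines($text) as $i => $s) {
    echo '(' . ($i + 1) . ') ' . $s['section'] . "\n";
}
```

The /s modifier lets the dot match newlines, so a body can span several lines. A capitalized sentence start such as "Just" is not taken as a headline because the pattern needs two consecutive capitals followed by a word boundary. One ambiguity remains: a single-letter word such as "A" or "I" sitting directly after a headline, or directly before a fully capitalized word in the body, will be read as part of a headline.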