
Regular Expressions and Split - Help needed please

Posted: Wed Jul 23, 2003 10:36 am
by massiveone
Two part problem

1)

I am populating a variable called $bodytext with text from emails. What I am trying to do is create an array that contains lines based on key separators, i.e. who, what, I, If, etc.; basically anything that could start a sentence.
I tried the following to see about chopping up the text by sentence-ending punctuation, i.e. . , ? !

Code:

$linearray=split('[?.!,]', $bodytext);
That worked ok for a test. :lol:
I am, however, lacking regular expression skills.
I tried a small test using a single key string like "I " (note the I followed by a space):

Code:

$linearray=split('I ',$bodytext);
This works, but I can't seem to find the correct syntax to put in other key strings like who, what, etc.

2)

The second problem is that split() seems to eat the characters/strings I am splitting by. When I do a print_r() on the array, I see that it has eaten them. Any ideas or suggestions?
Thanks

...

Posted: Wed Jul 23, 2003 2:26 pm
by kettle_drum
Ummm,

You could try using strstr() to find the !, ?, . etc. and then use substr() to read the string in.

But I'm sure an easier way would be to just use explode() and then add the removed character back afterwards, since you know it will be at the end of each newly formed string in the array.
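
The "split, then add the removed character back" idea can be sketched roughly as follows. This is only a sketch under assumptions: explode() takes a single delimiter string, so preg_split() with the PREG_SPLIT_DELIM_CAPTURE flag is swapped in here to handle ., ? and ! at once, and the sample text and the $parts and $sentences names are made up for illustration.

```php
<?php
// Hypothetical sample text; in the thread, $bodytext holds the email body.
$bodytext = 'Hello there. How are you? Fine!';

// Capture the delimiter so preg_split() returns it as its own element
// instead of discarding it.
$parts = preg_split('/([.?!])/', $bodytext, -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

$sentences = [];
foreach ($parts as $part) {
    $isDelimiter = strlen($part) === 1 && strpos('.?!', $part) !== false;
    if ($isDelimiter && $sentences) {
        // Add the removed character back onto the previous sentence.
        $sentences[count($sentences) - 1] .= $part;
    } elseif (trim($part) !== '') {
        $sentences[] = trim($part);
    }
}

print_r($sentences);
```

Each element of $sentences then ends with the punctuation mark that closed it, so nothing is lost in the split.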

Check out http://regexlib.com/ for a nice library of regular expressions.

Posted: Wed Jul 23, 2003 3:07 pm
by patrikG
try

Code:

preg_match_all('/(?:^|[.?!]\s+)(\S+)/', $bodytext, $matches);
print_r($matches);
That should give you the first word of each sentence (note: "\S" means a non-whitespace character), i.e. the word at the beginning ("^") of the string or right after sentence-ending punctuation. The words themselves are stored in $matches[1].

Haven't run it, but it should work.
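
For the original two-part question (splitting on several key strings without losing them), one possible sketch uses preg_split() with a lookahead. A lookahead matches a position rather than any characters, so the key string stays in the piece it starts. Note that split() was removed in PHP 7, which is one more reason to prefer the preg_ functions. The keyword list, the sample text, and the $pattern name below are assumptions for illustration, and the match is case-sensitive, so a lowercase "what" in mid-sentence does not trigger a split.

```php
<?php
// Hypothetical sample text; in the thread, $bodytext holds the email body.
$bodytext = 'I went home. Who was there? If not, what then';

// Alternation (a|b|c) lists the key strings; \b keeps "I" from matching
// inside words such as "It". (?=...) is a zero-width lookahead, so the
// key string is not consumed by the split.
$pattern = '/(?=\b(?:Who|What|If|I)\b)/';

// PREG_SPLIT_NO_EMPTY drops the empty piece produced when the text
// itself starts with a key string.
$linearray = preg_split($pattern, $bodytext, -1, PREG_SPLIT_NO_EMPTY);

print_r($linearray);
```

More key strings can be added to the alternation, separated by |. Trailing spaces remain on each piece, so array_map('trim', $linearray) may be useful afterwards.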