Regular Expressions and Split - Help needed please

massiveone
Forum Commoner
Posts: 29
Joined: Tue Jun 18, 2002 4:39 pm
Location: Canada

Post by massiveone »

Two part problem

1)

I am populating a variable called $bodytext with text from emails. What I am trying to do is create an array containing lines split on key separators, i.e. Who, What, I, If, etc.: basically anything that could start a sentence.
As a first step, I tried chopping up the text by sentence-ending punctuation, i.e. . , ? !

Code: Select all

$linearray=split('[?.!,]', $bodytext);
That worked OK for a test. :lol:
I am, however, lacking regular expression skills.
I tried a small test using a single key string, "I " (note the I followed by a space).

Code: Select all

$linearray=split('I ',$bodytext);
This works, but I can't seem to find the correct syntax to add other key strings like who, what, etc.

2)

The second problem is that split() seems to eat the characters/strings I am splitting by. When I do a print_r() on the array, I can see that they are gone. Any ideas or suggestions?
Thanks
kettle_drum
DevNet Resident
Posts: 1150
Joined: Sun Jul 20, 2003 9:25 pm
Location: West Yorkshire, England

...

Post by kettle_drum »

ummm,

You could try something like using strstr() to find the !, ?, . etc. and then use substr() to read the string in.
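A rough sketch of that idea, for what it's worth. It swaps strstr() for strcspn(), since strstr() only looks for one needle at a time while strcspn() can stop at any of several characters. The function name split_keep_punctuation() and the sample text are made up for illustration, and the type hints assume a reasonably modern PHP.

```php
<?php
// Hypothetical helper (not from the thread): walk through the text and cut
// after each sentence terminator, so the punctuation stays with its sentence.
function split_keep_punctuation(string $text, string $terminators = '.?!'): array
{
    $sentences = [];
    $offset = 0;
    $length = strlen($text);

    while ($offset < $length) {
        // strcspn() returns how many bytes come before the next terminator.
        $span = strcspn($text, $terminators, $offset);
        // Take one extra byte to include the terminator itself;
        // substr() simply stops at the end of the string if there is none.
        $piece = trim(substr($text, $offset, $span + 1));
        if ($piece !== '') {
            $sentences[] = $piece;
        }
        $offset += $span + 1;
    }

    return $sentences;
}

// Assumed sample input standing in for the email text.
$bodytext = 'Who is there? I am. What now!';
print_r(split_keep_punctuation($bodytext));
```

Text after the last terminator is kept as a final piece without punctuation.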

But I'm sure an easier way would be to just use explode() and then add back the character that it removed afterwards, since you know it will be at the end of each of the newly formed strings.
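Something along these lines, as a minimal sketch. Note that explode() takes one literal delimiter, so this only handles a single character (a full stop here); the sample text is an assumption standing in for the email body.

```php
<?php
// Assumed sample input; in the thread this would be the email text in $bodytext.
$bodytext = 'First sentence. Second sentence. Third';
$delimiter = '.';

$parts = explode($delimiter, $bodytext);
$last = count($parts) - 1;
$linearray = [];

foreach ($parts as $i => $part) {
    $part = trim($part);
    if ($part === '') {
        continue; // skip the empty piece left behind by a trailing delimiter
    }
    // Every piece except the last one was followed by the delimiter
    // in the original text, so put it back on the end.
    $linearray[] = ($i < $last) ? $part . $delimiter : $part;
}

print_r($linearray);
```

To cover ? and ! as well, this would need one pass per delimiter or a regular expression instead.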

Check out http://regexlib.com/ for a nice library of regular expressions.
patrikG
DevNet Master
Posts: 4235
Joined: Thu Aug 15, 2002 5:53 am
Location: Sussex, UK

Post by patrikG »

try

Code: Select all

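preg_match_all('/(?:^|[.?!]\s+)(\S+)/', $bodytext, $matches);
print_r($matches);
That should give you the first word of each sentence: the pattern matches at the start of the text ("^") or after sentence punctuation followed by whitespace, then captures a run of non-whitespace characters ("\S"). The full matches are stored in $matches[0] and the captured words in $matches[1].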

Haven't run it, but it should work.
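For the two parts of the original question, here is a hedged sketch using preg_split() rather than split() (split() was deprecated in PHP 5.3 and removed in PHP 7). It relies on lookaround assertions, which match a position without consuming any characters, so nothing gets eaten apart from the whitespace at the split point. The keyword list and the sample text are assumptions for illustration only.

```php
<?php
// Assumed sample input standing in for the email text.
$bodytext = 'Who called? I think I did. What happened next!';

// Part 2: split after sentence punctuation without losing it.
// The lookbehind (?<=[.?!]) requires punctuation just before the split
// point; only the whitespace that follows is removed.
$sentences = preg_split('/(?<=[.?!])\s+/', $bodytext, -1, PREG_SPLIT_NO_EMPTY);

// Part 1: split before any of several key strings, joined with "|".
$keywords = ['Who', 'What', 'I', 'If']; // hypothetical list
$alternation = implode('|', array_map(function ($word) {
    return preg_quote($word, '/'); // escape any regex metacharacters
}, $keywords));

// The lookahead (?=...) requires a whole keyword right after the whitespace,
// and \b stops "I" from matching the start of a longer word.
$chunks = preg_split('/\s+(?=(?:' . $alternation . ')\b)/', $bodytext, -1, PREG_SPLIT_NO_EMPTY);

print_r($sentences);
print_r($chunks);
```

The keyword match is case-sensitive as written; adding the i modifier would also split before lowercase "i", "who", and so on, which may or may not be what you want.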