symbols for end of sentences

Posted: Wed Jan 16, 2008 1:21 pm
by varrg
Hi, I need a regex that splits a string on every delimiter that marks the end of a sentence. At the moment the delimiters are:
. ! ? : ; and, because of requirements elsewhere in the code, \n needs to be a delimiter too.

Repeated delimiters also need to count as one. For example, ... should be treated as ONE delimiter, not three of the same, and the same goes for ??? !!!!!! :::::: etc.

Now for the difficult part: I need smileys to be delimiters as well, smileys such as :) :P :p :D :d :s :S =) =D =P =p
basically every common smiley.

At the moment I'm using this regex: /([.\n?!:;]+)+/ which works fine as far as I've tested, but the smileys are not included and I just don't know how to add them. Help!

Re: symbols for end of sentences

Posted: Wed Jan 16, 2008 1:32 pm
by Kieran Huggins
Maybe try str_replace() and insert a special string (like *|*) as your sentence delimiter, then split on that?

Make an array of what you consider delimiters, and a sister array of the same strings, each ending with your special delimiter. Run the replacement with those two arrays, then explode() on your special delimiter.
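
A minimal sketch of this marker-and-explode idea, assuming PHP. The delimiter list and sample text are illustrative only, and `*|*` is just the marker from the post. One swap to note: the sketch uses strtr() instead of str_replace(), because str_replace() processes array entries one after another, so a later entry can re-match text that an earlier entry inserted. strtr() tries the longest keys first and does not rescan its own output.

```php
<?php
// Marker string as suggested in the post; any string unlikely to occur
// in real text would do.
$marker = '*|*';

// Illustrative subset of delimiters (an assumption, not a complete list).
$delimiters = array('.', '!', '?', ':)', ':D');

// Build the "sister" mapping: each delimiter maps to itself plus the marker.
$map = array();
foreach ($delimiters as $d) {
    $map[$d] = $d . $marker;
}

$text = 'Nice one :D That works! Thanks.';

// Append the marker after every delimiter found in the text.
$marked = strtr($text, $map);

// Split on the marker, trim whitespace, and drop empty pieces.
$pieces = array_map('trim', explode($marker, $marked));
$sentences = array_values(array_filter($pieces, 'strlen'));

print_r($sentences);
```

With this approach each sentence keeps its own delimiter, but a run such as ??? would be marked three times unless the run itself is added to the list.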

Re: symbols for end of sentences

Posted: Wed Jan 16, 2008 1:37 pm
by varrg
Yeah, but this script is meant to be used on a lot of things (forum threads, blog posts, etc.), and banning | from being used kinda sucks.

Re: symbols for end of sentences

Posted: Wed Jan 16, 2008 1:48 pm
by Kieran Huggins
So what's the goal, exactly?

Re: symbols for end of sentences

Posted: Wed Jan 16, 2008 1:55 pm
by varrg
To split the string at every new sentence (read: not every new line),
with new sentences being marked by the delimiters listed above.

I am making a quotation system that adds a link before each new sentence. Clicking the link takes you to a page that shows only that particular sentence.
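
A rough sketch of that goal, assuming PHP and the regex from the first post (so smileys are not handled). The `quote.php?id=` URL and the running `$id` counter are hypothetical placeholders, not part of the actual system.

```php
<?php
// Regex from the first post; smileys are not covered by it.
$pattern = '/([.\n?!:;]+)+/';

$text = 'First point. Second point!! Third?';

// PREG_SPLIT_DELIM_CAPTURE keeps each delimiter run, so the result
// alternates: text, delimiter, text, delimiter, ...
$parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

$html = array();
$id = 0; // hypothetical sentence identifier
for ($i = 0; $i < count($parts); $i += 2) {
    // Re-attach the delimiter that followed this piece of text, if any.
    $delimiter = isset($parts[$i + 1]) ? $parts[$i + 1] : '';
    $sentence = trim($parts[$i]) . $delimiter;
    if ($sentence === '') {
        continue; // skip the empty piece after a trailing delimiter
    }
    $id++;
    // quote.php is a made-up target page for illustration.
    $html[] = '<a href="quote.php?id=' . $id . '">#</a> ' . htmlspecialchars($sentence);
}

echo implode("\n", $html), "\n";
```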

Re: symbols for end of sentences

Posted: Wed Jan 16, 2008 8:05 pm
by varrg
Thanks a lot guys, I finally found the right regex. Thanks for all the help.

#([:;=][|()/{}\[\]<>\\\odpsx]+|[.\n?!:;]+[\s]|[\n])#i
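
For anyone trying this out, here is a small sketch of how the pattern might be used with preg_split(), assuming it is written as a single-quoted PHP string (there, `\\\o` reaches the regex engine as an escaped backslash followed by a literal o). The sample text is made up.

```php
<?php
// The pattern exactly as posted, in a single-quoted PHP string.
$pattern = '#([:;=][|()/{}\[\]<>\\\odpsx]+|[.\n?!:;]+[\s]|[\n])#i';

$text = 'Hello there! How are you??? Fine :) See you.';

// Split on smileys, on punctuation runs followed by whitespace, and on newlines.
$parts = preg_split($pattern, $text, -1, PREG_SPLIT_NO_EMPTY);

// Trim surrounding whitespace and drop pieces that end up empty.
$sentences = array_values(array_filter(array_map('trim', $parts), 'strlen'));

print_r($sentences);
```

Note that the final "." stays attached to the last piece: the punctuation alternative requires a whitespace character after the run, so punctuation at the very end of the string is not treated as a delimiter.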