tristanlee85 wrote:Thank you for the reply! I wasn't going to go as far to look into balancing the quotes. I mean, Google doesn't automatically split my words when I type with without a space.
This works just like I was hoping. I'll take you up on your offer for explaining it if you would like. I've read tutorial after tutorial and RegEx is something I can't understand.
Hmmm, if you're not really comfortable with regex-es, I don't know if you're going
to grasp this fully. But I'll give it a shot:
So, this is the regex:
Code: Select all
'/\s+(?!([^'\"]*['\"][^'\"]*['\"])*[^'\"]*$)/'
In plain English, it would read like this:
Match one or more successive white space characters, but only if they are NOT
followed by an even number (0, 2, 4, ...) of single or double quotes between them
and the end of the string.
A couple of basics:
Code: Select all
\s      // Matches a single white space character
X+      // One or more 'X'-s
X*      // Zero or more 'X'-s
[XY]    // Matches either 'X' or 'Y'
[^XY]   // Matches any character except 'X' and 'Y'
X(?!Y)  // Match the character 'X' only if there isn't a 'Y' directly after it (so,
        // it matches the 'X' in 'XQ' and 'XC' etc. but not the 'X' in 'XY'). This is
        // called: 'negative look-ahead'.
$       // Meta character for the 'end of the string'
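If you want to try these building blocks one at a time, here is a small sketch. It uses Python's re module, which is my assumption for illustration only (the regex in this thread is written PHP/PCRE-style), but these particular constructs behave the same way in both. The sample strings are made up.

```python
import re

# \s+ : one or more white space characters
runs = re.findall(r"\s+", "a  b\tc")

# X* : zero or more 'X'-s (fullmatch requires the whole string to match)
zero_ok = re.fullmatch(r"X*", "") is not None

# [XY] : either 'X' or 'Y'
either = re.findall(r"[XY]", "XaYb")

# [^XY] : any character except 'X' and 'Y'
neither = re.findall(r"[^XY]", "XaYb")

# X(?!Y) : an 'X' that is not directly followed by a 'Y'
lookahead = [m.start() for m in re.finditer(r"X(?!Y)", "XQ XY XC")]

# $ : end of the string
ends_with_c = re.search(r"C$", "XQ XY XC") is not None

print(runs, zero_ok, either, neither, lookahead, ends_with_c)
```

Note that the look-ahead only checks what follows; the match itself is still just the 'X'.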
So, with the explanation above, you can piece together my original regex. Here's
what it does:
Code: Select all
\s+            // Match one or more white space characters ...
(?!            // start negative look-ahead
  (            // open group 1
    [^'\"]*    // zero or more characters of any type except single or double quotes
    ['\"]      // one single or double quote
    [^'\"]*    // zero or more characters of any type except single or double quotes
    ['\"]      // one single or double quote
  )            // close group 1
  *            // group 1 can occur zero or more times (in other words, quotes
               // can only occur 0, 2, 4, 6, ... times, i.e. an even number of times)
  [^'\"]*      // zero or more characters of any type except single or double quotes
  $            // the end of the string
)              // stop negative look-ahead
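To see the whole thing in action, here is a rough sketch in Python (again my assumption; the pattern itself is unchanged apart from how the quotes are written inside a Python raw string). The sample string and the underscore replacement are made up for the demonstration and are not taken from the original code.

```python
import re

# The regex from this post, written as a Python raw string.
PATTERN = re.compile(r"""\s+(?!([^'"]*['"][^'"]*['"])*[^'"]*$)""")

# Hypothetical input: one quoted phrase between two bare words.
sample = 'foo "bar baz" qux'

# Start index of every white space run the regex matches.
positions = [m.start() for m in PATTERN.finditer(sample)]

# Replace the matched white space so the effect is easy to see.
masked = PATTERN.sub("_", sample)

print(positions)
print(masked)
```

Only the space between bar and baz is matched: it has one quote (an odd number) between it and the end of the string. The spaces after foo and before qux have two and zero quotes ahead of them, so the look-ahead rejects them.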
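The key lies in the fact that the end-of-string meta-character is anchored inside
the look-ahead. It forces the pattern inside the look-ahead to account for every
character up to the end of the string, so all remaining quotes really get counted.
Removing it lets the pattern inside the look-ahead match zero characters at any
position, so the negative look-ahead always fails and the regex no longer matches
any white space at all.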
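A quick way to check what the end-of-string anchor contributes is to run the pattern with and without the $ on the same input. This is only a sketch under Python's re module (my assumption, not the environment from this thread), with a made-up sample string.

```python
import re

# Original pattern, with the end-of-string anchor inside the look-ahead.
WITH_ANCHOR = re.compile(r"""\s+(?!([^'"]*['"][^'"]*['"])*[^'"]*$)""")

# Same pattern with the $ removed.
WITHOUT_ANCHOR = re.compile(r"""\s+(?!([^'"]*['"][^'"]*['"])*[^'"]*)""")

sample = 'foo "bar baz" qux'

with_anchor = [m.start() for m in WITH_ANCHOR.finditer(sample)]
without_anchor = [m.start() for m in WITHOUT_ANCHOR.finditer(sample)]

print(with_anchor)
print(without_anchor)
```

In this run the anchored pattern finds the space inside the quotes, while the variant without $ finds nothing: every part inside its look-ahead is optional, so it can always match an empty string, and a negative look-ahead whose contents can always match never lets anything through.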
But again: it's a tricky regex, so don't feel too bad if you don't fully grasp it (yet).
Good luck!