preg_split on first occurrence of whitespace
Posted: Fri Feb 08, 2008 10:54 pm
by snappca
I am desperately trying to split a string into two pieces at the first occurrence of a whitespace character (either a space or a \t). Here's an example:
$fruit = preg_split('/\s/U', 'apple orange banana grape');
I'd like preg_split to give me an array populated like this:
$fruit[0] == 'apple';
$fruit[1] == 'orange banana grape';
Instead, I keep getting an array split on every whitespace character. As you can see, I've tried to make the regex "ungreedy" using the "U" modifier. What am I missing? Is there some other way that makes better sense?
Thanks in advance
Re: preg_split on first occurrence of whitespace
Posted: Fri Feb 08, 2008 11:08 pm
by John Cartwright
Code: Select all
preg_match('/([^\s]+)(.*?)/i', 'apple orange banana grape', $fruit);
Untested..
Explanation: Match everything up to the first whitespace, then match everything else.
Re: preg_split on first occurrence of whitespace
Posted: Fri Feb 08, 2008 11:34 pm
by snappca
I had to change the regex slightly; really it was just that the ? was throwing it off. In any case, I appreciate the help. Thanks.
PS: here's what I changed it to, if anyone else cares:
preg_match('/([^\s]+)(.*)/', 'apple orange banana grape', $fruit);
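For anyone wondering why the ? mattered: a lazy (.*?) at the very end of a pattern is allowed to match nothing, so it does. The snippet below is only a small sketch comparing the two patterns from this thread; the variable names $subject, $lazy and $greedy are made up for the illustration. Note that with preg_match the pieces land in indexes 1 and 2 (index 0 is the whole match), and the second piece keeps its leading space.

```php
<?php
$subject = 'apple orange banana grape';

// Lazy (.*?) at the end of the pattern is satisfied by zero characters,
// so the second group comes back as an empty string.
preg_match('/([^\s]+)(.*?)/', $subject, $lazy);

// Greedy (.*) consumes the rest of the line, including the leading space.
preg_match('/([^\s]+)(.*)/', $subject, $greedy);

var_export($lazy[2]);   // ''
echo "\n";
var_export($greedy[1]); // 'apple'
echo "\n";
var_export($greedy[2]); // ' orange banana grape'
echo "\n";
```

If the leading space is unwanted, wrapping the second piece in ltrim() is one simple option.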
Re: preg_split on first occurrence of whitespace
Posted: Sat Feb 09, 2008 4:01 pm
by GeertDD
Here's another update:
Code: Select all
preg_match('/^(\S++)(.*)/', 'apple orange banana grape', $fruit);
- \S is shorter and means the same as [^\s].
- Added ^ to anchor the regex to the beginning of the string to prevent needless backtracking.
- Made \S match possessively (using ++). This kills possible needless backtracking.
Finally, you can also use preg_split(). Just supply a limit (3rd parameter).
Code: Select all
preg_split('/\s+/', 'apple orange banana grape', 2);
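A short sketch of what the limit does, using a tab in the sample string to show that the remainder is left untouched. The variable names and the ltrim() step are assumptions added for illustration, not part of the post above.

```php
<?php
$subject = "apple orange\tbanana grape";

// A limit of 2 means "return at most two pieces": the string is split at the
// first run of whitespace and everything after it stays in one piece.
$fruit = preg_split('/\s+/', $subject, 2);
var_export($fruit);
echo "\n";

// Assumption: the input might start with whitespace. Without trimming, the
// first piece would be an empty string, so strip leading whitespace first.
$padded = preg_split('/\s+/', ltrim("  apple orange"), 2);
var_export($padded);
echo "\n";
```

This is also why the U modifier in the original question made no difference: it only changes how much each quantifier consumes, not how many times preg_split() splits.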
Re: preg_split on first occurrence of whitespace
Posted: Sat Feb 09, 2008 6:22 pm
by John Cartwright
Truly the king of regex.

Re: preg_split on first occurrence of whitespace AND hyphen
Posted: Mon May 05, 2008 6:28 pm
by joeaston
I am trying to achieve the same thing, but matching whitespace+hyphen+whitespace.
I've tried many variations on the following code, but I can't get it to work:
Code: Select all
$song = 'Explosions in the Sky – Day Four';
$matches = preg_split('/(\s\-\s+)/', $song, 2);
echo $matches[1] . ' by ' . $matches[0];
// Should print 'Day Four by Explosions in the Sky'
// Instead $matches[1] outputs nothing, but $matches[0] outputs $song un-split
Please could someone explain what I'm doing wrong?
Thank you!
Re: preg_split on first occurrence of whitespace AND hyphen
Posted: Tue May 06, 2008 1:41 am
by prometheuzz
Look closely: your two hyphens are not the same. The one in $song is slightly longer (an en dash, –, rather than the plain ASCII hyphen, -, used in your regex).
Also, you don't need to group your regex (wrap it in parentheses), and you don't need to escape the hyphen when it is outside a character class.
So, this should work:
Code: Select all
$song = 'Explosions in the Sky - Day Four';
$matches = preg_split('/\s-\s/', $song, 2);
Or, if you want to match either one of those hyphens, and there may be more whitespace characters before or after it, then this will do:
Code: Select all
$matches = preg_split('/\s+(-|–)\s+/', $song, 2);
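A variation on the same idea, offered as a sketch rather than a drop-in fix: it assumes the titles are UTF-8, adds the u modifier, and writes the en dash as the escape \x{2013} so the pattern does not depend on how the PHP source file itself is encoded. The names $pattern, $ascii and $endash are made up for the example.

```php
<?php
// Assumption: both subject strings are UTF-8.
// [-\x{2013}] matches either the ASCII hyphen or the en dash (U+2013);
// the u modifier tells PCRE to treat pattern and subject as UTF-8.
$pattern = '/\s+[-\x{2013}]\s+/u';

$ascii  = preg_split($pattern, 'Explosions in the Sky - Day Four', 2);
// "\xE2\x80\x93" is the en dash spelled out as its three UTF-8 bytes.
$endash = preg_split($pattern, "Explosions in the Sky \xE2\x80\x93 Day Four", 2);

var_export($ascii);
echo "\n";
var_export($endash);
echo "\n";
```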
Re: preg_split on first occurrence of whitespace
Posted: Tue May 06, 2008 4:58 am
by joeaston
Thanks for trying prometheuzz, but that ain't working! The string still isn't being split.
I don't think it's the hyphen that's the problem. Here's my real code where the hyphen has been copied and pasted:
Code: Select all
$matches = preg_split('/\s–\s/', $item->get_title(), 2);
// $item->get_title() returns something like 'Monta – Long Live the Quiet' (no quotes)
Any other suggestions?
Re: preg_split on first occurrence of whitespace
Posted: Tue May 06, 2008 5:09 am
by prometheuzz
joeaston wrote:Thanks for trying prometheuzz, but that ain't working!
...
Then there's probably more going wrong, but I am sure that the hyphens are different, and can cause problems.
This works perfectly for me:
Code: Select all
#!/usr/bin/php
<?php
print_r(preg_split('/\s+(-|–)\s+/', 'Explosions in the Sky - Day Four', 2)); // short hyphen
print_r(preg_split('/\s+(-|–)\s+/', 'Explosions in the Sky – Day Four', 2)); // longer hyphen
/* output:
Array
(
[0] => Explosions in the Sky
[1] => Day Four
)
Array
(
[0] => Explosions in the Sky
[1] => Day Four
)
*/
?>
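If two dashes look alike on screen, one way to check whether they really are the same character is to dump their bytes. This is just a diagnostic sketch; the variable names are made up, and it assumes the text is UTF-8.

```php
<?php
$hyphen = '-';             // ASCII hyphen-minus
$endash = "\xE2\x80\x93";  // en dash (U+2013) as UTF-8 bytes

// bin2hex() shows the raw bytes, which makes look-alike characters obvious.
echo bin2hex($hyphen), "\n"; // 2d
echo bin2hex($endash), "\n"; // e28093
```

Running bin2hex() on the real title (or on the separator copied out of it) should show which of the two the data actually contains.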
Re: preg_split on first occurrence of whitespace
Posted: Tue May 06, 2008 5:34 am
by joeaston
You were right!
I went on to Wikipedia and compiled a huge list of different hyphen types. That eventually got it working.
I've no idea which one is which though.
Code: Select all
$matches = preg_split('/\s+(-|?|?|–|—|?)\s+/', $item->get_title(), 2);
Thanks for your help.
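A possible alternative to listing every dash by hand, sketched under the assumption that the titles are valid UTF-8: with the u modifier, PCRE understands the Unicode property \p{Pd} ("dash punctuation"), which covers the ASCII hyphen, the en dash and the em dash among others. The sample title below is a stand-in for whatever $item->get_title() returns. The mathematical minus sign (U+2212) is not in that category, so it would need to be added separately if it can occur.

```php
<?php
// \p{Pd} = any character in the Unicode "dash punctuation" category.
// The u modifier makes PCRE treat pattern and subject as UTF-8.
$pattern = '/\s+\p{Pd}\s+/u';

// Stand-in for the real feed title; "\xE2\x80\x93" is an en dash in UTF-8.
$title = "Monta \xE2\x80\x93 Long Live the Quiet";

$matches = preg_split($pattern, $title, 2);

// preg_split() returns false on failure, e.g. when the subject is not valid UTF-8.
if ($matches === false) {
    echo "Could not split the title\n";
} else {
    var_export($matches);
    echo "\n";
}
```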
Re: preg_split on first occurrence of whitespace
Posted: Tue May 06, 2008 5:39 am
by prometheuzz
I won't say "I told you so!"... Oh, dammit, now I did.
; )
You're welcome, of course.