Matching double words

Posted: Fri Aug 06, 2010 1:55 am
by trotsky
I am trying to match every pair of adjacent words (words that are not separated by punctuation).
Here is my code:

Code:

$sentence = "How are you doing today? I am doing fine, why thank you.";
preg_match_all('#[a-z]+\s[a-z]+#i', $sentence, $doubleWordArray);
var_dump($doubleWordArray);

The results were:
array(1) { [0]=> array(5) { [0]=> string(7) "How are" [1]=> string(9) "you doing" [2]=> string(4) "I am" [3]=> string(10) "doing fine" [4]=> string(9) "why thank" } }


I don't understand why "are you" and "doing today" do not match.
How would I fix the regex?

Re: Matching double words

Posted: Fri Aug 06, 2010 10:56 am
by ridgerunner
Your regex is working correctly for me. I just copied and pasted your code into a test.php file and ran it. Here are the results I got:

Code:

array(1) {
  [0]=>
  array(5) {
    [0]=>
    string(7) "How are"
    [1]=>
    string(9) "you doing"
    [2]=>
    string(4) "I am"
    [3]=>
    string(10) "doing fine"
    [4]=>
    string(9) "why thank"
  }
}

What version of PHP are you running? (Type in "php -v" to find out).

Re: Matching double words

Posted: Fri Aug 06, 2010 5:07 pm
by trotsky
I'm using PHP 5.3.
I want the matches to be:

How are
are you
you doing
doing today
I am
am doing
doing fine
why thank
thank you

How would I fix my regex to give me that?

Re: Matching double words

Posted: Thu Aug 12, 2010 9:20 am
by prometheuzz
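"I don't understand why "are you" and "doing today" do not match."
That is because preg_match_all() resumes searching where the previous match ended, so text consumed by one match cannot be part of another. Once "How are" has matched, "are" is no longer available to start "are you". Lookaround assertions avoid this because they match without consuming any text. If you capture the word pair in a group inside a lookahead and let the pattern itself consume only the first word, the same word can show up in more than one captured pair.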
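
To make the consumption visible, here is a small sketch (mine, not from the original post) that runs the original pattern with the PREG_OFFSET_CAPTURE flag and prints where each match starts and ends. The $spans array is only a name made up for this illustration.

```php
<?php
// Run the original pattern and record where each match starts and ends.
$sentence = "How are you doing today? I am doing fine, why thank you.";
preg_match_all('#[a-z]+\s[a-z]+#i', $sentence, $m, PREG_OFFSET_CAPTURE);

$spans = array();  // illustrative name: list of (text, start, end)
foreach ($m[0] as $match) {
    list($text, $start) = $match;      // each entry is array(matched text, byte offset)
    $end = $start + strlen($text);     // offset just past the last matched character
    $spans[] = array($text, $start, $end);
    echo "$start-$end: $text\n";
}
```

Each match starts at or after the offset where the previous one ended, so the word that closes one pair is never available to open the next.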

A demo:

Code:

<?php
$text = 'How are you doing today? I am doing fine, why thank you.';
preg_match_all('/(?=(\w+\s+\w+))\w+/', $text, $matches);
print_r($matches[1]);
?>

which produces:

Array
(
    [0] => How are
    [1] => are you
    [2] => you doing
    [3] => doing today
    [4] => I am
    [5] => am doing
    [6] => doing fine
    [7] => why thank
    [8] => thank you
)
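
Note that \w also matches digits and underscores, while the original pattern only allowed letters. If that matters, one possible variation (a sketch under that assumption, not the only way to do it) keeps the [a-z] class with the i modifier and adds \b so a match cannot start in the middle of a word. The function name adjacent_word_pairs() is hypothetical and exists only for this example.

```php
<?php
// Hypothetical helper: returns every pair of adjacent words that are
// separated only by whitespace, allowing pairs to overlap.
function adjacent_word_pairs($text)
{
    // \b         : only start at a word boundary
    // (?=( ... )): lookahead captures "word whitespace word" without consuming it
    // [a-z]+     : the pattern itself consumes just the first word
    preg_match_all('/\b(?=([a-z]+\s+[a-z]+))[a-z]+/i', $text, $matches);
    return $matches[1];  // group 1 holds the captured pairs
}

$pairs = adjacent_word_pairs('How are you doing today? I am doing fine, why thank you.');
print_r($pairs);
```

Because the class is letters only, apostrophes and hyphens count as punctuation here, so a word like "don't" would be split at the apostrophe.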