Matching double words

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

trotsky
Forum Newbie
Posts: 4
Joined: Sat Jul 31, 2010 2:16 pm

Matching double words

Post by trotsky »

I am trying to match every pair of adjacent words (words not separated by punctuation).
Here is my code:

$sentence = "How are you doing today? I am doing fine, why thank you.";
preg_match_all('#[a-z]+\s[a-z]+#i', $sentence, $doubleWordArray);
var_dump($doubleWordArray);
The results were:
array(1) {
  [0]=>
  array(5) {
    [0]=>
    string(7) "How are"
    [1]=>
    string(9) "you doing"
    [2]=>
    string(4) "I am"
    [3]=>
    string(10) "doing fine"
    [4]=>
    string(9) "why thank"
  }
}


I don't understand why "are you" and "doing today" do not match.
How would I fix the regex?
Last edited by trotsky on Fri Aug 06, 2010 8:45 pm, edited 1 time in total.
ridgerunner
Forum Contributor
Posts: 214
Joined: Sun Jul 05, 2009 10:39 pm
Location: SLC, UT

Re: Matching double words

Post by ridgerunner »

Your regex is working correctly for me. I just copied and pasted your code into a test.php file and ran it. Here are the results I got:

array(1) {
  [0]=>
  array(5) {
    [0]=>
    string(7) "How are"
    [1]=>
    string(9) "you doing"
    [2]=>
    string(4) "I am"
    [3]=>
    string(10) "doing fine"
    [4]=>
    string(9) "why thank"
  }
}
What version of PHP are you running? (Type in "php -v" to find out).
trotsky
Forum Newbie
Posts: 4
Joined: Sat Jul 31, 2010 2:16 pm

Re: Matching double words

Post by trotsky »

I'm using PHP 5.3.
I want the matches to be:

How are
are you
you doing
doing today
I am
am doing
doing fine
why thank
thank you

How would I fix my regex to give me that?
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Matching double words

Post by prometheuzz »

I don't understand why "are you" and "doing today" do not match.
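That is because a word cannot be matched twice. preg_match_all() resumes searching where the previous match ended, so once "How are" has been matched, "are" has been consumed and cannot start another match. The exception is look-around assertions: a lookahead checks the text ahead without consuming it, so a capture group placed inside a lookahead can capture substrings that overlap from one match to the next.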
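
To make the consumption visible, here is a minimal sketch (not from the original posts) that reruns the original pattern on a shortened sentence with the PREG_OFFSET_CAPTURE flag, which reports the byte offset of each match. The variable names are only illustrative.

```php
<?php
// Sketch: print where each non-overlapping match starts and ends.
$sentence = "How are you doing today?";
preg_match_all('#[a-z]+\s[a-z]+#i', $sentence, $m, PREG_OFFSET_CAPTURE);

foreach ($m[0] as $match) {
    $text  = $match[0];               // the matched text
    $start = $match[1];               // byte offset where the match begins
    $end   = $start + strlen($text);  // the next search resumes here
    echo "$text: bytes $start to $end\n";
}
```

This should report "How are" at bytes 0 to 7 and "you doing" at bytes 8 to 17. "are you" would have to start at byte 4, which lies inside the first match, so the engine never tries it.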

A demo:

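<?php
$text = 'How are you doing today? I am doing fine, why thank you.';
// The lookahead (?=...) captures "word whitespace word" in group 1 without
// consuming it; the trailing \w+ then consumes only the first word, so the
// next match can start at the second word.
preg_match_all('/(?=(\w+\s+\w+))\w+/', $text, $matches);
// $matches[1] holds the captured pairs from group 1.
print_r($matches[1]);
?>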
which produces:

Array
(
    [0] => How are
    [1] => are you
    [2] => you doing
    [3] => doing today
    [4] => I am
    [5] => am doing
    [6] => doing fine
    [7] => why thank
    [8] => thank you
)
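
One difference from the original pattern is worth noting: \w also matches digits and underscores, whereas the first post used [a-z] with the i flag. If only runs of letters should count as words, the same lookahead idea can be kept with the original character class. The following is a sketch under that assumption; the function name adjacentWordPairs is hypothetical and does not come from this thread.

```php
<?php
// Hypothetical helper: returns overlapping pairs of adjacent words,
// treating only runs of ASCII letters as words.
function adjacentWordPairs($text)
{
    // \b ensures a match starts at the beginning of a word.
    // The lookahead captures "letters whitespace letters" without consuming it;
    // the final [a-z]+ consumes just the first word of the pair.
    preg_match_all('/\b(?=([a-z]+\s+[a-z]+\b))[a-z]+/i', $text, $matches);
    return $matches[1];
}

print_r(adjacentWordPairs('How are you doing today? I am doing fine, why thank you.'));
```

For the sample sentence this should give the same nine pairs as the demo. The two versions differ only on input containing digits or underscores: for example, the \w version treats "5" in "PHP 5 rocks" as a word, while this variant returns no pairs for that string.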