How to write this regex?

Mila
Forum Newbie
Posts: 2
Joined: Thu Oct 07, 2010 7:46 am

Post by Mila »

Hi experts,

I want to read a file and generate a list of its adjacent word pairs, each with the number of times it occurs. Please help.

For example, given the string "This is the greatest forum. This is the best one."

I want a list like this:

1. this is 2
2. is the 2
3. the greatest 1
4. greatest forum 1
5. forum this 1
6. the best 1
7. best one 1


How can I do that with regex?
Thank you very much in advance.
twinedev
Forum Regular
Posts: 984
Joined: Tue Sep 28, 2010 11:41 am
Location: Columbus, Ohio

Re: How to write this regex?

Post by twinedev »

I'm not really sure you can do this with regex alone. Here is some PHP code that produces pretty much the same result (and I did use preg_replace, so there you have some regex ;-)

<?php

	$strTest = "This is the greatest forum. This is the best one.";
	// Lowercase, then collapse every run of non-alphanumeric characters into one space
	$strTest = trim(preg_replace('/[^0-9a-z]+/',' ',strtolower($strTest)));

	$aryWords = explode(' ',$strTest);

	// Maps each two-word phrase to the number of times it occurs
	$aryReport = array();

	$intCount = count($aryWords);

	// Pair each word with the one that follows it
	for($t=0;$t<$intCount-1;$t++) {
		$strTwoWord = $aryWords[$t].' '.$aryWords[$t+1];
		if (array_key_exists($strTwoWord,$aryReport)) {
			$aryReport[$strTwoWord]++;
		}
		else {
			$aryReport[$strTwoWord] = 1;
		}
	}

	// Sort by count, highest first, keeping the phrases as keys
	arsort($aryReport);

	foreach($aryReport as $strKey=>$intPairCount) {
		echo $strKey,': ',$intPairCount,"\n";
	}

?>

This produces the following (the order of pairs with equal counts may differ between PHP versions):
[text]this is: 2
is the: 2
the best: 1
best one: 1
greatest forum: 1
the greatest: 1
forum this: 1[/text]

Note that I had it lowercase the text and replace anything that is not a digit or a letter with a space, so the periods are skipped. You may want to adjust this depending on the input you expect to receive.
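
As a rough illustration of that kind of adjustment, here is a minimal sketch that moves the cleanup step into a function so the pattern can be swapped. The name normalizeWords and its second parameter are hypothetical, not part of the code above, and the apostrophe variant is only one possible choice for English text.

```php
<?php

// Hypothetical helper: the same cleanup step as above, with the "junk"
// pattern passed in so it can be tuned to the expected input.
function normalizeWords($strText, $strJunkPattern = '/[^0-9a-z]+/')
{
	return trim(preg_replace($strJunkPattern, ' ', strtolower($strText)));
}

// Default pattern: apostrophes count as junk, so contractions are split.
echo normalizeWords("Don't stop. It's the best!"), "\n";
// don t stop it s the best

// Allowing the apostrophe in the character class keeps contractions whole.
echo normalizeWords("Don't stop. It's the best!", "/[^0-9a-z']+/"), "\n";
// don't stop it's the best
```

Either way, the result can still be split with explode(' ', ...) exactly as before.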

-Greg
Mila
Forum Newbie
Posts: 2
Joined: Thu Oct 07, 2010 7:46 am

Re: How to write this regex?

Post by Mila »

twinedev, thank you very much.

It seems like a roundabout way, since it relies on an array. Is there another function that uses regex directly to do the job?
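
One possible approach, offered as a sketch rather than a definitive answer: preg_match_all can pick out the overlapping word pairs in a single pass if the second word is captured inside a lookahead, so that it is not consumed and can start the next match. The counting still ends up in an array (here via array_count_values), because a regex can find the pairs but cannot tally them. The function name countWordPairs is hypothetical, and the [0-9a-z] classes assume the same ASCII-only notion of a "word" used earlier in the thread.

```php
<?php

// Hypothetical helper: counts adjacent word pairs with one preg_match_all() call.
function countWordPairs($strText)
{
	$strText = strtolower($strText);

	// Group 1 is the current word. Group 2 is the following word, captured
	// inside a lookahead so the next match can begin at that word.
	// The lookbehind stops a match from starting in the middle of a word.
	$strPattern = '/(?<![0-9a-z])([0-9a-z]+)(?=[^0-9a-z]+([0-9a-z]+))/';
	preg_match_all($strPattern, $strText, $aryMatches, PREG_SET_ORDER);

	$aryPairs = array();
	foreach ($aryMatches as $aryMatch) {
		$aryPairs[] = $aryMatch[1].' '.$aryMatch[2];
	}

	// Tally identical pairs, then sort by count, highest first.
	$aryReport = array_count_values($aryPairs);
	arsort($aryReport);

	return $aryReport;
}

$aryReport = countWordPairs("This is the greatest forum. This is the best one.");
foreach ($aryReport as $strPair => $intPairCount) {
	echo $strPair, ': ', $intPairCount, "\n";
}
```

For the sample string this should report "this is" and "is the" with a count of 2 and the other five pairs with a count of 1. The order among pairs with equal counts is not guaranteed.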