Page 1 of 1

Translating pseudocode to PHP

Posted: Wed Aug 31, 2005 6:20 am
by clevie
Hi

I have been trying to translate a java pseudocode into a PHP class. The pseudocode code is as follows:

Code: Select all

Greedy-String-Tiling(String A, String) 
{
tiles = {};
do
{
maxmatch = MinimumMatchLength;
matches = {};
Forall unmarked tokens Aa in A
{ 
Forall unmarked tokens Bb in B
{ 
j= 0;
while (Aa+j == Bb+j && unmarked(Aa+j ) && unmarked(Bb+j ))
j ++;
if (j == maxmatch)
matches = matches + match(a, b, j);
else if (j > maxmatch)
{
matches = {match(a; b; j)};
maxmatch = j;
}
} 
}

Forall match(a; b; maxmatch) Σ matches
{
For j = 0… (maxmatch - 1)
{ 
mark(Aa+j );
mark(Bb+j );
} 
tiles = tiles U match(a; b; maxmatch);
} 
}while (maxmatch > MinimumMatchLength);
return tiles;
}
As I am fairly new to PHP I have had difficulty with this. Could anyone please steer me in th eright direction....

Clevie.....

Posted: Wed Aug 31, 2005 7:07 am
by Chris Corbyn
Telling is what it's supposed to be for may be a help :)

Posted: Wed Aug 31, 2005 10:13 am
by clevie
it's for an algorithm called Greedy String Tiling (GST), used to compare two files. at the moment its only implemented in java and c and I need a php version for my project which I am doing for my university course.

there is also another variation of the algorithm called Running Karp-Rabin Greedy String Tiling (RKR-GST), which I also have the pseudocode for. so by getting this one done it would be easier for me to do the rkr-gst one.

thanks for your help

Posted: Wed Aug 31, 2005 3:41 pm
by Ambush Commander
Hmm... you need to be introduced to the syntax of php. I suggest you read... the manual. There's nothing extremely strange in this example, so with a basic knowledge, you should be able to figure it out.

Here's pointers:
1. viewtopic.php?t=21400
2. Dashes aren't allowed in function names in PHP
3. Assuming that {} means a new associative array, the equivalent is $tiles = array();. In PHP arrays and associative arrays are the same, I believe Java makes associative arrays the same as objects (although this might be JavaScript I'm talking about, if so, ignore me.)
4. Indenting would be a good idea. ;)
5. I don't know what the … is about…

Posted: Wed Aug 31, 2005 4:16 pm
by feyd
Ambush Commander wrote:Hmm... you need to be introduced to the syntax of php. I suggest you read... the manual. There's nothing extremely strange in this example, so with a basic knowledge, you should be able to figure it out.

Here's pointers:
1. viewtopic.php?t=21400
2. Dashes aren't allowed in function names in PHP
3. Assuming that {} means a new associative array, the equivalent is $tiles = array();. In PHP arrays and associative arrays are the same, I believe Java makes associative arrays the same as objects (although this might be JavaScript I'm talking about, if so, ignore me.)
4. Indenting would be a good idea. ;)
5. I don't know what the … is about…
to add to/clarify a bit:

2. Dashes are also illegal in all identifiers in PHP, including function names, variable names, class names, etc.
3. {} is indeed a shortcut to an array, so $tiles = array(); is the equivalent.
4. most definitely good :)
5. the ellipsis in this case is 'to' so that line would go:

Code: Select all

for($j = 0; $j < ($maxmatch - 1); $j++)

Posted: Thu Sep 01, 2005 6:05 am
by clevie
Thanks for the help guys, I really appreciate it. I tried it out for myself again and cam up with:

Code: Select all

<?php
class Greedy-String-Tiling($stringA, $stringB)
{
	$tiles = array();
	do
	{
		$maxMatch = $minimumMatchLength;
		$matches = array();
		for()
		{
			for()
			{
				$j = 0;
				while()
					$j++;
				if($j == $maxMatch)
					$matches = ;
				elseif($j > $maxMatch)
				{
					$matches = ;
					$maxmatch = $j;
				}
			}
		}

		for()
		{
			for($j = 0; $j < ($maxMatch - 1); $j++)
			{
				
			}
			$tiles = $tiles
		}
	}while($maxMatch > $minimumMatchLength);
	return $tiles;
}
?>
The parts that are missing are the most difficult parts for me: Forall unmarked tokens Aa in A. Do I use for() or foreach()?

The line matches = matches + match(a,b,j) is suppose to add the matches to a set of matches, but I'm not sure how to do that.

Bless
Clevie

Posted: Thu Sep 01, 2005 9:20 am
by nielsene
clevie wrote: The parts that are missing are the most difficult parts for me: Forall unmarked tokens Aa in A. Do I use for() or foreach()?
What should "Forall unmarked tokens" do? Does it loop over each "word" in the string, each letter? etc.
The line matches = matches + match(a,b,j) is suppose to add the matches to a set of matches, but I'm not sure how to do that.

Code: Select all

$matches[]=$something;
Well add $something to the end of the $matches array. Once I understand the two forall loops, I'll know what $something should be.

Posted: Thu Sep 01, 2005 10:45 am
by Chris Corbyn
On a side-note your class name "Greedy-String-Tiling" is invalid.. You can't use dashes in class names, variable names, constants, function names etc in PHP.... use underscores instead or remove them altogether ;)

Posted: Thu Sep 01, 2005 2:27 pm
by clevie
I should have explained what I wanted to first. Basically I'm doing a project for my university to develop an application/system using PHP that would compare two php files and find any similarities between them. there are two main classes needed for this system:

1. a tokeniser class, that when the files are selected would generate a set of tokens similar to the following:

Java source code Generated tokens
1 public class Count { BEGINCLASS
2 public static void main(String[] args) VARDEF,BEGINMETHOD
3 throws java.io.IOException {
4 int count = 0; VARDEF,ASSIGN
5
6 while (System.in.read() != -1) APPLY,BEGINWHILE
7 count++; ASSIGN,ENDWHILE
8 System.out.println(count+" chars."); APPLY
9 } ENDMETHOD
10 } ENDCLASS

But the tokens would be for a PHP class instead of a Java

2. Then the tokens are passed to the second class, which will be the greedy string tiling (gst) alogorithm implemented in PHP to check the tokens.

The the gst has two parts: scanpattern and markarrays. the original pseudocode for it is as follows:

Scanpattern
length_of_token_tiled := 0
Repeat
maxmatch := minimum-match-length
starting at the first unmarked token of P, for each Pp do
starting at the first unmarked token of T, for each Tt do
j := 0;
while Pp+j = Tt+j AND unmarked(Pp+j) AND unmarked(Tt+j) do
j := j + 1
if j = maxmatch then add match(p, t, j) to list of matches of length j
else if j > maxmatch then start new list with match(p, t, j) and maxmatch := j

Markarrays
for each match(p, t, maxmatch) in list
if not oclluded then
for j:=0 to maxmatch – 1 do
mark_token(Pp+j)
mark_token(Tt+j)
length_of_tokens_tiled := length_of_tokens_tiled + maxmatch
Until maxmatch = minimum-match-length

Which gives the Java pseudocode:

Code: Select all

Greedy-String-Tiling(String A, String) 
{ 
tiles = {}; 
do 
{ 
maxmatch = MinimumMatchLength; 
matches = {}; 
Forall unmarked tokens Aa in A 
{ 
Forall unmarked tokens Bb in B 
{ 
j= 0; 
while (Aa+j == Bb+j && unmarked(Aa+j ) && unmarked(Bb+j )) 
j ++; 
if (j == maxmatch) 
matches = matches + match(a, b, j); 
else if (j > maxmatch) 
{ 
matches = {match(a; b; j)}; 
maxmatch = j; 
} 
} 
} 

Forall match(a; b; maxmatch) &#931; matches 
{ 
For j = 0… (maxmatch - 1) 
{ 
mark(Aa+j ); 
mark(Bb+j ); 
} 
tiles = tiles U match(a; b; maxmatch); 
} 
}while (maxmatch > MinimumMatchLength); 
return tiles; 
}
A & B I think are suppose to be arrays and a & b are the tokens (substrings in the arrays) so the forall Aa in A should mean the tokens in the array of A. Looking at it now I think I need to assign a variable $a to mean the contents of array A.

Hope this shed some more light as to what I am doing.

By the way anyone know how to create a tokeniser set for PHP?

Clevie

Posted: Mon Sep 05, 2005 5:27 am
by clevie
How easy is it to transform code written in Perl to PHP???

Posted: Mon Sep 05, 2005 5:42 am
by s.dot
clevie wrote:How easy is it to transform code written in Perl to PHP???
I imagine that would depend greatly on what you know/how well you know both languages :P

Posted: Mon Sep 05, 2005 9:35 am
by Ambush Commander
Fairly simple, though. You have to know both, but Perl is remarkably similar to PHP in some aspects (you'll generally be able to preserve the flow of the code).