
get all between <? ?> ignoring "<? ?>"

Posted: Fri May 19, 2006 3:51 am
by remco-v
I know I'm a newbie at regex, but I really want to learn this great thing.

How can I get all the contents between the <? and ?> marks?

I got this far:

Code:

<?php
$string = "<td valign='top' class='NavigationBackground'><?translate('body?>')?> <?run('Navigation','__default')?></td>";

// Reads the string and finds all php calls!
function extractTexts($contents){
	preg_match_all("/<\?php(.*?)\?>|<\?(.*?)\?>/i",$contents,$matches);
	// Merge <?php and <? matches
	$matches = array_merge($matches[1],$matches[2]);
	foreach ($matches as $key => $phpCode) {
		echo $phpCode."\n";
				
	}
}
extractTexts($string);
?>
It echoes:

Code:

translate('body
run('Navigation','__default')
which is wrong.
I just don't know how to tell the regex not to end the match when the ?> end marker sits inside ' ' or " ".

Posted: Fri May 19, 2006 6:22 am
by jmut
Try this one.

Code:

preg_match_all("/<\?php(.*?)\?>|<\?(.*?)\?>(?!['\"])/i",$contents,$matches);

Search Google for "regular expression negative lookahead".

Posted: Fri May 19, 2006 8:29 am
by remco-v
Nope, that does not do the trick.

I did find the lookahead thing about an hour ago, but I can't get it to work ;)

Posted: Fri May 19, 2006 8:38 am
by jmut
remco-v wrote: Nope, that does not do the trick.

I did find the lookahead thing about an hour ago, but I can't get it to work ;)
Why not?
It returns:

Code:

translate('body?>')
run('Navigation','__default')
Is that not what you are looking for?

Posted: Fri May 19, 2006 8:50 am
by remco-v
Darn, these expressions are messing with my brain.
Now it is working, thanks a lot, man.

I still don't understand what is going on, though.
And I still have a lot to do with them. Step 2 is to check which functions are called.
Do you perhaps have a link to a good tutorial?

I've looked at some and can't get my head around the advanced parts!

Posted: Sun May 21, 2006 2:17 pm
by jmut
As for catching function calls (I assume you mean in PHP), your best way to go is to parse the code according to the PHP grammar (EBNF), i.e. work with PHP tokens.
There is a package in PEAR that could do this for you.

http://pear.php.net/package/PHP_Parser

Although it is still a dev version, it is pretty good and can give you an idea of how to do it (if it has not already been done).
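
If a full parser package is more than you need, PHP's built-in tokenizer can give a rough idea of the token-based approach. The sketch below is not how PHP_Parser works; it simply uses token_get_all() and a made-up helper, findCalledFunctions(), that collects every name followed by an opening parenthesis. It assumes the tokenizer extension is enabled and PHP 7 or later (for the type hints).

```php
<?php
/**
 * Hypothetical helper: lists names that are directly followed by "("
 * in a PHP snippet given without its opening tag.
 */
function findCalledFunctions(string $phpCode): array
{
    // token_get_all() only tokenizes PHP after an opening tag, so add one.
    $tokens = token_get_all('<?php ' . $phpCode);
    $calls = [];
    $count = count($tokens);

    for ($i = 0; $i < $count; $i++) {
        // Multi-character tokens are arrays: [id, text, line]. Names are T_STRING.
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_STRING) {
            continue;
        }
        // Skip whitespace between the name and a possible "(".
        $j = $i + 1;
        while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
            $j++;
        }
        // Single-character tokens such as "(" are plain strings.
        if ($j < $count && $tokens[$j] === '(') {
            $calls[] = $tokens[$i][1];
        }
    }
    return $calls;
}

print_r(findCalledFunctions("translate('body?>')"));
print_r(findCalledFunctions("run('Navigation','__default')"));
```

Because the tokenizer understands string literals, the "?>" inside 'body?>' causes no trouble here. Treat this as a heuristic only: it also picks up names in function declarations, method calls and "new Foo(...)" expressions, so a real grammar-based parser is still the more reliable route.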

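As for:
remco-v wrote: ...I still don't understand what is going on, though...
I changed the expression from this

"/<\?php(.*?)\?>|<\?(.*?)\?>/i"
to this
"/<\?php(.*?)\?>|<\?(.*?)\?>(?!['\"])/i"

The added (?!['"]) means: look at what comes right after "?>" (in our case) and only accept the match if it is not (?!) one of ['"], i.e. a single or double quote. (The backslash before the double quote is only there for the PHP string.) If a quote does follow, that "?>" is rejected and the lazy (.*?) keeps going until the next "?>".

Hence the name: negative lookahead expression.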
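
To see the difference side by side, here is a minimal sketch that runs both patterns on the sample string from the first post. The helper name extractWith() is made up for illustration, and the type hints assume PHP 7 or later.

```php
<?php
// Sample input from the first post.
$string = "<td valign='top' class='NavigationBackground'><?translate('body?>')?> <?run('Navigation','__default')?></td>";

/**
 * Hypothetical helper: returns the captured tag bodies for a given pattern.
 */
function extractWith(string $pattern, string $contents): array
{
    preg_match_all($pattern, $contents, $matches, PREG_SET_ORDER);
    $found = [];
    foreach ($matches as $match) {
        // Group 1 is the long-tag body, group 2 the short-tag body;
        // only one of them is filled for any single match.
        $found[] = ($match[1] ?? '') !== '' ? $match[1] : ($match[2] ?? '');
    }
    return $found;
}

// Original pattern: stops at the first end marker, even inside quotes.
$without = extractWith('/<\?php(.*?)\?>|<\?(.*?)\?>/i', $string);

// With the negative lookahead: an end marker followed by a quote is rejected.
$with = extractWith('/<\?php(.*?)\?>|<\?(.*?)\?>(?![\'"])/i', $string);

print_r($without);
print_r($with);
```

The first print_r shows translate('body cut short; the second shows translate('body?>') in full. Keep in mind that the lookahead is attached to the second alternative only, and that it only helps when "?>" is directly followed by a quote: something like <?translate('a ?> b')?> would still be cut at the first "?>".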
Happy coding.