I'm Clueless... how to obfuscate this php script...
Posted: Wed Feb 15, 2006 3:50 pm
by jclarkkent2003
Hi guys,
I'm clueless on this.
I have a PHP script I need to obfuscate but not encode/encrypt.
I want to REMOVE ALL COMMENTS and also RENAME ALL VARIABLES to something like $_var1, $_var2, $_var3, etc. instead of what they are now.
I made my script 100% readable and I don't want to give it out this easily readable. I have WAY too much information in the comments, and the variable names are WAY too self-explanatory. I prefer to give it out unencoded so that whenever I update, they come to me for the update instead of taking apart my script. I do not want to encode/encrypt this script at this time.
So how would I do these two things TO A php script?
1. REMOVE ALL COMMENTS
2. RENAME ALL VARIABLES to like $_var1, $_var2, $_var3
Thanks, I'm really clueless how to do this.
Posted: Wed Feb 15, 2006 4:10 pm
by feyd
write a script that uses php's tokenizer, which is built to parse PHP files.
http://php.net/tokenizer
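For anyone who has not used the tokenizer before, here is a minimal sketch of what token_get_all() hands back. The helper name describe_tokens() and the sample source string are made up for illustration; token_get_all() and token_name() are the real tokenizer functions.

```php
<?php
// Hypothetical helper: returns one array(token name, token text) pair per token.
function describe_tokens($source)
{
    $rows = array();
    foreach (token_get_all($source) as $token) {
        if (is_string($token)) {
            // Single-character tokens such as ";" or "=" come back as plain strings.
            $rows[] = array('CHAR', $token);
        } else {
            // Everything else is an array: token id, token text, line number.
            list($id, $text) = $token;
            $rows[] = array(token_name($id), $text);
        }
    }
    return $rows;
}

// Hypothetical sample input; any PHP source string works.
foreach (describe_tokens('<?php $name = "J C"; // a comment') as $row) {
    echo str_pad($row[0], 28), $row[1], "\n";
}
```

Variables show up as T_VARIABLE and comments as T_COMMENT, so both of the requested changes come down to reacting to the token id.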
Posted: Wed Feb 15, 2006 4:27 pm
by jclarkkent2003
Code: Select all
<?php
/*
 * T_ML_COMMENT does not exist in PHP 5.
 * The following three lines define it in order to
 * preserve backwards compatibility.
 *
 * The next two lines define the PHP 5 only T_DOC_COMMENT,
 * which we will mask as T_ML_COMMENT for PHP 4.
 */
if (!defined('T_ML_COMMENT')) {
    define('T_ML_COMMENT', T_COMMENT);
} else {
    define('T_DOC_COMMENT', T_ML_COMMENT);
}
$source = file_get_contents('example.php');
$tokens = token_get_all($source);
foreach ($tokens as $token) {
    if (is_string($token)) {
        // simple 1-character token
        echo $token;
    } else {
        // token array
        list($id, $text) = $token;
        switch ($id) {
            case T_COMMENT:
            case T_ML_COMMENT: // we've defined this
            case T_DOC_COMMENT: // and this
                // no action on comments
                break;
            default:
                // anything else -> output "as is"
                echo $text;
                break;
        }
    }
}
?>
OK, I took their code straight up, and all it does is display my PHP script as normal text.
I guess what I was asking is this: I am GOOD with regular expressions, preg_match_all, etc., but I'm not a god with them.
I don't know how to make a preg_match_all & preg_replace rule that will find EACH occurrence of a VARIABLE, or say a WORD, and replace it with a new word.
OK here is what I have:
Code: Select all
preg_replace('/\$[a-zA-Z0-9_]+/', '$var' . num, $source);
FIRST, how do I make num INCREMENT each time it finds a NEW match?
Did you get the syntax of that code? I am trying to find anything that starts with $ and has [a-zA-Z0-9_] directly after it, and then I need to know how to REPLACE each MATCHING occurrence ....
like example:
Code: Select all
<?
$name = "J C";
$address = "123 Fake Street";
$phone = "123-456-7890";
// whoops, I change addresses here:
$address = "456 Fake Street";
?>
and I need the regular expression (probably) to echo something like this:
Code: Select all
<?
$var1 = "J C";
$var2 = "123 Fake Street";
$var3 = "123-456-7890";
// whoops, I change addresses here:
$var2 = "456 Fake Street";
?>
NOTE the $address variable is $var2 in ALL occurrences. Most importantly, there are 100 variable names, and I want to use regular expressions to match them all rather than hard code each variable name. I could use str_replace('$address', '$var2', $source), but that would require 100 HARD CODED lines, one for each variable....
Any more help you can give?
Thanks.
Posted: Wed Feb 15, 2006 4:31 pm
by josh
You will want to keep an array as a registry of the variables you have already seen. When you come across a new variable, you create a new number for it (use the array's numeric index, for example); when you come across a variable you have seen before, just map it to its number by looking it up in that array. You could also remove newlines and whitespace outside of quotes (so your script would be basically one line and all stuck together). Really, if you are that big on hiding your source you might be better off looking at a compiled language, but even those can be reversed easily (it's just a tad messier than un-obfuscating a PHP script).
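A rough sketch of that registry idea using preg_replace_callback(), which also answers the "how do I make the number increment" question. The function name rename_variables_regex() is made up for illustration, and the closure syntax assumes PHP 5.3 or later. Note that a plain regex also matches $names inside single-quoted strings and comments, so treat this as a demonstration of the mapping rather than a safe obfuscator.

```php
<?php
// Hypothetical helper name; not from the thread.
function rename_variables_regex($source)
{
    $seen = array(); // registry: original name => replacement name

    return preg_replace_callback(
        '/\$[a-zA-Z_][a-zA-Z0-9_]*/',
        function ($match) use (&$seen) {
            $name = $match[0];
            if (!isset($seen[$name])) {
                // First sighting: hand out the next number.
                $seen[$name] = '$var' . (count($seen) + 1);
            }
            // Repeat sightings reuse the number stored in the registry.
            return $seen[$name];
        },
        $source
    );
}

$source = '$name = "J C"; $address = "123 Fake Street"; $address = "456 Fake Street";';
echo rename_variables_regex($source), "\n";
```

Every occurrence of $address maps to the same replacement because the lookup happens before a new number is handed out.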
Posted: Wed Feb 15, 2006 4:40 pm
by jclarkkent2003
I trust the 1-2 people I am going to share my script with, and I know they won't share it with anyone else, so I want to give it to them unencoded/unencrypted. I just don't want to give it as is, because right now it is commented in LOADS of places and the variable names are WAY too explanatory. They would not need me to code for them anymore; they could just take my script, learn PHP right off of it, and edit it as they please...
You get what I'm getting at?
How would I keep an array mapping of the ones I've already seen....
Code: Select all
$source = file_get_contents('example.php'); // entire file
preg_match_all('/(\$[a-zA-Z0-9_]+)/', $source, $results);
$results[1] = array_unique($results[1]);
OK, so far that gets all the variable names into a nice little array and removes all duplicates, correct?
Now how would I preg_replace pulling variables from an array? Toss it in a for loop? ..
/* Guessing
for($k=0;$k<sizeof($results[1]);$k++)
{
preg_replace("/".$results[$k][1]."/is","$var".$k,$source);
}
*/
God, please tell me if that's at least half right or something close, lol. Not bad for 3 minutes of coding.... I appreciate you guys checking my stuff; it really gives me confidence and I'm becoming quite a strong coder on my own. I already thought about the array idea you mentioned, jshpro2, but didn't think it was the most efficient way, so I asked.
I have not tested the above; it's just my rough 3-minute coding job while typing. Does it look right?
Posted: Wed Feb 15, 2006 4:45 pm
by RobertGonzalez
How much code are you talking about? Is this something that can be easily FIND->REPLACED in a text editor?
Posted: Wed Feb 15, 2006 4:45 pm
by feyd
you shouldn't need regex at all... the tokenizer does all the work for you, you just need to handle the tokens it passes to you.
Posted: Wed Feb 15, 2006 4:50 pm
by jclarkkent2003
The script is about 16 KB, and it's probably not 100 vars; it may be up to 300 vars MAX. Heh, I honestly don't have a clue about the number of vars, I just gave a number to stress that it can't be hard coded.....
SAY there are 150 DIFFERENT variables names.
I want to name them from
$var1 - $var150
Right? That's my goal. I don't want to leave them as they are named now, and I REALLY don't want to have to hard code EACH and EVERY variable name into the variable name cleaner script (that's what I'll call this script).
I'd like to be able to reuse this script on a few other scripts, so all the scripts I distribute use $var1-$var150 instead of $name, $address, $phone, etc.... Those aren't the real var names, as the REAL var names are EXTREMELY long and detailed, lol. Imagine $php_variable_to_loop_function_c_x_times, and having ALL my variables currently named THAT specifically.
Posted: Wed Feb 15, 2006 4:52 pm
by jclarkkent2003
feyd wrote:you shouldn't need regex at all... the tokenizer does all the work for you, you just need to handle the tokens it passes to you.
No clue how to do that, mate; the tokenizer didn't have much documentation for me to learn from.
Two REAL tokens:
$phrase
$back_2_array
If you can tell me how to do it for those two tokens, why wouldn't it work for all tokens? Or I can look at it and apply it to the others, as long as I don't have to list ALL 100-300 variable names, lol, that's hard coding.
Posted: Wed Feb 15, 2006 5:05 pm
by feyd
all of 3 minutes:
Code: Select all
<?php
/*
 * T_ML_COMMENT does not exist in PHP 5.
 * The following three lines define it in order to
 * preserve backwards compatibility.
 *
 * The next two lines define the PHP 5 only T_DOC_COMMENT,
 * which we will mask as T_ML_COMMENT for PHP 4.
 */
if (!defined('T_ML_COMMENT')) {
    define('T_ML_COMMENT', T_COMMENT);
} else {
    define('T_DOC_COMMENT', T_ML_COMMENT);
}
$source = file_get_contents(__FILE__);
$tokens = token_get_all($source);
$variables = array();
foreach ($tokens as $token) {
    if (is_string($token)) {
        // simple 1-character token
        echo $token;
    } else {
        // token array
        list($id, $text) = $token;
        switch ($id) {
            case T_COMMENT:
            case T_ML_COMMENT: // we've defined this
            case T_DOC_COMMENT: // and this
                // no action on comments
                break;
            case T_VARIABLE:
                $index = array_search($text,$variables);
                if($index === false)
                {
                    $variables[] = $text;
                    $index = count($variables)-1;
                }
                echo '$_var'.$index;
                break;
            default:
                // anything else -> output "as is"
                echo $text;
                break;
        }
    }
}
?>
output
Code: Select all
<?php
if (!defined('T_ML_COMMENT')) {
define('T_ML_COMMENT', T_COMMENT);
} else {
define('T_DOC_COMMENT', T_ML_COMMENT);
}
$_var0 = file_get_contents(__FILE__);
$_var1 = token_get_all($_var0);
$_var2 = array();
foreach ($_var1 as $_var3) {
if (is_string($_var3)) {
echo $_var3;
} else {
list($_var4, $_var5) = $_var3;
switch ($_var4) {
case T_COMMENT:
case T_ML_COMMENT:
case T_DOC_COMMENT:
break;
case T_VARIABLE:
$_var6 = array_search($_var5,$_var2);
if($_var6 === false)
{
$_var2[] = $_var5;
$_var6 = count($_var2)-1;
}
echo '$_var'.$_var6;
break;
default:
echo $_var5;
break;
}
}
}
?>
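One caveat worth sketching: every T_VARIABLE gets renamed, and that includes $this and the superglobals ($_GET, $_POST and so on), which would break the obfuscated script. Below is a hedged variation that returns a string and skips a protected list. The function name rename_variables() and the exact contents of the list are assumptions for illustration, not part of the code above.

```php
<?php
// Hypothetical helper; the protected list is an assumption, extend it as needed.
function rename_variables($source)
{
    $protected = array(
        '$this', '$GLOBALS', '$_GET', '$_POST', '$_COOKIE',
        '$_SERVER', '$_SESSION', '$_FILES', '$_REQUEST', '$_ENV',
    );
    $variables = array(); // registry of original names, index = new number
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (is_string($token)) {
            // simple 1-character token
            $out .= $token;
            continue;
        }
        list($id, $text) = $token;
        if ($id === T_VARIABLE && !in_array($text, $protected, true)) {
            $index = array_search($text, $variables, true);
            if ($index === false) {
                $variables[] = $text;
                $index = count($variables) - 1;
            }
            $out .= '$_var' . $index;
        } else {
            // protected variables and all other tokens pass through untouched
            $out .= $text;
        }
    }
    return $out;
}

echo rename_variables('<?php $a = $_GET["x"]; echo $this->b . $a;'), "\n";
```

This still does not cover everything. Class property declarations are T_VARIABLE tokens while $obj->name accesses are not, so the two would stop matching after renaming, and anything that refers to a variable by its name as a string (variable variables, compact(), extract()) would need manual attention.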
Posted: Wed Feb 15, 2006 5:20 pm
by jclarkkent2003
bingo, bango, bongo, thank you
I FINALLY see how to use the tokenizer, lol, <-- stupid....
Posted: Wed Feb 15, 2006 5:25 pm
by jclarkkent2003
jshpro2 wrote:You will want to keep an array as a registry of the variables you have already seen. When you come across a new variable, you create a new number for it (use the array's numeric index, for example); when you come across a variable you have seen before, just map it to its number by looking it up in that array. You could also remove newlines and whitespace outside of quotes (so your script would be basically one line and all stuck together). Really, if you are that big on hiding your source you might be better off looking at a compiled language, but even those can be reversed easily (it's just a tad messier than un-obfuscating a PHP script).
Would you give me an idea of how to strip WHITESPACE from between PHP code but not from inside echo'ed text?
I know how to strip the line endings: just preg match any \r or \n and replace it with nothing. But how would one do whitespace between PHP code? Heh ... preg_replace("/[^"\'](.*?)\s+/is","" lol, you know what, I can't do that one off the top of my head.... But that would be nice to do..
What did you mean by a compiled language? Are you talking about Zend encryption, or something like a C program compiled on a Linux server into a binary? I'd like to keep this script in PHP for now, as I still have not caught up on my C programming skillz.
I'm gonna give another 10 mins to whitespace outside of quotes and see what I can do.
Posted: Wed Feb 15, 2006 5:33 pm
by jclarkkent2003
Code: Select all
$file = preg_replace("/[\r\n]+/i","",$file);
$file = preg_replace("/[\s\t]+/is","",$file);
IS all I am coming up with right now. That strips ALL whitespace though; I'm not quite sure how to keep it from touching what's inside quotes. Is there something to assist? lol, I don't know what magic quotes are, so I don't know if that has anything to do with it.
Posted: Wed Feb 15, 2006 5:37 pm
by jclarkkent2003
Code: Select all
echo php_strip_whitespace(__FILE__);
LoL.... the easiest things escape me completely now, don't they?
Everyone have a good laugh at me

Posted: Wed Feb 15, 2006 5:40 pm
by Buddha443556
jclarkkent2003 wrote:Would you give me an idea how to strip WHITESPACE from between PHP code and not from between echo'ed text?
Just in case you don't have PHP5, white space is just another token: T_WHITESPACE
Just like T_COMMENT. <hint>
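To spell the hint out, here is a minimal sketch that treats comments and T_WHITESPACE the same way and collapses each run to a single space, leaving string contents alone because they are separate tokens. The function name strip_comments_and_whitespace() is made up for illustration. It assumes PHP 5 or later (for T_DOC_COMMENT) and does not handle heredocs, whose closing line may need its line break kept.

```php
<?php
// Hypothetical helper name; a sketch, not a replacement for php_strip_whitespace().
function strip_comments_and_whitespace($source)
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (is_string($token)) {
            // simple 1-character token
            $out .= $token;
            continue;
        }
        list($id, $text) = $token;
        switch ($id) {
            case T_COMMENT:
            case T_DOC_COMMENT:
            case T_WHITESPACE:
                // Emit one separating space, but only if the output
                // does not already end in whitespace.
                if ($out !== '' && rtrim($out) === $out) {
                    $out .= ' ';
                }
                break;
            default:
                // Strings, keywords, variables etc. are copied "as is",
                // so spaces inside quoted text survive.
                $out .= $text;
                break;
        }
    }
    return $out;
}

echo strip_comments_and_whitespace("<?php\n\$a  =  1; // note\necho   'x  y';\n");
```

Comments are replaced by a space rather than dropped outright so that two keywords or identifiers separated only by a comment do not get glued together.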