Easy to beautify PHP code using highlight_string().... not so easy with JavaScript.
Here's a class I've made for this exact purpose... I love it - it's pure class
Get it? Pure cla... ermm... oh never mind
The source it outputs validates to the XHTML 1.1 spec and doesn't output <span> where it's not needed (i.e. it's not char-for-char... it does chunks of code).
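To illustrate the chunking idea before the full source, here is a minimal standalone sketch. It is not lifted from the class: the function name bunch_spans() and the array(text, color) token format are assumptions made up for this example. Adjacent tokens that share a colour are merged into one <span>, and whitespace is absorbed into the current run instead of closing it.

```php
<?php
// Hypothetical helper, not part of JSHighlight: merges runs of tokens that
// share a colour into a single <span>. Each token is array(text, color);
// whitespace-only tokens are absorbed into the current run, whatever their colour.
function bunch_spans($tokens)
{
    $out = '';
    $open = null; // colour of the currently open <span>, or null if none
    foreach ($tokens as $tok) {
        list($text, $color) = $tok;
        $isSpace = (trim($text) === '');
        if ($open !== null && ($isSpace || $color === $open)) {
            $out .= htmlspecialchars($text); // extend the current run
            continue;
        }
        if ($open !== null) {
            $out .= '</span>'; // colour changed: close the previous run
            $open = null;
        }
        if ($isSpace) {
            $out .= $text; // leading whitespace needs no <span>
            continue;
        }
        $out .= '<span style="color:' . $color . '">' . htmlspecialchars($text);
        $open = $color;
    }
    if ($open !== null) {
        $out .= '</span>'; // close whatever is still open at the end
    }
    return $out;
}

// "if (" shares one <span>; "x" and ")" each start a new one.
echo bunch_spans(array(
    array('if', '#007700'),
    array(' ', ''),
    array('(', '#007700'),
    array('x', '#EE0099'),
    array(')', '#007700'),
));
```

The class itself only bunches keywords and symbols this way; the sketch bunches any equal-colour run to keep the example short.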
The newest source is always here: http://www.w3style.co.uk/Javascript_Hig ... source.php
Benchmark 1: http://www.w3style.co.uk/Javascript_Hig ... bench1.php
Benchmark 2: http://www.w3style.co.uk/Javascript_Hig ... bench2.php
Benchmark 3: http://www.w3style.co.uk/Javascript_Hig ... bench3.php
2005-07-22: Coding style updated to a C style presentation. Easier to read.
Code:
<?php
/*
JavaScript highlighting class working with PHP4 and PHP5
Written by d11wtq of http://forums.devnetwork.net/
License: GNU General Public License (GPL)
Copyright: Chris Corbyn, Some rights reserved.
Version: 1.0.1, 2005-07-22
*/
class JSHighlight
{
var $source;
var $token_sequence = array();
var $type_sequence = array();
var $converted_sequence = array();
var $htmlColor = '#000000'; //Black
var $defaultColor = '#EE0099'; //Magenta
var $stringColor = '#DD0000'; //Red
var $numberColor = '#226699'; //Turquoise
var $keywordColor = '#007700'; //Green
var $mainObjColor = '#F98844'; //Orange
var $commentColor = '#777777'; //Grey
var $functionColor = '#0000BB'; //Blue
var $methodColor = '#6699EE'; //Light blue
var $objectColor = '#770088'; //Purple
var $symbols = array (
'[', ']',
'(', ')',
'{', '}',
'/',
'*',
'+',
'-',
'%',
'^',
'&',
'@',
'|',
'<',
'>',
'=',
':',
';',
',',
'.',
'?',
'!'
);
var $mainObjs = array (
'window',
'document',
'parent',
'self',
'this',
'top',
'Math'
);
//Yes these ARE case senSiTivE
var $reserved = array (
'if', 'else',
'for',
'while',
'then',
'do',
'in',
'as',
'end',
'break',
'continue',
'return',
'true',
'false',
'new', 'New',
'var',
'array', 'Array',
'image', 'Image',
'object', 'Object',
'string', 'String',
'number', 'Number',
'float', 'Float',
'integer', 'Integer',
'RegExp',
'Layer',
'MOUSEDOWN','MOUSEUP',
'MOUSEMOVE',
'MOUSEOVER','MOUSEOUT',
'KEYDOWN', 'KEYUP',
'KEYPRESS',
'Event',
'function', 'Function'
);
/*
We only need vague token types since we are not
-- going to fully parse this code
*/
var $token_types = array (
'JSH_T_STRING',
'JSH_T_WHITESPACE',
'JSH_T_FUNCTION',
'JSH_T_MAIN_OBJ', //window, document, self, parent
'JSH_T_DEFAULT', //Variables, constants
'JSH_T_NUM', //Floats, integers, decimals, octal, hex
'JSH_T_HTML',
'JSH_T_COMMENT',
'JSH_T_RESERVED_WORD',
'JSH_T_METHOD', //Function in object
'JSH_T_OBJECT',
'JSH_T_SYMBOL', //Having separate ones is needless (operators etc)
'JSH_T_UNKNOWN' //Usually author code errors
);
function __construct($source)
{
$this->source = $source;
foreach ($this->token_types as $i => $type)
{
$this->define_once($type, $i); //Constants for tokens
} //End foreach
$this->tokenize(); //Break into partially defined chunks
$this->assign_types(); //Completely define each chunk
/*
Store an entity version of the source since we already
-- know the token types
*/
foreach ($this->token_sequence as $i => $token)
{
$this->converted_sequence[$i] = htmlentities($token);
/*
This tab2space() conversion preserves formatting and
-- works better than the one PHP's highlight_ functions use
*/
$this->converted_sequence[$i] = $this->tabs2spaces($this->converted_sequence[$i]);
} //End foreach
} //Construct
//For PHP4
function JSHighlight($source)
{
$this->__construct($source); //Just a loopthrough
} //JSHighlight()
/*
Defines a constant only if it's not defined.
The default value is boolean TRUE. $c is kept for backwards
-- compatibility but ignored: case-insensitive constants were
-- deprecated in PHP 7.3 and removed in PHP 8
*/
function define_once($const, $val=true, $c=1)
{
if (!defined($const))
{
define($const, $val);
return true;
}
else
{
return false;
} //End if
} //define_once()
/*
Break the source code into smaller tokens.
-- strtok() is too vague - Regex works much better even
-- if it does look scary.
*/
function tokenize()
{
/*
A bit about the regex - It would look nicer on multiple lines but then
-- it would fail.
+ Break it apart at each "|" and it's simply a collection of smaller regex
-- in order of preference (all allow a backslash escape character):
+ Single quoted string
+ Double quoted string
+ //comment style comment
+ /* comment * / style comment
+ Regex pattern - will be treated like a string (only supporting /pattern/ for now) - REMOVED!
+ Hexadecimal numbers 0xYZ
+ Symbols (not numbers, letters or underscore)
+ Unquoted letters, numbers or underscores
*/
//This is the BEEF!!
//A trailing //comment with no newline after it (end of file) is matched too
$re = "#(?:(?<!\\\\)\'.*?(?<!\\\\)\')|(?:(?<!\\\\)\".*?(?<!\\\\)\")|(?:(?<!\\\\)//.*?(?:\n|\\z))|(?:(?<!\\\\)/\\*.*?\\*/)|0x[a-z0-9]+|\\s+|\\W|\\w+#ism";
//|(?:(?<!\\\\)/(?!\\*)(?-s).+?(?<!\\\\)/(?:[a-z]*))
preg_match_all($re, $this->source, $x);
/*
All the mess we need!
If you do a print_r() of this object you'll see what I mean
-- by "mess"
*/
$this->token_sequence = $x[0];
return true;
} //tokenize()
//Give each token a category
function assign_types()
{
//Pass 1 (define the types that are clear from the token alone)
foreach ($this->token_sequence as $i => $token)
{
if (preg_match('#^/\*.*?\*/$#s', $token)
|| preg_match('#^//.*$#s', $token))
{
$this->type_sequence[$i] = JSH_T_COMMENT;
}
elseif (preg_match('/^["\'].+$/s', $token))
{
$this->type_sequence[$i] = JSH_T_STRING;
}
elseif (preg_match('/^\s+$/', $token))
{
$this->type_sequence[$i] = JSH_T_WHITESPACE;
}
elseif (preg_match('/^\d+$/', $token)
|| preg_match('/^0x[a-z0-9]+$/i', $token))
{
$this->type_sequence[$i] = JSH_T_NUM;
}
elseif (in_array($token, $this->mainObjs))
{
$this->type_sequence[$i] = JSH_T_MAIN_OBJ;
}
elseif (in_array($token, $this->reserved))
{
$this->type_sequence[$i] = JSH_T_RESERVED_WORD;
}
elseif (in_array($token, $this->symbols))
{
$this->type_sequence[$i] = JSH_T_SYMBOL;
}
elseif (preg_match('/^\w+$/', $token))
{
$this->type_sequence[$i] = JSH_T_DEFAULT;
}
else
{
$this->type_sequence[$i] = JSH_T_UNKNOWN;
} //End if
} //End foreach
//Pass2 (fine tune JSH_T_DEFAULT)
for ($i=0; $i<count($this->type_sequence); $i++) //Using for() so that we can play with $i
{
$type = $this->type_sequence[$i];
if ($type == JSH_T_DEFAULT)
{
if (isset($this->type_sequence[$i-1])
&& isset($this->type_sequence[$i+1])) //This is between two tokens
{
if ($this->token_sequence[$i-1] == '.') //It's part of an object
{
if ($this->token_sequence[$i+1] == '(') //It's a method being called
{
$this->type_sequence[$i] = JSH_T_METHOD;
}
else
{ //No method called so it's just an object
$this->type_sequence[$i] = JSH_T_OBJECT;
} //End if
}
elseif ($this->token_sequence[$i+1] == '(')
{ //Not part of an object, followed immediately by "("
$this->type_sequence[$i] = JSH_T_FUNCTION;
}
elseif ($this->type_sequence[$i+1] == JSH_T_WHITESPACE)
{ //If it's followed by whitespace keep looking for the next token
for ($j=$i+1; $j<count($this->type_sequence); $j++) //Look ahead with $j so $i stays put and no token gets skipped
{
if ($this->token_sequence[$j] == '(')
{ //Eventually, we see "(" so it's a function
$this->type_sequence[$i] = JSH_T_FUNCTION;
break;
}
elseif ($this->type_sequence[$j] != JSH_T_WHITESPACE)
{ //No longer whitespace but not a function so we'll just leave it JSH_T_DEFAULT
break;
} //End if
} //End for
} //End if
}
elseif (isset($this->type_sequence[$i+1]))
{ //Start of code with more following it
if ($this->token_sequence[$i+1] == '(')
{
$this->type_sequence[$i] = JSH_T_FUNCTION;
} //End if
} //End if
} //End if
} //End for
} //assign_types()
/*
It's not correct to simply str_replace("\t", " ") since a tab advances to
-- the next tab stop, so its width depends on the column it starts in.
-- Tab stops are assumed to be every four columns unless $s says otherwise.
*/
function tabs2spaces($input, $s=4)
{
$lines = explode("\n", $input); //Array of lines
$mod = array();
foreach ($lines as $l)
{
while (false !== $pos = strpos($l, "\t"))
{ //Remember position 0 equates to false
$i = substr($l, 0, $pos);
$t = str_repeat(' ', ($s - $pos % $s)); //Width of the tab
$e = substr($l, $pos+1);
$l = $i.$t.$e; //Rebuild the line
} //End while
$mod[] = $l;
} //End foreach
return str_replace(' ', '&nbsp;', implode("\n", $mod));
} // tabs2spaces()
/*
Stick all of the <span> tags in there to make it colorful
-- This is XHTML 1.1 compliant!
*/
function apply_attributes()
{
for ($i=0; $i<count($this->type_sequence); $i++)
{ //Using for() for easier playing with $i
$type = $this->type_sequence[$i];
switch ($type)
{ //The default case does its own sub-conditioning
//Apply color
case JSH_T_STRING:
$this->converted_sequence[$i] = '<span style="color:'.$this->stringColor.'">'.
$this->converted_sequence[$i].'</span>';
break;
case JSH_T_FUNCTION:
$this->converted_sequence[$i] = '<span style="color:'.$this->functionColor.'">'.
$this->converted_sequence[$i].'</span>';
break;
case JSH_T_METHOD:
$this->converted_sequence[$i] = '<span style="color:'.$this->methodColor.'">'.
$this->converted_sequence[$i].'</span>';
break;
case JSH_T_MAIN_OBJ:
$this->converted_sequence[$i] = '<span style="color:'.$this->mainObjColor.'">'.
$this->converted_sequence[$i].'</span>';
break;
case JSH_T_OBJECT:
$this->converted_sequence[$i] = '<span style="color:'.$this->objectColor.'">'.
$this->converted_sequence[$i].'</span>';
break;
case JSH_T_NUM:
$this->converted_sequence[$i] = '<span style="color:'.$this->numberColor.'">'.
$this->converted_sequence[$i].'</span>';
break;
case JSH_T_COMMENT:
$this->converted_sequence[$i] = '<span style="color:'.$this->commentColor.'">'.
$this->converted_sequence[$i].'</span>';
break;
default:
/*
It would work without this but the output HTML would be huge since
-- ALL symbols and keywords would have individual <span> tags.
This makes sure they get bunched together. It will make browser
-- parsing time faster and reduce bandwidth on large scripts.
*/
if ($type == JSH_T_RESERVED_WORD
|| $type == JSH_T_SYMBOL)
{ //These are the only ones bunching is worth doing on
if (isset($this->type_sequence[$i+1]))
{ //Still something ahead to check
/*
We've started the <span> but we can't close it til we know what's ahead!
*/
$this->converted_sequence[$i] = '<span style="color:'.$this->keywordColor.'">'.
$this->converted_sequence[$i];
//$done is a switch so we know if we were successful doing the bunching
$done = false;
while (isset($this->type_sequence[$i+1]))
{ //Still more code ahead
if ($this->type_sequence[$i+1] == JSH_T_WHITESPACE
|| $this->type_sequence[$i+1] == JSH_T_RESERVED_WORD
|| $this->type_sequence[$i+1] == JSH_T_SYMBOL)
{ //Found space, or more symbols... there could be more so we continue
$i++;
continue;
}
else
{ //No more space or symbols... this is the last one so close the <span> here
$this->converted_sequence[$i] .= '</span>';
$done = true;
break;
} //End if
} //End while
if (!$done)
{ //We didn't manage to bunch anything but <span> is unclosed... close it now
$this->converted_sequence[$i] .= '</span>';
} //End if
}
else
{ //Nothing else ahead anyway so just close now
$this->converted_sequence[$i] = '<span style="color:'.$this->keywordColor.'">'.
$this->converted_sequence[$i].'</span>';
} //End if
} //End if
break;
} //End switch
} //End for
} //apply_attributes()
/*
Setting the optional parameter to true causes Generate() to
-- only return the data rather than output it to the page
*/
function Generate($ret=false)
{
$this->apply_attributes(); //Beautify
/*
Putting the default color <span> tags around the entire output
-- catches anything not colored.
*/
if (!$ret)
{ //Print to page
echo '<span style="color:'.$this->defaultColor.'">'.
nl2br(implode('', $this->converted_sequence)).'</span>';
return true;
}
else
{ //Return as var
return '<span style="color:'.$this->defaultColor.'">'.
nl2br(implode('', $this->converted_sequence)).'</span>';
} //End if
} //Generate()
/*
Just some customizations that may optionally be called
-- before outputting
*/
function SetHTMLColor($c)
{
if ($this->htmlColor = $c)
{
return true;
}
else
{
return false;
} //End if
} //setHTMLColor()
function SetDefaultColor($c)
{
if ($this->defaultColor = $c)
{
return true;
}
else
{
return false;
} //End if
} //setDefaultColor()
function SetKeywordColor($c)
{
if ($this->keywordColor = $c)
{
return true;
}
else
{
return false;
} //End if
} //setKeywordColor()
function SetStringColor($c)
{
if ($this->stringColor = $c)
{
return true;
}
else
{
return false;
} //End if
} //setStringColor()
function SetObjectColor($c)
{
if ($this->objectColor = $c)
{
return true;
}
else
{
return false;
} //End if
} //setObjectColor()
function SetMainObjColor($c)
{
if ($this->mainObjColor = $c)
{
return true;
}
else
{
return false;
} //End if
} //setMainObjColor()
function SetNumberColor($c)
{
if ($this->numberColor = $c)
{
return true;
}
else
{
return false;
} //End if
} //setNumberColor()
function SetCommentColor($c)
{
if ($this->commentColor = $c)
{
return true;
}
else
{
return false;
} //End if
} //setCommentColor()
function SetMethodColor($c)
{
if ($this->methodColor = $c)
{
return true;
}
else
{
return false;
} //End if
} //setMethodColor()
function __destruct()
{
//
} //Destruct
}
?>
Calling Generate(true) stops it from outputting the code and returns it as a variable instead.
If you don't like the manky colors I chose then either:
a) Change them at the top of the class in the code itself
b) Call the methods $JSHighlight->SetCommentColor('anything_valid_in_html') BEFORE calling Generate()...
Also, the arrays up at the top (hey, all the setup is at the top) contain the keywords etc. it looks for... modify them to your liking. I might post a more concise version in a day or two anyway.
I'll make some documentation for it though, I'm just too eager to let people use it.
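As a rough sketch of what editing those arrays changes: classification is a plain case-sensitive in_array() lookup, so a word only gets the keyword colour if it's in the list with exactly that spelling. The function classify_word() and the shortened lists below are made up for the illustration. They mirror the order of checks in assign_types() but aren't part of the class, and the sketch uses strict comparison where the class uses the default loose in_array().

```php
<?php
// Hypothetical standalone helper mirroring the lookup order in assign_types():
// main objects first, then reserved words, otherwise a plain identifier.
function classify_word($token, $mainObjs, $reserved)
{
    if (in_array($token, $mainObjs, true)) {
        return 'main_obj';
    }
    if (in_array($token, $reserved, true)) {
        return 'reserved';
    }
    return 'default';
}

// Shortened example lists, not the full ones from the class
$mainObjs = array('window', 'document', 'this');
$reserved = array('if', 'else', 'var', 'function');

echo classify_word('var', $mainObjs, $reserved), "\n";    // reserved
echo classify_word('VAR', $mainObjs, $reserved), "\n";    // default (case matters)
echo classify_word('typeof', $mainObjs, $reserved), "\n"; // default until it's added

$reserved[] = 'typeof'; // extend the list, as you would at the top of the class
echo classify_word('typeof', $mainObjs, $reserved), "\n"; // reserved
```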
Example:
Code:
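$data = file_get_contents('some_file.js'); //Read the JavaScript source
$JSHighlight = new JSHighlight($data); //Tokenizing happens in the constructor
$JSHighlight->Generate(); //Echoes the highlighted XHTML
//$html = $JSHighlight->Generate(true); //Or use this INSTEAD to get it back as a string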
One thing to note at this point... if HTML is included, most of it will go magenta (the default color) cos the class will just try and parse it as JavaScript. I'm working on allowing HTML in it now (but for my needs this was enough).
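To see why embedded HTML ends up in the default color, here is a small sketch using a cut-down version of the tokenizer pattern. It only covers quoted strings, whitespace, single symbols and words; comments and hex numbers are left out for brevity, so this is not the class's exact regex. The function name rough_tokens() is made up for the example.

```php
<?php
// Simplified, hypothetical tokenizer: quoted strings, whitespace runs,
// single non-word characters, then runs of word characters.
function rough_tokens($source)
{
    $re = '#\'.*?(?<!\\\\)\'|".*?(?<!\\\\)"|\s+|\W|\w+#s';
    preg_match_all($re, $source, $m);
    return $m[0]; // full matches, in source order
}

// Markup is split into symbols and bare words, just like JavaScript would be
print_r(rough_tokens('<b>hi</b>'));
```

The tag name and the text come out as plain words, which the class types as JSH_T_DEFAULT, so they inherit the magenta default color. The angle brackets and the slash are in $symbols, so they take the keyword color.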