d11wtq and His Amazing Technicolor JavaScript

Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Post by Chris Corbyn »

Got one of those websites where you show off your code?
Easy to beautify PHP code using highlight_string().... not so easy with JavaScript.

Here's a class I've made for this exact purpose... I love it - it's pure class :lol:

Get it? Pure cla... ermm... oh never mind :oops: *cough*

The source it outputs validates to XHTML 1.1 spec and doesn't output <span> where it's not needed (i.e. it's not char-for-char... it does chunks of code ;))
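
To give a rough idea of what "chunks" means here, this is a minimal stand-alone sketch, not the code from the class. The function name bunch_spans() and the array(text, color) token shape are made up for illustration, and it assumes PHP 5 or later. Neighbouring tokens that share a color go into a single <span>, and whitespace rides along inside whichever <span> is already open:

```php
<?php
// Hypothetical helper, not part of JSHighlight: merge runs of tokens that
// share a color into one <span>. Each token is array(text, color-or-null).
function bunch_spans(array $tokens)
{
    $out = '';
    $open = null; // color of the currently open <span>, or null if none

    foreach ($tokens as $token) {
        list($text, $color) = $token;

        // Whitespace stays inside whatever <span> is already open
        if (trim($text) === '' && $open !== null) {
            $out .= $text;
            continue;
        }
        if ($color !== $open) {
            if ($open !== null) {
                $out .= '</span>'; // color changed, close the old run
            }
            if ($color !== null) {
                $out .= '<span style="color:' . $color . '">';
            }
            $open = $color;
        }
        $out .= htmlspecialchars($text);
    }
    if ($open !== null) {
        $out .= '</span>'; // close a run that reaches the end of input
    }
    return $out;
}

$tokens = array(
    array('if', '#007700'),
    array(' ',  null),
    array('(',  '#007700'),
    array('x',  null),
    array(')',  '#007700'),
);
echo bunch_spans($tokens);
```

Here "if (" ends up in one <span> instead of two, which is where the saving comes from on long scripts.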

The newest source is always here: http://www.w3style.co.uk/Javascript_Hig ... source.php

Benchmark 1: http://www.w3style.co.uk/Javascript_Hig ... bench1.php
Benchmark 2: http://www.w3style.co.uk/Javascript_Hig ... bench2.php
Benchmark 3: http://www.w3style.co.uk/Javascript_Hig ... bench3.php

2005-07-22: Coding style updated to a C style presentation. Easier to read.

Code: Select all

<?php

/*
 JavaScript highlighting class working with PHP4 and PHP5
 Written by d11wtq of http://forums.devnetwork.net/
 
 License:   GNU General Public License (GPL)
 Copyright: Chris Corbyn, Some rights reserved.
 Version:   1.0.1, 2005-07-22
 */

class JSHighlight
{

    var $source;
    var $token_sequence = array();
    var $type_sequence = array();
    var $converted_sequence = array();
    
    var $htmlColor = '#000000';     //Black
    var $defaultColor = '#EE0099';  //Magenta
    var $stringColor = '#DD0000';   //Red
    var $numberColor = '#226699';   //Turquoise
    var $keywordColor = '#007700';  //Green
    var $mainObjColor = '#F98844';  //Orange
    var $commentColor = '#777777';  //Grey
    var $functionColor = '#0000BB'; //Blue
    var $methodColor = '#6699EE';   //Light blue
    var $objectColor = '#770088';   //Purple
    
    var $symbols = array (
        '[', ']',
        '(', ')',
        '{', '}',
        '/',
        '*',
        '+',
        '-',
        '%',
        '^',
        '&',
        '@',
        '|',
        '<',
        '>',
        '=',
        ':',
        ';',
        ',',
        '.',
        '?',
        '!'
    );
    
    var $mainObjs = array (
        'window',
        'document',
        'parent',
        'self',
        'this',
        'top',
        'Math'
    );
    
    //Yes these ARE case senSiTivE
    var $reserved = array (
        'if',       'else',
        'for',
        'while',
        'then',
        'do',
        'in',
        'as',
        'end',
        'break',
        'continue',
        'return',
        'true',
        'false',
        'new',      'New',
        'var',
        'array',    'Array',
        'image',    'Image',
        'object',   'Object',
        'string',   'String',
        'number',   'Number',
        'float',    'Float',
        'integer',  'Integer',
        'RegExp',
        'Layer',
        'MOUSEDOWN','MOUSEUP',
        'MOUSEMOVE',
        'MOUSEOVER','MOUSEOUT',
        'KEYDOWN',  'KEYUP',
        'KEYPRESS',
        'Event',
        'function', 'Function'
    );
    
    /*
     We only need vague token types since we are not
     -- going to fully parse this code
     */
     
    var $token_types = array (
        'JSH_T_STRING',
        'JSH_T_WHITESPACE',
        'JSH_T_FUNCTION',
        'JSH_T_MAIN_OBJ',   //window, document, self, parent
        'JSH_T_DEFAULT',    //Variables, constants
        'JSH_T_NUM',        //Floats, integers, decimals, octal, hex
        'JSH_T_HTML',
        'JSH_T_COMMENT',
        'JSH_T_RESERVED_WORD',
        'JSH_T_METHOD',     //Function in object
        'JSH_T_OBJECT',
        'JSH_T_SYMBOL',     //Having separate ones is needless (operators etc)
        'JSH_T_UNKNOWN'     //Usually author code errors
    );

    function __construct($source)
    {
        $this->source = $source;
        foreach ($this->token_types as $i => $type)
        {
            $this->define_once($type, $i); //Constants for tokens
        } //End foreach
        
        $this->tokenize(); //Break into partially defined chunks
        $this->assign_types(); //Completely define each chunk
        
        /*
         Store an entity version of the source since we already
         -- know the token types
         */
        foreach ($this->token_sequence as $i => $token)
        {
            
            $this->converted_sequence[$i] = htmlentities($token);
            /*
             This tab2space() conversion preserves formatting and
             -- works better than the one PHP's highlight_ functions use
             */
            $this->converted_sequence[$i] = $this->tabs2spaces($this->converted_sequence[$i]);
                
        } //End foreach
    
    } //Construct
    
    //For PHP4
    function JSHighlight($source)
    {
    
        $this->__construct($source); //Just a pass-through to the PHP5 constructor
            
    } //JSHighlight()
    
    /*
     Defines a constant only if it's not defined.
     The default is a case insensitive constant boolean
     -- TRUE
     */
    function define_once($const, $val=true, $c=1)
    {
    
        if (!defined($const))
        {
            define($const, $val, $c);
            return true;
        }
        else
        {
            return false;
        } //End if
    
    } //define_once()
    
    /*
     Break the source code into smaller tokens.
     -- strtok() is too vague - Regex works much better even
     -- if it does look scary.
     */
     
    function tokenize()
    {
    
        /*
         A bit about the regex - It would look nicer on multiple lines but then
         -- it would fail.
         + Break it apart at each "|" and it's simply a collection of smaller regex
           -- in order of preference (all allow a backslash escape character):
          + Single quoted string
          + Double quoted string
          + //comment style comment
          + /* comment * / style comment
          + Regex pattern - will be treated like a string (only supporting /pattern/ for now) - REMOVED!
          + Hexadecimal numbers 0xYZ
          + Symbols (not numbers, letters or underscore)
          + Unquoted letters, numbers or underscores
         */
        
        //This is the BEEF!!
        $re = "#(?:(?<!\\\\)\'.*?(?<!\\\\)\')|(?:(?<!\\\\)\".*?(?<!\\\\)\")|(?:(?<!\\\\)//.*?\n)|(?:(?<!\\\\)/\\*.*?\\*/)|0x[a-z0-9]+|\\s+|\\W|\\w+#ism";
        //|(?:(?<!\\\\)/(?!\\*)(?-s).+?(?<!\\\\)/(?:[a-z]*))
        
        preg_match_all($re, $this->source, $x);
        /*
         All the mess we need!
         If you do a print_r() of this object you'll see what I mean
         -- by "mess" 
         */
        $this->token_sequence = $x[0];
        
        return true;
    
    } //tokenize()
    
    //Give each token a category
    function assign_types()
    {
    
        //Pass 1 (define the clear to know types)
        foreach ($this->token_sequence as $i => $token)
        {
        
            if (preg_match('#^/\*.*?\*/$#s', $token)
                || preg_match('#^//.*$#s', $token))
            {
                $this->type_sequence[$i] = JSH_T_COMMENT;
            }
            elseif (preg_match('/^["\'].+$/s', $token))
            {
                $this->type_sequence[$i] = JSH_T_STRING;
            }
            elseif (preg_match('/^\s+$/', $token))
            {
                $this->type_sequence[$i] = JSH_T_WHITESPACE;
            }
            elseif (preg_match('/^\d+$/', $token)
                || preg_match('/^0x[a-z0-9]+$/i', $token))
            {
                $this->type_sequence[$i] = JSH_T_NUM;
            }
            elseif (in_array($token, $this->mainObjs))
            {
                $this->type_sequence[$i] = JSH_T_MAIN_OBJ;
            }
            elseif (in_array($token, $this->reserved))
            {
                $this->type_sequence[$i] = JSH_T_RESERVED_WORD;
            }
            elseif (in_array($token, $this->symbols))
            {
                $this->type_sequence[$i] = JSH_T_SYMBOL;
            }
            elseif (preg_match('/^\w+$/', $token))
            {
                $this->type_sequence[$i] = JSH_T_DEFAULT;
            }
            else
            {
                $this->type_sequence[$i] = JSH_T_UNKNOWN;
            } //End if
            
        } //End foreach
        
        //Pass2 (fine tune JSH_T_DEFAULT)
        for ($i=0; $i<count($this->type_sequence); $i++) //Using for() so that we can play with $i
        {
        
            $type = $this->type_sequence[$i];
            if ($type == JSH_T_DEFAULT)
            {
                
                if (isset($this->type_sequence[$i-1])
                    && isset($this->type_sequence[$i+1])) //This is between two tokens
                {
                    
                    if ($this->token_sequence[$i-1] == '.') //It's part of an object
                    {
                        if ($this->token_sequence[$i+1] == '(') //It's a method being called
                        {
                            $this->type_sequence[$i] = JSH_T_METHOD;
                        }
                        else
                        { //No method called so it's just an object
                            $this->type_sequence[$i] = JSH_T_OBJECT;
                        } //End if
                    }
                    elseif ($this->token_sequence[$i+1] == '(')
                    { //Not part of an object, followed immediately by "("
                        $this->type_sequence[$i] = JSH_T_FUNCTION;
                    }
                    elseif ($this->type_sequence[$i+1] == JSH_T_WHITESPACE)
                    { //If it's followed by whitespace keep looking for the next token
                    
                        /*
                         Look ahead with $j rather than $i, otherwise the outer loop
                         -- resumes past the tokens we peeked at and never checks them
                         */
                        for ($j=$i+1; $j<count($this->type_sequence); $j++) //Move to next token
                        {
                            if ($this->token_sequence[$j] == '(')
                            { //Eventually, we see "(" so it's a function
                                $this->type_sequence[$i] = JSH_T_FUNCTION;
                                break;
                            }
                            elseif ($this->type_sequence[$j] != JSH_T_WHITESPACE)
                            { //No longer whitespace but not a function so we'll just leave it JSH_T_DEFAULT
                                break;
                            } //End if
                        } //End for
                        
                    } //End if
                }
                elseif (isset($this->type_sequence[$i+1]))
                { //Start of code with more following it
                    if ($this->token_sequence[$i+1] == '(')
                    {
                        $this->type_sequence[$i] = JSH_T_FUNCTION;
                    } //End if
                } //End if
                
            } //End if
        
        } //End for
    
    } //assign_types()
    
    /*
     It's not correct to simply str_replace("\t", "    ") since a tab advances
     -- to the next tab stop, so its width depends on the column it starts in.
     -- By default a tab stop is assumed every four columns.
     */
    
    function tabs2spaces($input, $s=4)
    {
        
        $lines = explode("\n", $input); //Array of lines
        $mod = array();
        
        foreach ($lines as $l)
        {
            
            while (false !== $pos = strpos($l, "\t"))
            { //Remember position 0 equates to false
                
                $i = substr($l, 0, $pos);
                $t = str_repeat(' ', ($s - $pos % $s)); //Width of the tab, as plain spaces so later tabs on this line still see true columns (converted to &nbsp; on return)
                $e = substr($l, $pos+1);
                $l = $i.$t.$e; //Rebuild the line
                
            } //End while
            
            $mod[] = $l;
            
        } //End foreach
        
        return str_replace(' ', '&nbsp;', implode("\n", $mod));
        
    } // tabs2spaces()
    
    /*
     Stick all of the <span> tags in there to make it colorful
     -- This is XHTML 1.1 compliant!
     */
    function apply_attributes()
    {
        
        for ($i=0; $i<count($this->type_sequence); $i++)
        { //Using for() for easier playing with $i
            
            $type = $this->type_sequence[$i];
            switch ($type)
            { //The default case does its own sub-conditioning
            
                //Apply color
                case JSH_T_STRING:
                $this->converted_sequence[$i] = '<span style="color:'.$this->stringColor.'">'.
                    $this->converted_sequence[$i].'</span>';
                break;
                case JSH_T_FUNCTION:
                $this->converted_sequence[$i] = '<span style="color:'.$this->functionColor.'">'.
                    $this->converted_sequence[$i].'</span>';
                break;
                case JSH_T_METHOD:
                $this->converted_sequence[$i] = '<span style="color:'.$this->methodColor.'">'.
                    $this->converted_sequence[$i].'</span>';
                break;
                case JSH_T_MAIN_OBJ:
                $this->converted_sequence[$i] = '<span style="color:'.$this->mainObjColor.'">'.
                    $this->converted_sequence[$i].'</span>';
                break;
                case JSH_T_OBJECT:
                $this->converted_sequence[$i] = '<span style="color:'.$this->objectColor.'">'.
                    $this->converted_sequence[$i].'</span>';
                break;
                case JSH_T_NUM:
                $this->converted_sequence[$i] = '<span style="color:'.$this->numberColor.'">'.
                    $this->converted_sequence[$i].'</span>';
                break;
                case JSH_T_COMMENT:
                $this->converted_sequence[$i] = '<span style="color:'.$this->commentColor.'">'.
                    $this->converted_sequence[$i].'</span>';
                break;
                default:
                /*
                 It would work without this but the output HTML would be huge since
                 -- ALL symbols and keywords would have individual <span> tags.
                 This makes sure they get bunched together. It will make browser
                 -- parsing time faster and reduce bandwidth on large scripts.
                 */
                if ($type == JSH_T_RESERVED_WORD
                    || $type == JSH_T_SYMBOL)
                { //These are the only ones bunching is worth doing on
                    
                    if (isset($this->type_sequence[$i+1]))
                    { //Still something in front to check
                        /*
                         We've started the <span> but we can't close it til we know what's ahead!
                         */
                        $this->converted_sequence[$i] = '<span style="color:'.$this->keywordColor.'">'.
                            $this->converted_sequence[$i];
                        
                        //$done is a switch so we know if we were successful doing the bunching
                        $done = false;
                        while (isset($this->type_sequence[$i+1]))
                        { //Still more code ahead
                            
                            if ($this->type_sequence[$i+1] == JSH_T_WHITESPACE
                                || $this->type_sequence[$i+1] == JSH_T_RESERVED_WORD
                                || $this->type_sequence[$i+1] == JSH_T_SYMBOL)
                            { //Found space, or more symbols... there could be more so we continue
                                
                                $i++;
                                continue;
                                
                            }
                            else
                            { //No more space or symbols... this is the last one so close the <span> here
                                $this->converted_sequence[$i] .= '</span>';
                                $done = true;
                                break;
                            } //End if
                            
                        } //End while
                        
                        if (!$done)
                        { //We didn't manage to bunch anything but <span> is unclosed... close it now
                            $this->converted_sequence[$i] .= '</span>';
                        } //End if
                    }
                    else
                    { //Nothing else ahead anyway so just close now
                        $this->converted_sequence[$i] = '<span style="color:'.$this->keywordColor.'">'.
                            $this->converted_sequence[$i].'</span>';
                    } //End if
                    
                } //End if
                break;
                    
            } //End switch
                
        } //End for
        
    } //apply_attributes()
    
    /*
     Setting the optional parameter to true causes Generate() to
     -- only return the data rather than output it to the page
     */
     
    function Generate($ret=false)
    {
        
        $this->apply_attributes(); //Beautify
        /*
         Putting the default color <span> tags around the entire output
         -- catches anything not colored.
         */
        if (!$ret)
        { //Print to page
            echo '<span style="color:'.$this->defaultColor.'">'.
                nl2br(implode('', $this->converted_sequence)).'</span>';
            return true;
        }
        else
        { //Return as var
            return '<span style="color:'.$this->defaultColor.'">'.
                nl2br(implode('', $this->converted_sequence)).'</span>';
        } //End if
        
    } //Generate()
    
    /*
     Just some customizations that may optionally be called
     -- before outputting
     */
     
    function SetHTMLColor($c)
    {
        
        if ($this->htmlColor = $c)
        {
            return true;
        }
        else
        {
            return false;
        } //End if
        
    } //setHTMLColor()
    
    function SetDefaultColor($c)
    {
        
        if ($this->defaultColor = $c)
        {
            return true;
        }
        else
        {
            return false;
        } //End if
        
    } //setDefaultColor()
    
    function SetKeywordColor($c)
    {
        
        if ($this->keywordColor = $c)
        {
            return true;
        }
        else
        {
            return false;
        } //End if
        
    } //setKeywordColor()
    
    function SetStringColor($c)
    {
        
        if ($this->stringColor = $c)
        {
            return true;
        }
        else
        {
            return false;
        } //End if
        
    } //setStringColor()
    
    function SetObjectColor($c)
    {
        
        if ($this->objectColor = $c)
        {
            return true;
        }
        else
        {
            return false;
        } //End if
        
    } //setObjectColor()
    
    function SetMainObjColor($c)
    {
        
        if ($this->mainObjColor = $c)
        {
            return true;
        }
        else
        {
            return false;
        } //End if
        
    } //SetMainObjColor()
    
    function SetNumberColor($c)
    {
        
        if ($this->numberColor = $c)
        {
            return true;
        }
        else
        {
            return false;
        } //End if
        
    } //setNumberColor()
    
    function SetCommentColor($c)
    {
        
        if ($this->commentColor = $c)
        {
            return true;
        }
        else
        {
            return false;
        } //End if
        
    } //setCommentColor()
    
    function SetMethodColor($c)
    {
        
        if ($this->methodColor = $c)
        {
            return true;
        }
        else
        {
            return false;
        } //End if
        
    } //setMethodColor()
    
    function __destruct()
    {
    
        //
    
    } //Destruct

}

?>
The most basic use is just to instantiate it with the JavaScript source code as the constructor parameter and then call $JSHighlight->Generate();

Calling Generate(true); stops it outputting the code and returns it as a string instead.

If you don't like the manky colors I chose then either:
a) Change them at the top of the class in the code itself
b) Call the methods $JSHighlight->SetCommentColor('anything_valid_in_html') BEFORE calling Generate()...

Also, the arrays up at the top (hey, all the setup is at the top) contain the keywords etc. it looks for... modify them to your liking. I might post a more concise version in a day or two anyway :D
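
For anyone curious how the keyword list feeds into the coloring, here's a cut-down sketch of the two steps (split the source into tokens, then look each token up). It is not the class's real regex or type constants: sketch_tokenize(), sketch_classify() and the string type names are invented for this example, and it assumes PHP 5 or later.

```php
<?php
// Sketch only: a cut-down pattern, not the class's full regex.
// Alternatives are tried left to right, so strings and comments are
// claimed before the catch-all word (\w+) and symbol (\W) branches.
function sketch_tokenize($source)
{
    $re = '~\'(?:\\\\.|[^\'\\\\])*\''   // single-quoted string
        . '|"(?:\\\\.|[^"\\\\])*"'       // double-quoted string
        . '|//[^\n]*'                     // line comment
        . '|/\*.*?\*/'                    // block comment
        . '|0x[0-9a-f]+'                  // hex number
        . '|\s+|\w+|\W~is';               // whitespace, words, any other symbol
    preg_match_all($re, $source, $m);
    return $m[0];
}

// $reserved plays the role of the keyword array at the top of the class.
function sketch_classify($token, array $reserved)
{
    if (preg_match('~^(?://|/\*)~', $token)) {
        return 'comment';
    }
    if (preg_match('/^["\'].+$/s', $token)) {
        return 'string';
    }
    if (trim($token) === '') {
        return 'whitespace';
    }
    if (preg_match('/^(?:\d+|0x[0-9a-f]+)$/i', $token)) {
        return 'number';
    }
    if (in_array($token, $reserved, true)) {
        return 'reserved'; // strict match, so it stays case sensitive
    }
    return preg_match('/^\w+$/', $token) ? 'default' : 'symbol';
}

$reserved = array('var', 'function', 'return');
foreach (sketch_tokenize("var x = 0xFF; // done\nalert('hi');") as $t) {
    echo sketch_classify($t, $reserved), ': ', trim($t), "\n";
}
```

Adding a word to $reserved is all it takes to have it picked up as a keyword, which is the same idea as editing the arrays in the class.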

I'll make some documentation for it though, I'm just too eager to let people use it :P

Example:

Code: Select all

$data = file_get_contents('some_file.js');
$JSHighlight = new JSHighlight($data);
$JSHighlight->Generate();
This is what it does to my write_r() function....

[Image: write_r() source as highlighted by the class]

One thing to note at this point.... if HTML is included most of it will go Magenta (the default color) cos it will just try and parse it as JavaScript... I'm working on allowing HTML in it now (but for my needs this was enough) ;)
Last edited by Chris Corbyn on Fri Jul 22, 2005 4:06 pm, edited 10 times in total.
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

Very classy ;)

Honestly, very nice. If I ever used JS, I'd definitely have a use for this!

Post by Chris Corbyn »

Thanks - I've seen a few others but they don't do it properly, they highlight one or two words but they don't parse the code like this... theoretically I could take this further with pretty little effort and make it fit for an editor or something in a CMS.

I'm actually pretty surprised that it doesn't lag much on longer scripts either:

http://www.w3style.co.uk/Javascript_Hig ... r/test.php

Post by nielsene »

Well now all you need is a higher-level function to wrap your JavaScript highlighter, the PHP highlighter, and one for CSS and HTML, so you can pass in composite pages and it picks the appropriate mode for each :)
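
Something along these lines, perhaps. This is only a rough sketch of the idea: detect_mode(), highlight_composite() and the handler array are hypothetical names, the splitting rule is deliberately naive, and it assumes PHP 5.3 or later if you pass closures as handlers.

```php
<?php
// Hypothetical sketch: names and the splitting rule are made up for
// illustration, nothing here comes from the JSHighlight class itself.
function detect_mode($chunk)
{
    if (stripos($chunk, '<?php') === 0) {
        return 'php';
    }
    if (stripos($chunk, '<script') === 0) {
        return 'js';
    }
    if (stripos($chunk, '<style') === 0) {
        return 'css';
    }
    return 'html';
}

function highlight_composite($page, array $handlers)
{
    // Split out PHP, <script> and <style> blocks, keeping the blocks themselves
    $re = '~(<\?php.*?\?>|<script\b[^>]*>.*?</script>|<style\b[^>]*>.*?</style>)~is';
    $chunks = preg_split($re, $page, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $out = '';
    foreach ($chunks as $chunk) {
        $mode = detect_mode($chunk);
        if (isset($handlers[$mode])) {
            $out .= call_user_func($handlers[$mode], $chunk);
        } else {
            $out .= htmlspecialchars($chunk); // no highlighter for this mode yet
        }
    }
    return $out;
}

$handlers = array(
    // PHP's built-in highlighter, returning the markup instead of printing it
    'php' => function ($chunk) { return highlight_string($chunk, true); },
);
echo highlight_composite('<p>Hi</p><?php echo 1; ?>', $handlers);
```

A 'js' handler could wrap the class from this thread and return Generate(true), though it would receive the chunk with its <script> tags still attached, so those would need stripping or separate handling first.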
Burrito
Spockulator
Posts: 4715
Joined: Wed Feb 04, 2004 8:15 pm
Location: Eden, Utah

Post by Burrito »

I kinda like you d11

Post by Chris Corbyn »

Burrito wrote: I kinda like you d11
Wha? huh? :oops:

Post by Chris Corbyn »

nielsene wrote: Well now all you need is a higher-level function to wrap your JavaScript highlighter, the PHP highlighter, and one for CSS and HTML, so you can pass in composite pages and it picks the appropriate mode for each :)
*ding* :idea:

That's beautiful :)

CSS was coming next anyway ;)
patrikG
DevNet Master
Posts: 4235
Joined: Thu Aug 15, 2002 5:53 am
Location: Sussex, UK

Post by patrikG »

Not wanting to derail the thread at all *cough*, but I was quite sad at the end of the new Harry Potter. Nice script, d11wtq :)
nigma
DevNet Resident
Posts: 1094
Joined: Sat Jan 25, 2003 1:49 am

Post by nigma »

From this point on, consider the thread most definitely derailed.
patrikG wrote: Not wanting to derail the thread at all *cough*, but I was quite sad at the end of the new Harry Potter. Nice script, d11wtq :)
As was I :( Hopefully we'll find out more about his death in the second book, like maybe that it was all a ploy to keep ??? secure on the "dark side." If we don't find out more about his death the least Rowling could do is fill us in as to why ??? trusted ??? for so long.

Anyone else find themselves taking more of a liking to Tom Riddle's young self than Harry's?


edit patrikG: removed the spoilers.
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Post by John Cartwright »

Why don't ya just tell the whole world :evil:

Post by nigma »

Perhaps this subject is deserving of its own thread? We could, of course, include appropriate warnings about spoilers in the title ;)