Kodify - New Syntax Highlighter


Moderator: General Moderators

Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Kodify - New Syntax Highlighter

Post by Chris Corbyn »

Addendum: I've now added bracket pairing... visible in the demo

Just thought I'd share something I've been working on, on and off, for a while, that's now finished :)

It's a syntax highlighter with a difference. I can't stand the markup that syntax highlighters generate (bloated and non-semantic). My version, Kodify, operates on the client side as a simple progressive enhancement using JavaScript.

If you have JS turned off then you see the source code just fine. If you have JS turned on then you get a colorful version of the source code. Simple.

The other thing Kodify does differently is that it fully lexically scans the code. I mean, it doesn't just use one big regex, which is very slow and limiting... instead it uses a lexical analyzer routine modeled on lex, the classic Unix lexer generator.

It's finished in the sense that the engine and the lexical analyzer (another project of mine) are built... it just needs a whole heap of language specifications adding (community effort would be nice, since I don't know all languages!).

I just threw together the JS language specification to show off what it does.

I haven't optimized it heavily (yet) but it's still really fast due to the lexical analysis routine it uses (say 11,000 bytes of source in under 100 ms).

It binds to code blocks identified with the "kodify" class name along with the language (e.g. <pre class="kodify js">).

I will make it do a generic highlight (strings and comments) for unspecified languages.
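
As a rough illustration of the binding step described above (not Kodify's actual code), the language can be read off the class attribute along these lines. The function name `languageFromClass` and the `"generic"` fallback value are assumptions made for this sketch.

```javascript
// Hypothetical helper, not part of Kodify's published API.
// Given a class attribute such as "kodify js", return the language name,
// "generic" if only "kodify" is present, or null if the block is not
// marked for highlighting at all.
function languageFromClass(className) {
  var names = className.split(/\s+/).filter(function (n) {
    return n !== "";
  });
  if (names.indexOf("kodify") === -1) {
    return null;
  }
  var others = names.filter(function (n) {
    return n !== "kodify";
  });
  return others.length > 0 ? others[0] : "generic";
}
```

In a browser, one would loop over the `pre` elements on the page and pass each element's `className` to a helper like this before choosing a language specification.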

I've tested this on the following browsers:
  • Internet Explorer 6.0 (I'd like to try IE 7 and 8 but don't have access to them)
  • Opera 9.6
  • Safari 3.0
  • iPhone
  • Firefox 3.0
Example output of this code:
http://w3style.co.uk/~d11wtq/kodify/demo/

Code: Select all

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xml:lang="en">
  <head>
    <title>Kodify Demo</title>
    <link rel="stylesheet" type="text/css" href="../themes/blackboard.css" />
    <script type="text/javascript" src="../js/lx_analyzer.js"></script>
    <script type="text/javascript" src="../js/kodify.js"></script>
    <script type="text/javascript" src="../js/lang/js.js"></script>
  </head>
  <body>
    <div class="intro">
      <h1>JavaScript Source Code</h1>
      <p>
        View this page with JavaScript enabled, then try it with JavaScript turned off.
      </p>
      <p>
        View the HTML source and see how clear and semantic it is.
      </p>
    </div>
    
    <div class="example">
      <h2>JavaScript</h2>
      <code>
        <pre class="kodify js">
/**
 * This is a comment.
 */
var ClassA = function ClassA(argName) {
  this.publicProperty = argName;
  
  /** @private */
  var _privateVar = 42;
  
  this.methodName = function methodName(a, b, c) {
    return window.confirm(a + b + c);
  };
  
};
 
ClassA.prototype.otherMethod = function otherMethod() {
  this.publicProperty = 0xFF;
};
 
//Strings work fine and dandy
var regex = new RegExp("Word\\s+\"moon\"");
 
//RegExp literals are detected
var regexLiteral = /Word\s+"moon"/;
 
//The / c / part of this is not detected as a regex
var x = a + b / c / d * 9;
 
//Single line comments work
doSomething(/regex here/);
 
        </pre>
      </code>
    </div>
  </body>
</html>
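
The regex-versus-division cases in the demo come down to what precedes the slash. The thread doesn't say how Kodify decides, so the following is only a sketch of one common heuristic: a `/` starts a regex literal unless the previous significant token could end an expression. The token shape (`type`, `text`) and the function name are assumptions for illustration.

```javascript
// Hypothetical helper, not Kodify's code: decide whether a "/" begins a
// regex literal by looking at the previous significant token.
// prev is null at the start of input, otherwise an object such as
// { type: "identifier", text: "b" } or { type: "punct", text: "=" }.
function slashStartsRegex(prev) {
  if (prev === null) {
    return true; // nothing before it, so it cannot be division
  }
  var valueKeywords = ["this", "null", "true", "false"];
  switch (prev.type) {
    case "identifier":
    case "number":
    case "string":
    case "regex":
      return false; // a value precedes the slash, so it divides
    case "keyword":
      // "this / 2" divides; "return /x/" starts a regex.
      return valueKeywords.indexOf(prev.text) === -1;
    case "punct":
      // After ")" or "]" the slash divides; after "(", "=", ",", "+"
      // and similar, an expression is expected, so it is a regex.
      return prev.text !== ")" && prev.text !== "]";
    default:
      return true;
  }
}
```

This heuristic has known gaps (a `/` after `}` is ambiguous, for instance), which is one reason a state-aware lexer is a better fit than a single pattern.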
 
Anybody likely to use this once I add support for lots of other languages and create new themes?

Things I definitely WILL add:
  • As many languages as I can get (I'll ask others to write the specs)
  • Heaps of themes
  • Support for bracket pairing (hover on a bracket to see the matching one)
  • Support for unobtrusive line numbering, so you can copy & paste without the line numbers
  • Support for embedded languages (such as PHP/HTML, HTML/JavaScript, HTML/CSS)
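
For the bracket pairing item in the list above, the usual approach is a stack: push each opener, and pair it off when the matching closer arrives. This is a minimal sketch under that assumption, not Kodify's implementation; `pairBrackets` is a made-up name, and it scans raw characters, whereas a real highlighter would only look at bracket tokens so that brackets inside strings and comments are ignored.

```javascript
// Hypothetical helper: map each bracket's index in the source to the
// index of its partner. Mismatched closers are left unpaired.
function pairBrackets(source) {
  var closers = { ")": "(", "]": "[", "}": "{" };
  var openers = "([{";
  var stack = []; // indexes of openers still waiting for a closer
  var pairs = {};
  for (var i = 0; i < source.length; i++) {
    var ch = source.charAt(i);
    if (openers.indexOf(ch) !== -1) {
      stack.push(i);
    } else if (Object.prototype.hasOwnProperty.call(closers, ch)) {
      var top = stack.length > 0 ? stack[stack.length - 1] : -1;
      if (top !== -1 && source.charAt(top) === closers[ch]) {
        stack.pop();
        pairs[top] = i;
        pairs[i] = top;
      }
    }
  }
  return pairs;
}
```

With the map in hand, a hover handler only has to look up the hovered bracket's index to find the one to light up.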
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Re: Kodify - New Syntax Highlighter

Post by Chris Corbyn »

Just added PHP support (visible in the demo). Too easy! :)
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: Kodify - New Syntax Highlighter

Post by josh »

Very nice, if you ran it on its own source code would the space-time continuum be corrupted?
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Re: Kodify - New Syntax Highlighter

Post by Chris Corbyn »

jshpro2 wrote: Very nice, if you ran it on its own source code would the space-time continuum be corrupted?
I've had so many near misses with raptors I'm not prepared to attempt it again 8O

No, actually I've linked to it highlighting its own source code in the Coding Critique forum.
panic!
Forum Regular
Posts: 516
Joined: Mon Jul 31, 2006 7:59 am
Location: Brighton, UK

Re: Kodify - New Syntax Highlighter

Post by panic! »

great work, so impressed mate!
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Re: Kodify - New Syntax Highlighter

Post by Chris Corbyn »

I've registered kodify.org and will get something more complete up there soon :)

To be honest I need contributors who can write language specs for the languages I don't know (and people who have an eye for good color schemes). The code is not quite ready for that yet but I'll ask when I need people :)

To come:

Code collapse (easy since I already pair up brackets, though collapsing XML/HTML is a slightly different ballgame)
Line Numbering

I'm also very curious if I could integrate (as a plugin) with TinyMCE/FCKEditor so that they act a little bit like an IDE for writing code in forums and stuff.
alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Re: Kodify - New Syntax Highlighter

Post by alex.barylski »

Chris Corbyn wrote: I'm also very curious if I could integrate (as a plugin) with TinyMCE/FCKEditor so that they act a little bit like an IDE for writing code in forums and stuff.
That is what I was just about to suggest.

I played around with a similar idea years back...first using regex...which turned out to be extremely slow as the regex was executed each time a key was pressed.

Then I considered implementing a caret tracker, so the regex was only invoked when changes were made outside of already colorized tokens. For instance, when editing inside a string which is already colored (say red) there is no need to run the regex.

To further optimize, if you could determine what text was not in the current viewport, you could avoid regex'ing all non-visible text.

I'm not sure how fast something like that would be, but I see IDEs eventually being web-based -- at least for PHP-based projects.

That was actually the intent behind my TexoCMS (http://www.sourceforge.net/projects/texocms). I wanted something like a CMS and eventually an IDE so I could build a web site using templates and manage any code changes within the browser itself.

A while back someone posted a JS project which actually did something like this...but of course I cannot find it now. :P

Cheers,
Alex
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Re: Kodify - New Syntax Highlighter

Post by Kieran Huggins »

This is totally rad.
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: Kodify - New Syntax Highlighter

Post by josh »

I was actually thinking about that... it would be cool to let clients update their templates in a JavaScript-powered IDE, not even necessarily WYSIWYG integrated... you could make it do Smarty or whatever. A while ago I made an editor for CSS; it was in PHP and didn't use AJAX, but it used dropdowns for valid attributes instead of letting (or making) the user type.
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Re: Kodify - New Syntax Highlighter

Post by Chris Corbyn »

I'm fairly sure this would be possible to do. It's probably simpler than an RTE since it doesn't have to generate HTML; in fact it would be faulty if it did generate HTML ;) The highlighted HTML view exists purely in memory, in the DOM.

The way editors do the lexing so quickly (AFAIK) is that they only operate on X lines surrounding what you're editing (and no further than the viewport). If changes don't propagate further than that then it's all good, otherwise the lexical analyzer can move to the next block of code and decide if that needs updating.
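
To make the "X lines around the edit, clipped to the viewport" idea concrete, here is a small sketch. It is an illustration of the windowing arithmetic only; the name `relexRange` and the shape of the `viewport` argument are assumptions, and it says nothing about how any particular editor does it.

```javascript
// Hypothetical helper: which lines to re-lex after an edit.
// editLine: zero-based index of the line that changed
// context:  how many lines to include on each side of the edit
// viewport: { first: n, last: m }, zero-based visible lines, inclusive
function relexRange(editLine, context, viewport) {
  var first = Math.max(editLine - context, viewport.first);
  var last = Math.min(editLine + context, viewport.last);
  if (first > last) {
    return null; // the window is entirely off-screen, nothing to repaint
  }
  return { first: first, last: last };
}
```

If the lexer's state at the end of the returned range differs from what it was before the edit, the caller would extend the range to the next block and check again.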

Knowing how TextMate works I'm not sure how many editors use proper lexical analysis though... I'm fairly sure they just do crazy regex work.

The lexical analysis routines in Kodify are "programmed" in JavaScript, wrapping what is essentially a framework for stack-based lexical analysis.

For example, to match a double quoted string (API subtly different to the public version here, but algorithm the same):

Code: Select all

//At top with other config settings
Lx.state("DOUBLE_STRING");

//Switch states when a " is seen, so now we only find tokens in the DOUBLE_STRING state
Kodify.rule('"', Lx.INITIAL).performs(function() {
  Kodify.matchedToken().class("string").append();
  Lx.PushState(Lx.DOUBLE_STRING);
});

//Copy all string contents; a double quote is only allowed here if it is escaped
Kodify.rule(/(?:\\?[^"\\]|\\\\|\\")+/, Lx.DOUBLE_STRING).performs(function() {
  Kodify.matchedToken().class("string").append();
});

//Go back to the previous state (pop the current state off the state stack) when the next " is hit
Kodify.rule('"', Lx.DOUBLE_STRING).performs(function() {
  Kodify.matchedToken().class("string").append();
  Lx.PopState();
});
 
Since a lot of this is "boilerplate" code that will be present in almost all language declarations I'll provide wrappers in either Kodify (the highlighter) or Lx (the lexical analyzer) to do this. I already provide such a wrapper for matching things like /* comments */:

Code: Select all

Kodify.rule("/*").performs(function() {
  Kodify.continueUntil("*/");
  Kodify.matchedToken().class("comment multiline").append();
});
Effectively the state stack means that you're not wasting cycles looking for tokens that cannot syntactically exist at certain points, and it also means you can distinguish, say, a function parameter from any other variable.
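
For readers who want to see the state stack idea run end to end, here is a toy lexer that mirrors the double-quoted string rules above. It is not the Lx library: `MiniLexer`, `rule`, `run` and `lexDoubleQuoted` are all invented for this sketch, and the string-body pattern is a simplified one that accepts any backslash escape.

```javascript
// A toy stack-based lexer for illustration only; every name is made up.
function MiniLexer() {
  this.rules = [];          // { pattern, state, action }
  this.stack = ["INITIAL"]; // the top of the stack is the current state
}

MiniLexer.prototype.rule = function (pattern, state, action) {
  // Re-anchor the pattern so it can only match at the current position.
  var anchored = new RegExp("^(?:" + pattern.source + ")");
  this.rules.push({ pattern: anchored, state: state, action: action });
};

MiniLexer.prototype.run = function (input) {
  var tokens = [];
  var stack = this.stack;
  var api = {
    push: function (s) { stack.push(s); },
    pop: function () { stack.pop(); },
    emit: function (cls, text) { tokens.push({ cls: cls, text: text }); }
  };
  var pos = 0;
  while (pos < input.length) {
    var state = stack[stack.length - 1];
    var rest = input.slice(pos);
    var matched = false;
    for (var i = 0; i < this.rules.length && !matched; i++) {
      var r = this.rules[i];
      if (r.state !== state) { continue; } // other states' rules are never tried
      var m = r.pattern.exec(rest);
      if (m && m[0].length > 0) {
        r.action(api, m[0]);
        pos += m[0].length;
        matched = true;
      }
    }
    if (!matched) {
      api.emit("plain", input.charAt(pos)); // pass one character through
      pos += 1;
    }
  }
  return tokens;
};

function lexDoubleQuoted(input) {
  var lx = new MiniLexer();
  lx.rule(/"/, "INITIAL", function (api, text) {
    api.emit("string", text);
    api.push("DOUBLE_STRING");
  });
  lx.rule(/(?:\\.|[^"\\])+/, "DOUBLE_STRING", function (api, text) {
    api.emit("string", text);
  });
  lx.rule(/"/, "DOUBLE_STRING", function (api, text) {
    api.emit("string", text);
    api.pop();
  });
  return lx.run(input);
}
```

While the lexer is in the `DOUBLE_STRING` state, only the two string rules are ever tested, which is the cycle saving described above.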

I've adopted a "standard" set of class names, with subclasses of those. For example, a string must be output with the class name "string" so that the theme CSS file works. But I have "string literal" and "string heredoc" too, so in the theme file ".string" is a catch-all for strings of all types, with more fine-grained rules such as ".string.literal" if you want to highlight those differently in your theme. Same goes for comments, variables and other types.
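
A sketch of what emitting such a token might look like, assuming the output is a `span` element (the thread doesn't show the generated markup, so that is a guess, as is the name `renderToken`):

```javascript
// Hypothetical helper: wrap a token in a span carrying the general class
// first and any subclass after it, e.g. "string heredoc".
function renderToken(text, baseClass, subClass) {
  // Escape the characters that would otherwise be parsed as markup.
  var escaped = text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  var classes = subClass ? baseClass + " " + subClass : baseClass;
  return '<span class="' + classes + '">' + escaped + "</span>";
}
```

A theme rule for `.string` would then style every string token, while a rule for `.string.literal` would apply only to tokens emitted with that subclass.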

I'm quite excited about the possibilities :)
jayshields
DevNet Resident
Posts: 1912
Joined: Mon Aug 22, 2005 12:11 pm
Location: Leeds/Manchester, England

Re: Kodify - New Syntax Highlighter

Post by jayshields »

Have you checked out EtherPad? It does JavaScript syntax highlighting on-the-fly.
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Re: Kodify - New Syntax Highlighter

Post by Chris Corbyn »

Hadn't seen that before, no. Just had a look and bookmarked it for later reference :)
papa
Forum Regular
Posts: 958
Joined: Wed Aug 27, 2008 3:36 am
Location: Sweden/Sthlm

Re: Kodify - New Syntax Highlighter

Post by papa »

Looks very nice Chris Corbyn!

Let me know if I can help with the themes. :)
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Re: Kodify - New Syntax Highlighter

Post by Chris Corbyn »

papa wrote: Looks very nice Chris Corbyn!

Let me know if I can help with the themes. :)
You can! The more the merrier. Let me finalize things a little more over the next couple of days (no point someone writing themes if things will change halfway through) then I'll be calling for help with themes and with new languages :)

I'm building the website at the moment so I can go public with a handful of languages that I know myself with a note on the website asking for contributors. It'd be really easy to put a WYSIWYG theme creator (written in JS) on the site too.

2-3 days.
Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Re: Kodify - New Syntax Highlighter

Post by Luke »

Hey Chris! What's the status on this thing? I have been looking for a good code highlighter plugin for WordPress and have not been able to find any decent ones. I think I'm going to turn your Kodify into a WordPress plugin, would you mind?

EDIT: I'm also going to build a few themes for it. I would like a theme that looks like the default TextMate theme.