MySQL highlighter

feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

superdezign wrote:And... if he used multiline... would he have to check against a newline character?
Nope. As long as the 's' (single-line) modifier is off, matching already ends at the new-line unless your pattern explicitly matches one.

Also be careful with multi-line and single-line comments. They can get nested.
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

Yeah, the multiline comment nesting is a problem we worked on for quite some time, but our solution was to remove matching HTML tags within the multiline comments after they're highlighted.

And '.' matches whitespace and newlines as well, so would there need to be a character class that included whitespace, but not newlines?


And, since we're on the topic, I heard of something called "lexic" (?) and tokenizing, and it sounds like it's meant for parsing and such. Would you know of any good resources where I could read up on it and its relation to regex? The few things I've read thus far are very... cryptic.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

superdezign wrote:Yeah, the multiline comment nesting is a problem we worked on for quite some time, but our solution was to remove matching HTML tags within the multiline comments after they're highlighted.
That sounds like extra work.
superdezign wrote:And '.' matches whitespace and newlines as well, so would there need to be a character class that included whitespace, but not newlines?
Not unless you use single-line mode. New-lines are added to dot's matching only by the 's' (single-line) modifier; multi-line mode ('m') just changes where ^ and $ match.
superdezign wrote:And, since we're on the topic, I heard of something called "lexic" (?) and tokenizing, and it sounds like it's meant for parsing and such. Would you know of any good resources where I could read up on it and its relation to regex? The few things I've read thus far are very... cryptic.
Yes, tokenizing and lexing both deal with parsing of arbitrary texts and binary streams.
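
On the dot question above, a minimal sketch of how the modifiers behave in PHP's PCRE functions. The sample text and variable names are made up for illustration:

```php
<?php
// Two-line sample; the names here are only for illustration.
$text = "// first comment\nSELECT 1";

// No modifiers: '.' does not match "\n", so the match stops at the line end.
preg_match('~//.*~', $text, $plain);

// 's' (single-line): '.' also matches "\n", so the match runs to the end.
preg_match('~//.*~s', $text, $dotall);

// 'm' (multi-line): only changes '^' and '$' so they also match at line breaks.
preg_match('~//.*$~m', $text, $multiline);

// Without 'm', '$' only matches at the very end of the subject,
// so a comment on an earlier line is not found at all.
$withoutM = preg_match('~//.*$~', $text);

var_dump($plain[0], $dotall[0], $multiline[0], $withoutM);
```

The last call is probably why a pattern ending in `$` without the 'm' modifier only ever catches a comment on the final line.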

edit: fixed bbcode boo-boo.
Last edited by feyd on Sun Jun 17, 2007 8:56 am, edited 1 time in total.
stereofrog
Forum Contributor
Posts: 386
Joined: Mon Dec 04, 2006 6:10 am

Post by stereofrog »

superdezign wrote: And, since we're on the topic, I heard of something called "lexic" (?) and tokenizing, and it sounds like it's meant for parsing and such. Would you know of any good resources where I could read up on it and its relation to regex? The few things I've read thus far are very... cryptic.
http://en.wikipedia.org/wiki/Lexer
http://en.wikipedia.org/wiki/Parser
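
To make the link between lexing and regex a bit less cryptic, here is a rough sketch of a regex-driven tokenizer used for highlighting. The token table, the `tokenize()` and `highlight()` names, and the span classes are all hypothetical, and a real SQL lexer would need many more rules:

```php
<?php
// Hypothetical token table for illustration only; order matters (first match wins).
$tokenPatterns = [
    'comment' => '(?://|\#)[^\r\n]*',
    'string'  => "'[^']*'",
    'number'  => '\d+',
    'word'    => '[A-Za-z_]\w*',
    'space'   => '\s+',
    'other'   => '.',
];

function tokenize(string $sql, array $tokenPatterns): array
{
    // Build one big alternation with a named group per token type.
    $parts = [];
    foreach ($tokenPatterns as $type => $pattern) {
        $parts[] = "(?P<$type>$pattern)";
    }
    $regex = '~' . implode('|', $parts) . '~s';

    preg_match_all($regex, $sql, $matches, PREG_SET_ORDER);

    $tokens = [];
    foreach ($matches as $match) {
        // Exactly one named group is non-empty per match: that is the token type.
        foreach ($tokenPatterns as $type => $_) {
            if (isset($match[$type]) && $match[$type] !== '') {
                $tokens[] = [$type, $match[$type]];
                break;
            }
        }
    }
    return $tokens;
}

function highlight(array $tokens): string
{
    $html = '';
    foreach ($tokens as [$type, $text]) {
        $safe = htmlspecialchars($text);
        // Each token is wrapped at most once, so spans can never nest.
        $html .= in_array($type, ['comment', 'string', 'number'], true)
            ? "<span class='$type'>$safe</span>"
            : $safe;
    }
    return $html;
}

echo highlight(tokenize("SELECT 42 # the 2nd comment", $tokenPatterns));
```

Because the whole comment is consumed as one token, the "2" inside it is never seen by the number rule, which sidesteps the nested-tag problem discussed earlier in the thread.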
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

Thanks. ^_^

And feyd, I just tested using '|' as a delimiter, and it works fine. :D
ziggy3000
Forum Contributor
Posts: 205
Joined: Fri Mar 23, 2007 3:04 pm

Post by ziggy3000 »

superdezign wrote:Hehe, I wasn't sure if '|' could be a delimiter, but I figured that if it couldn't, ziggy would let me know. :P

And... if he used multiline... would he have to check against a newline character?

Code: Select all

((//|#)[^\n]+?)
wouldn't [^\n] not highlight "\" and "n" in the commented line?
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

No. "\n" is one character.
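
A quick way to check this, as a small sketch (the sample string is invented):

```php
<?php
// In a double-quoted PHP string, "\n" is a single line-feed character.
$length = strlen("\n");
var_dump($length);

// In a single-quoted pattern, PCRE itself reads \n as that same one character,
// so [^\n] still matches a literal backslash and the letter "n".
preg_match('~[^\n]+~', 'path\\name', $m);
var_dump($m[0]);
```

The second call matches the whole sample string, backslash and "n" included, because the class only excludes the line-feed character.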
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

superdezign wrote:Thanks. ^_^

And feyd, I just tested using '|' as a delimiter, and it works fine. :D
... If there's no '|' used elsewhere in the pattern. I see now. ^_^
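
For anyone following along, a small sketch of what goes wrong when the delimiter also appears in the pattern. The sample line is taken from the thread; the choice of '~' as the alternative delimiter is just one option:

```php
<?php
$line = '# This is the 2nd comment';

// '|' as delimiter: the second '|' ends the pattern early and PHP reads the
// rest as modifiers, so the call fails with a warning (suppressed here with @).
$broken = @preg_match('|(//|#)[^\r\n]+|', $line);

// '~' does not occur in the pattern, so the alternation works as intended.
$ok = preg_match('~(//|#)[^\r\n]+~', $line, $m);

var_dump($broken, $ok, $m[0]);
```

Escaping the inner '|' would not help here, since it would then lose its meaning as alternation, so picking a delimiter that never appears in the pattern is the simpler route.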
ziggy3000
Forum Contributor
Posts: 205
Joined: Fri Mar 23, 2007 3:04 pm

Post by ziggy3000 »

but the third line in my function replaces \n with <br />
so it won't do anything if i have [^\n] in my regex
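
One possible way around that, sketched under the assumption that the order of the steps can be changed: highlight while the text still has real line breaks, and convert them to <br /> only at the end. The function name `highlightComments()` is hypothetical:

```php
<?php
// Hypothetical helper; the function name and span class are illustrative only.
function highlightComments(string $sql): string
{
    // 1. Highlight while the text still contains real line breaks.
    $sql = preg_replace('~(//|#)[^\r\n]*~', "<span class='comment'>\\0</span>", $sql);

    // 2. Only then turn line breaks into <br /> for display.
    return nl2br($sql);
}

echo highlightComments("SELECT 1 # first\nSELECT 2 // second");
```

Note that this simple pattern would also treat a '#' or '//' inside a quoted string as the start of a comment.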
ziggy3000
Forum Contributor
Posts: 205
Joined: Fri Mar 23, 2007 3:04 pm

Post by ziggy3000 »

here's my code now

Code: Select all

$sql = preg_replace("'((//|#)(.*))$'i", "<span class='comment'>\\1</span>", $sql);
but it still doesn't highlight numbers :(
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

What do you mean by "doesn't highlight?" Like, it stops highlighting AT the numbers, or the numbers just don't change color?
ziggy3000
Forum Contributor
Posts: 205
Joined: Fri Mar 23, 2007 3:04 pm

Post by ziggy3000 »

it stops highlighting the comments when it meets a number in that line. another preg_replace highlights that number, and then the comment highlighter continues highlighting
example:

// Test Comment number 1
# This is the 2nd comment

edit: i just tried it again. it stops highlighting after it meets a number
the code i posted before was wrong
this is the new code

Code: Select all

$sql = preg_replace("'((//|#)((\w|\s)*))'i", "<span class='comment'>\\1</span>", $sql);
now the output is
// Test Comment number 1
# This is the 2nd comment
ziggy3000
Forum Contributor
Posts: 205
Joined: Fri Mar 23, 2007 3:04 pm

Post by ziggy3000 »

if i don't use the s modifier, how else would i allow spaces in my comments?
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

Yeah, I'm pretty sure that it doesn't stop at the number, then start back again afterwards. You have nested tags again.

Edit: It stops on the number? That's odd. That's very odd.
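
One possible explanation, and this is only a guess since the full function isn't posted: if a number highlighter runs before the comment rule, the number is already wrapped in a tag, and `(\w|\s)*` cannot get past the '<'. A sketch under that assumption (the number pattern and class name are invented):

```php
<?php
$line = '// Test Comment number 1 here';

// Assumption: a number highlighter like this runs before the comment rule.
$step1 = preg_replace('~\d+~', "<span class='number'>\\0</span>", $line);

// (\w|\s)* cannot cross '<', so the comment span ends right before the number's tag.
$step2 = preg_replace("'((//|#)((\w|\s)*))'i", "<span class='comment'>\\1</span>", $step1);

echo $step2;
```

On the untouched line, `\w` matches digits too, so the comment rule by itself would not stop at a number.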
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

This works fine for me.

Code: Select all

(//|#)[^\r\n]+
It works with or without the 's' modifier. Comments run from '//' or '#' until they meet an EOL indicator. If \r\n or \n is your EOL, then this will work. If it's <br />, then use this instead:

Code: Select all

(//|#).*?<br.*?>
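
A rough sketch of both patterns plugged into preg_replace, using the sample comments from earlier in the thread. The '~' delimiter, the extra capture groups, and the variable names are my own additions for the example:

```php
<?php
$sql = "// Test Comment number 1\n# This is the 2nd comment";

// EOL is a real line break: stop at the first \r or \n.
$plain = preg_replace('~(//|#)[^\r\n]+~', "<span class='comment'>\\0</span>", $sql);

// EOL is already <br />: match lazily up to the tag, but keep the tag outside the span.
$html = "// Test Comment number 1<br /># This is the 2nd comment<br />";
$withBr = preg_replace('~((//|#).*?)(<br.*?>)~', "<span class='comment'>\\1</span>\\3", $html);

echo $plain, "\n", $withBr;
```

In both cases the digits stay inside the comment span, as long as no other highlighter has wrapped them in tags beforehand.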