Posted: Sun Jun 17, 2007 7:55 am
by feyd
superdezign wrote:And... if he used multiline... would he have to check against a newline character?
Nope. '.' doesn't match a new-line unless single-line mode is on, so matching already ends at the new-line unless your pattern explicitly matches it.
Also be careful with multi-line and single-line comments. They can get nested.
Posted: Sun Jun 17, 2007 8:02 am
by superdezign
Yeah, the multiline comment nesting is a problem we were on for quite some time, but our solution was to remove matching HTML tags within the multiline comments after they're highlighted.
And '.' matches whitespace and newlines as well, so would there need to be a character class that included whitespace, but not newlines?
And, since we're on the topic, I heard of something called "lexic" (?) and tokenizing, and it sounds like it's meant for parsing and such. Would you know of any good resources where I could read up on it and its relation to regex? The few things I've read thus far are very... cryptic.
Posted: Sun Jun 17, 2007 8:07 am
by feyd
superdezign wrote:Yeah, the multiline comment nesting is a problem we were on for quite some time, but our solution was to remove matching HTML tags within the multiline comments after they're highlighted.
That sounds like extra work.
superdezign wrote:And '.' matches whitespace and newlines as well, so would there need to be a character class that included whitespace, but not newlines?
Not in multi-line mode. New-lines are added to dot's matching in single-line mode.
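To make the modifier behaviour concrete, here is a small sketch (the sample text and variable names are made up for illustration). It assumes PHP's PCRE functions with default newline handling: 's' changes what '.' matches, while 'm' only changes where '^' and '$' can match.

```php
<?php
$text = "line one\nline two";

// No modifiers: '.' does not match "\n", so '.*' stops at the end of the first line.
preg_match('/.*/', $text, $m);

// 's' (single-line / dot-all): '.' also matches "\n", so '.*' runs to the end of the text.
preg_match('/.*/s', $text, $s);

// 'm' (multi-line): '^' and '$' match at every line boundary; '.' still stops at "\n".
preg_match_all('/^.*$/m', $text, $lines);
```

So for a comment that should end at the end of its line, leaving 's' off is usually enough, and spaces are matched by '.' either way.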
superdezign wrote:And, since we're on the topic, I heard of something called "lexic" (?) and tokenizing, and it sounds like it's meant for parsing and such. Would you know of any good resources where I could read up on it and its relation to regex? The few things I've read thus far are very... cryptic.
Yes, tokenizing and lexing both deal with parsing of arbitrary texts and binary streams.
edit: fixed bbcode boo-boo.
Posted: Sun Jun 17, 2007 8:34 am
by stereofrog
superdezign wrote:
And, since we're on the topic, I heard of something called "lexic" (?) and tokenizing, and it sounds like it's meant for parsing and such. Would you know of any good resources where I could read up on it and its relation to regex? The few things I've read thus far are very... cryptic.
http://en.wikipedia.org/wiki/Lexer
http://en.wikipedia.org/wiki/Parser
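For a feel of how tokenizing relates to regex, here is a minimal sketch of a regex-driven tokenizer. The function name `tokenize` and the token types are hypothetical, chosen only for this example. Every character of the input is consumed exactly once, by the first alternative that matches, which is why a lexer does not run into the "highlighter re-scans already highlighted text" problem.

```php
<?php
// Hypothetical tokenizer: splits text into [type, text] pairs.
// Alternatives are tried left to right, so comments win over numbers and words.
function tokenize(string $src): array
{
    $pattern = '~
        (?P<comment>(?://|\#)[^\n]*)   # line comment, up to but not including the newline
      | (?P<number>\d+)                # run of digits
      | (?P<word>[A-Za-z_]\w*)         # identifier or keyword
      | (?P<space>\s+)                 # whitespace, including newlines
      | (?P<other>.)                   # any other single character
    ~xs';

    preg_match_all($pattern, $src, $matches, PREG_SET_ORDER);

    $tokens = [];
    foreach ($matches as $m) {
        // Exactly one named group is non-empty per match; record it.
        foreach (['comment', 'number', 'word', 'space', 'other'] as $type) {
            if (isset($m[$type]) && $m[$type] !== '') {
                $tokens[] = [$type, $m[$type]];
                break;
            }
        }
    }
    return $tokens;
}

$tokens = tokenize("x = 1 # c 2\ny");
```

A highlighter built on this would wrap each token according to its type instead of running several independent replacements over the same string.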
Posted: Sun Jun 17, 2007 11:12 am
by superdezign
Thanks. ^_^
And feyd, I just tested using '|' as a delimiter, and it works fine.

Posted: Sun Jun 17, 2007 11:27 am
by ziggy3000
superdezign wrote:Hehe, I wasn't sure if '|' could be a delimiter, but I figured that if it couldn't, ziggy would let me know.
And... if he used multiline... would he have to check against a newline character?
wouldn't [^\n] not highlight "\" and "n" in the commented line?
Posted: Sun Jun 17, 2007 11:31 am
by superdezign
No. "\n" is one character.
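A quick way to check this yourself (the sample string is invented for the example): inside the pattern, \n is a single escape for the newline character, so [^\n] excludes only newlines. A literal backslash and the letter n in the text are matched like any other character.

```php
<?php
// Single-quoted PHP string: the regex engine receives backslash + n and
// reads that escape as one newline character inside the class.
$pattern = '/#[^\n]*/';

// The comment line contains a literal backslash and plenty of "n" letters.
$sql = "# backslash \\ and n are fine\nSELECT 1";

preg_match($pattern, $sql, $m);
// $m[0] holds the whole comment line and stops just before the real newline.
```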
Posted: Sun Jun 17, 2007 11:32 am
by superdezign
superdezign wrote:Thanks. ^_^
And feyd, I just tested using '|' as a delimiter, and it works fine.

... If there's no '|' used elsewhere in the pattern. I see now. ^_^
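A small sketch of the safer route, assuming nothing beyond preg_match itself: pick a delimiter that never appears in the pattern, so '|' keeps its meaning as alternation. '~' is used here only as an example of such a delimiter.

```php
<?php
$line = "SELECT 1 # trailing comment";

// '~' does not occur in the pattern, so '(//|#)' is read as "either // or #".
preg_match('~(//|#).*~', $line, $m);
// $m[0] holds the comment from '#' to the end of the line.
```

With '|' as the delimiter the same pattern would be cut off at the first inner '|', which is why it only works when the pattern contains no alternation.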
Posted: Sun Jun 17, 2007 11:35 am
by ziggy3000
but the third line in my function replaces \n with <br />
so it won't do anything if i have [^\n] in my regex
Posted: Sun Jun 17, 2007 11:47 am
by ziggy3000
here's my code now
Code:
$sql = preg_replace("'((//|#)(.*))$'i", "<span class='comment'>\\1</span>", $sql);
but it still doesn't highlight numbers

Posted: Sun Jun 17, 2007 12:00 pm
by superdezign
What do you mean by "doesn't highlight"? Like, it stops highlighting AT the numbers, or the numbers just don't change color?
Posted: Sun Jun 17, 2007 12:21 pm
by ziggy3000
it stops highlighting the comments when it meets a number in that line. another preg_replace highlights that number, and then the comment highlighter continues highlighting
example:
// Test Comment number 1
# This is the 2nd comment
edit: i just tried it again. it stops highlighting after it meets a number
the code i posted before was wrong
this is the new code
Code:
$sql = preg_replace("'((//|#)((\w|\s)*))'i", "<span class='comment'>\\1</span>", $sql);
now the output is
// Test Comment number 1
# This is the 2nd comment
Posted: Sun Jun 17, 2007 12:33 pm
by ziggy3000
if i don't use the s modifier, how else would i allow spaces in my comments?
Posted: Sun Jun 17, 2007 12:34 pm
by superdezign
Yeah, I'm pretty sure that it doesn't stop at the number, then start back again afterwards. You have nested tags again.
Edit: It stops on the number? That's odd. That's very odd.
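One way to avoid the nested-tag situation altogether is to highlight comments and numbers in a single pass, so text inside a comment is never scanned again by the number rule. This is only a sketch: the function name `highlight_sql` and the `number` class are assumptions, and it does no HTML escaping.

```php
<?php
// Hypothetical single-pass highlighter. The comment alternative comes first,
// so once a comment matches, the digits inside it are already consumed.
function highlight_sql(string $sql): string
{
    return preg_replace_callback(
        '~(?P<comment>(?://|#)[^\n]*)|(?P<number>\b\d+\b)~',
        function (array $m): string {
            if ($m['comment'] !== '') {
                return "<span class='comment'>" . $m['comment'] . "</span>";
            }
            // The comment group is empty, so the number group matched.
            return "<span class='number'>" . $m['number'] . "</span>";
        },
        $sql
    );
}

$out = highlight_sql("SELECT 1 # the 2nd comment\n// Test Comment number 1");
```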
Posted: Sun Jun 17, 2007 12:39 pm
by superdezign
This works fine for me.
It works with or without the 's' modifier. Comments exist from '//' or '#' until they meet an EOL indicator. If \r\n or \n is your EOL, then this will work. If it's <br />, then use this instead:
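As a rough sketch of the two cases described here, and not necessarily the exact patterns from this post, the following assumes the `comment` class used earlier in the thread. The variable names are made up for the example.

```php
<?php
// EOL is "\n" or "\r\n": consume everything up to, but not including, the line break.
$newlinePattern = '~(//|#)[^\r\n]*~';

// EOL is "<br />": lazily consume up to the next "<br />" (or the end of the text)
// without including it. 's' lets '.' cross any stray newlines on the way.
$brPattern = '~(//|#).*?(?=<br />|$)~s';

$replacement = "<span class='comment'>\\0</span>";

$a = preg_replace($newlinePattern, $replacement, "x # This is the 2nd comment\ny");
$b = preg_replace($brPattern, $replacement, "x # This is the 2nd comment<br />y");
```

Which of the two applies depends on whether the newline-to-<br /> replacement runs before or after the comment highlighter.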