superdezign wrote:Yeah, the multiline comment nesting is a problem we were on for quite some time, but our solution was to remove matching HTML tags within the multiline comments after they're highlighted.
That sounds like extra work.
superdezign wrote:And '.' matches whitespace and newlines as well, so would there need to be a character class that included whitespace, but not newlines?
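Not by default. The dot already matches spaces and tabs but stops at newlines; newlines are only added to the dot's matching in single-line mode (the s modifier). Multi-line mode (the m modifier) doesn't touch the dot at all, it only changes where ^ and $ match.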
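Here's a quick sketch of the difference, in case it helps. It uses Python's re module rather than whatever engine you're on (that choice is my assumption), where re.DOTALL plays the role of single-line mode and re.MULTILINE the role of multi-line mode. The sample string and variable names are made up for illustration.

```python
import re

# Sample text containing a space, a tab, and a newline (illustrative only).
text = "a b\tc\nd"

# Default: '.' matches spaces and tabs but stops at the newline.
default_match = re.match(r".*", text).group()

# DOTALL (single-line mode, the "s" modifier): '.' also matches '\n'.
dotall_match = re.match(r".*", text, re.DOTALL).group()

# MULTILINE (the "m" modifier) leaves '.' alone...
multiline_match = re.match(r".*", text, re.MULTILINE).group()

# ...and only lets ^ (and $) match at every line break.
line_starts = re.findall(r"^\w", text, re.MULTILINE)
```

So there should be no need for a special "whitespace but not newlines" class: a plain dot with the s modifier left off already behaves that way.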
superdezign wrote:And, since we're on the topic, I heard of something called "lexic" (?) and tokenizing, and it sounds like it's meant for parsing and such. Would you know of any good resources where I could read up on it and its relation to regex? The few things I've read thus far are very... cryptic.
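Yes, tokenizing and lexing are two names for the same step: splitting raw text (or a binary stream) into tokens before a parser works on them. The relation to regex is that each kind of token is usually described by a regular expression.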
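To make that less cryptic, here is a minimal sketch of a regex-driven tokenizer in Python. The token names, the patterns, and the tokenize function are all hypothetical ones I picked for the example; a highlighter would have its own set (comments, strings, keywords and so on).

```python
import re

# Hypothetical token kinds for a toy expression language.
# Order matters: earlier alternatives win when several could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),      # whitespace, including newlines, is dropped
    ("MISMATCH", r"."),    # any other single character is an error
]

# One big alternation with a named group per token kind.
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Split source into (kind, text) pairs, skipping whitespace."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise ValueError(
                f"unexpected character {match.group()!r} at {match.start()}"
            )
        tokens.append((kind, match.group()))
    return tokens


tokens = tokenize("x = 3 + y1")
```

The parser then works on the list of (kind, text) pairs instead of on raw characters, which is also why nesting problems tend to be easier to handle at that level than with one large regex.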
edit: fixed bbcode boo-boo.