
Posted: Sat Jul 15, 2006 3:17 am
by bokehman
feyd wrote:
bokehman wrote:
sweatje wrote:

Code: Select all

<[^>]+>
The trouble with that is it will find things that are not html tags.
only in malformed text and, in general, malformed tags too.
Granted, but something such as the following (which is not good practice but is valid HTML) will trip it up.

Code: Select all

<p> if(1 < 2) </p>
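To make the objection concrete, here is a minimal sketch in Python (the thread names no language, so that choice and the helper name `find_tags` are assumptions) that runs the pattern quoted above against this snippet. The second match starts at the bare `<` of the comparison and runs through to the end of the closing tag, so text that is not a tag is reported as one.

```python
import re

# The pattern quoted earlier in the thread.
NAIVE_TAG = re.compile(r"<[^>]+>")


def find_tags(text):
    """Return every substring the naive pattern treats as a tag (hypothetical helper)."""
    return NAIVE_TAG.findall(text)


sample = "<p> if(1 < 2) </p>"
print(find_tags(sample))  # ['<p>', '< 2) </p>']
```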

Posted: Sat Jul 15, 2006 7:51 am
by sweatje
bokehman wrote:
feyd wrote:
bokehman wrote:The trouble with that is it will find things that are not html tags.
only in malformed text and, in general, malformed tags too.
Granted, but something such as the following (which is not good practice but is valid HTML) will trip it up.

Code: Select all

<p> if(1 < 2) </p>
That is not valid HTML. It should be:

Code: Select all

<p> if(1 &lt; 2) </p>
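As a rough illustration of why the escaped form matters for the regex, the sketch below uses `html.escape` from the Python standard library to turn the literal `<` into `&lt;` before wrapping the text in a paragraph. The function name `wrap_in_paragraph` is made up for this example. Once the text content is escaped, the simple pattern finds only the two real tags.

```python
import html
import re

# The pattern quoted earlier in the thread.
NAIVE_TAG = re.compile(r"<[^>]+>")


def wrap_in_paragraph(text):
    """Escape &, < and > in text content, then wrap it in <p>...</p> (hypothetical helper)."""
    return "<p> " + html.escape(text) + " </p>"


markup = wrap_in_paragraph("if(1 < 2)")
print(markup)                     # <p> if(1 &lt; 2) </p>
print(NAIVE_TAG.findall(markup))  # ['<p>', '</p>']
```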

Posted: Sat Jul 15, 2006 8:19 am
by bokehman
sweatje wrote:That is not valid HTML.
Yes it is... It's completely legal, and it validates at http://validator.w3.org/ as well. Another completely legal combination that your regex would have trouble with is this:

Code: Select all

<element attribute=">">
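A short Python sketch of this second case follows. The simple pattern stops at the `>` inside the attribute value and returns a truncated tag. The `QUOTE_AWARE_TAG` pattern is only an assumed illustration, not something proposed in the thread: it consumes quoted attribute values whole and refuses a bare `<` inside a tag. It is still not an HTML parser (comments, CDATA and script bodies are ignored), so treat it as a sketch under those assumptions.

```python
import re

# The pattern quoted earlier in the thread.
NAIVE_TAG = re.compile(r"<[^>]+>")

# Hypothetical alternative: inside a tag, accept a double-quoted value,
# a single-quoted value, or any character that is not a quote or angle bracket.
QUOTE_AWARE_TAG = re.compile(r"""<(?:"[^"]*"|'[^']*'|[^'"<>])+>""")

tag = '<element attribute=">">'
print(NAIVE_TAG.findall(tag))        # ['<element attribute=">']
print(QUOTE_AWARE_TAG.findall(tag))  # ['<element attribute=">">']

# The same pattern also skips the bare "<" from the earlier example.
print(QUOTE_AWARE_TAG.findall("<p> if(1 < 2) </p>"))  # ['<p>', '</p>']
```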

Posted: Sat Jul 15, 2006 9:23 am
by sweatje
OK, it is not valid XHTML.
w3 validator wrote: Below is a list of the warning message(s) produced when validating your document.

1. Warning Line 6 column 9: character "<" is the first character of a delimiter but occurred as data.

<p> if(1 < 2) </p>

This message may appear in several cases:
* You tried to include the "<" character in your page: you should escape it as "&lt;"
* You used an unescaped ampersand "&": this may be valid in some contexts, but it is recommended to use "&amp;", which is always safe.
* Another possibility is that you forgot to close quotes in a previous tag.
Anyway, unless Robert K S has more questions related to the topic of this thread we can probably just drop it.

Posted: Sat Jul 15, 2006 9:31 am
by bokehman
I agree! It's an interesting debate from the regex point of view but has little to do with the original post.