I have some HTML code, and I want to split it into tags, words, special HTML entities (for example: &nbsp;, &copy;) and punctuation marks.
For example:
<html>
<head>
Title, bla bla
</head>
<body>
some &nbsp; text
</body>
</html>
Desired result:
Array = (
[0] = <html>
[1] = <head>
[2] = Title
[3] = ,
[4] = {space}
[5] = bla
....
How can I do this?
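One possible approach, as a rough sketch rather than a definitive answer, is a single regular expression that tries each token type in turn: tag, entity, word, whitespace, then any other single character as punctuation. The example below is in Python; the names tokenize_html and TOKEN_PARTS are made up for illustration. It also assumes that whitespace containing a line break is dropped and other whitespace becomes {space}, since the desired output above shows no token between <html> and <head>.

```python
import re

# Alternatives are tried left to right, so tags and entities win over
# the generic punctuation rule for "<" and "&".
TOKEN_PARTS = [
    r"<[^>]+>",                                               # tag, e.g. <html> or </body>
    r"&(?:#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);",  # entity, e.g. &nbsp; or &#169;
    r"\w+",                                                   # word
    r"\s+",                                                   # run of whitespace
    r"[^\w\s]",                                               # any other single character
]
TOKEN_RE = re.compile("|".join(TOKEN_PARTS))


def tokenize_html(text, space_token="{space}"):
    """Split text into tags, entities, words, spaces and punctuation marks."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        token = match.group(0)
        if token.isspace():
            # Assumption: line breaks between elements are not wanted as tokens.
            if "\n" in token:
                continue
            tokens.append(space_token)
        else:
            tokens.append(token)
    return tokens


sample = "<html>\n<head>\nTitle, bla bla\n</head>\n<body>\nsome &nbsp; text\n</body>\n</html>"
for index, token in enumerate(tokenize_html(sample)):
    print(f"[{index}] = {token}")
```

This treats everything between < and > as one tag, so it will likely mis-split comments, script blocks and attribute values that contain ">". For arbitrary real-world HTML, a proper parser (for example Python's standard html.parser module) is probably a safer starting point, with a regex like this applied only to the text nodes.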