
Getting only text from html

Posted: Mon May 07, 2012 10:11 am
by andrecj
Hello,

I am wondering if it is possible to get only the text content from this markup.

Code:

<p>thisss is a test.<img alt="" src="/userfiles/images/Fotos-0248.jpg" style="width: 300px; height: 225px; " /></p>
In this case I want to get only:

Code:

<p>thisss is a test.</p>
(This is what my database saves; I then use a query to show the content to the user.)


Regards

Re: Getting only text from html

Posted: Mon May 07, 2012 10:47 am
by requinix
But that's not "only the text content". I'm thinking you want either:
1. the root tag with its text content
2. that HTML without the <img> tag.

Re: Getting only text from html

Posted: Mon May 07, 2012 11:12 am
by andrecj
Yes, that's it. I was thinking there was a function to get only the text from an HTML tag, without the <img> tag and other tags...

While I was searching for some function (couldn't find anything...), I made this:

Code:

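function noimg($str) {
    // Remove every <img ...> tag, one at a time.
    while (($begin = stripos($str, "<img")) !== false) {
        // Look for the closing ">" so that both "<img ... />" and "<img ...>" are handled.
        $end = strpos($str, ">", $begin);
        if ($end === false) {
            break; // unterminated tag: leave the rest of the string untouched
        }
        $str = substr($str, 0, $begin) . substr($str, $end + 1);
    }
    return $str;
}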
This will only remove the images from the string...

Re: Getting only text from html

Posted: Mon May 07, 2012 9:35 pm
by requinix
Grab the inner content (inside the <p>) and strip_tags it.
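
A minimal sketch of that suggestion, using the sample markup from the first post. The variable names are only for illustration. `strip_tags()` takes an optional second argument listing the tags to keep, so it can cover both cases: plain text only, or the `<p>` wrapper with its text.

```php
<?php
// Sample markup from the original question.
$html = '<p>thisss is a test.<img alt="" src="/userfiles/images/Fotos-0248.jpg" style="width: 300px; height: 225px; " /></p>';

// Case 1: only the text content, with every tag removed.
$textOnly = strip_tags($html);

// Case 2: keep the <p> wrapper and drop every other tag, such as <img>.
$paragraphOnly = strip_tags($html, '<p>');

echo $textOnly, "\n";       // thisss is a test.
echo $paragraphOnly, "\n";  // <p>thisss is a test.</p>
```

Note that `strip_tags()` leaves the attributes of allowed tags in place, so on its own it is not a safeguard against malicious input.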

Re: Getting only text from html

Posted: Tue May 08, 2012 2:48 pm
by tr0gd0rr
If you use HTML Purifier, you can specify in its options exactly which tags and attributes to allow; everything else is stripped out. It is a full HTML parser, so it should not be fooled by malicious HTML (e.g. cross-site scripting).
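
A rough sketch of how that could look. It assumes HTML Purifier was installed with Composer (package `ezyang/htmlpurifier`) so that `vendor/autoload.php` exists. The helper name `keepParagraphsOnly` is made up for this example.

```php
<?php
// Assumption: HTML Purifier is installed via Composer and autoloadable from here.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require_once $autoload;
}

/**
 * Hypothetical helper: allow only <p> elements (without attributes)
 * and let HTML Purifier strip every other tag.
 */
function keepParagraphsOnly($html)
{
    $config = HTMLPurifier_Config::createDefault();
    $config->set('HTML.Allowed', 'p');           // whitelist of elements and attributes
    $config->set('Cache.DefinitionImpl', null);  // skip the on-disk definition cache
    $purifier = new HTMLPurifier($config);

    return $purifier->purify($html);
}
```

Calling `keepParagraphsOnly()` on the markup from the first post should drop the `<img>` and keep the paragraph. To allow more, extend the `HTML.Allowed` string, for example `'p,a[href]'`.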

Re: Getting only text from html

Posted: Wed May 09, 2012 12:24 pm
by andrecj
Hmm, all right, I'm going to test it. Thanks for your help.