How to turn a page's view source into a single string
Posted: Fri Sep 24, 2010 2:59 am
Hi
For scraping purposes, I need to remove all whitespace from the HTML source of a web page (whitespace here means newlines, tabs, spaces, etc., as shown in a browser's "view source"). This has worked on every page I have tried so far, but it fails on one page, and I cannot work out which special characters or entities are producing the remaining space. Can anybody help me remove all the whitespace between words/characters in an HTML page source?
Here is the URL:
$url = "http://www.dsebd.org/margin_maintenance.htm";
$data = [the page at $url fetched with curl; I am omitting the curl code here]
This is the code I currently use to remove all whitespace:
$entity = array("\t","\n","\r","\x20\x20","\0","\x0B");
$content = str_replace($entity,"",html_entity_decode($data));
Please look at the view source of that page and tell me what I should add to the $entity array to remove all line breaks and whitespace, so that the source becomes one continuous string.
Regards
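For reference, here is a minimal sketch of one way this might be approached. It is not verified against the page above. It assumes $data is UTF-8, and the function name strip_all_whitespace is hypothetical. Two things in the str_replace approach could leave whitespace behind: "\x20\x20" only matches pairs of spaces, so a lone space survives, and html_entity_decode turns &nbsp; into a non-breaking space (U+00A0), which is not in the $entity array. A regular expression with an explicit character class covers both cases:

```php
<?php
// Hypothetical helper, not from the original post.
// Assumes the input is UTF-8; with the /u modifier, preg_replace()
// returns null on invalid UTF-8, in which case the decoded text is
// returned unchanged.
function strip_all_whitespace(string $html): string
{
    // Decode entities to UTF-8 so that &nbsp; becomes U+00A0.
    $decoded = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // \s covers space, tab, newline, carriage return, form feed and
    // vertical tab. U+00A0 (non-breaking space), U+200B (zero-width space)
    // and U+FEFF (byte order mark) are listed explicitly.
    $stripped = preg_replace('/[\s\x{00A0}\x{200B}\x{FEFF}]+/u', '', $decoded);

    return $stripped === null ? $decoded : $stripped;
}

// Usage: prints <td>ABC</td>
echo strip_all_whitespace("<td>\n\tA&nbsp;B  C\r\n</td>");
```

If the page is not UTF-8, it would need converting first (for example with mb_convert_encoding), since the /u modifier rejects invalid UTF-8 input.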

