SOLVED rid ssl encoded html page of cryptic codes

Posted: Thu Jun 30, 2005 4:34 am
by frk-
I need some help for this task:
I'm retrieving an HTML page over an SSL connection and analyzing it. The page comes out all right in a browser, but when I GET it with PHP, new lines are mixed into it, each containing nothing but one of these syllable-like codes:

fed, ffb, dde, 2b, 396, 397.

These make it difficult to analyze the code. I tried filtering them, but more might come up, so I'd rather not hardcode them and instead get rid of them altogether.

At least, I'd like to know where they come from in the first place. Who can tell?

Here comes the gravy: because the page is .htaccess protected, I'm not fopen()-ing it. Instead I use ini_set to feign a browser, fsockopen an SSL connection (with openssl compiled into and loaded by my PHP), and use fputs to log in with username and password.

Posted: Thu Jun 30, 2005 5:57 am
by Chris Corbyn
Read up on transfer encoding methods... the page is being served over HTTP/1.1 with chunked transfer encoding ;) The codes you're seeing are the hexadecimal sizes of the chunks that follow them.

I have a script (written by Burrito) which works through this using fsockopen, but I'm at work and don't have the relevant parts here now :) It has nothing to do with the SSL...

works fine after changing HTTP 1.1 to 1.0
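
For anyone who has to stay on HTTP/1.1, here is a minimal sketch of what decoding a chunked body involves. It is written in Python rather than PHP purely for illustration, it is not Burrito's script, and the function name decode_chunked is made up for this example. It assumes the response headers have already been stripped off and that the whole body is in memory.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble a chunked HTTP/1.1 body into the original payload."""
    decoded = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a size line: hex digits, an optional
        # ";extension", then CRLF. These are the "cryptic codes".
        line_end = body.index(b"\r\n", pos)
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)  # e.g. "fed" is 4077 bytes
        pos = line_end + 2
        if size == 0:
            break  # a zero-size chunk ends the body; any trailers are ignored
        decoded += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and the CRLF that follows it
    return bytes(decoded)
```

Calling decode_chunked(b"5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n") gives back b"Hello, world". Filtering out lines that look like hex codes is fragile because a chunk boundary can fall anywhere, including in the middle of a line of HTML, so reading exactly the announced number of bytes is the safer approach.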

Posted: Thu Jun 30, 2005 6:26 am
by frk-
Thank you for pointing out that the page is chunked.
I read up on it at
http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html
and found
"A server MUST NOT send transfer-codings to an HTTP/1.0 client."
So I changed the headers I was sending, switching the HTTP version number from 1.1 to 1.0.
No more odd codes.
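
A rough sketch of what such a request looks like, for later readers. It is in Python rather than PHP for illustration only, and it is not the exact code used here: the helper name build_request, the host, and the path are hypothetical, and it assumes the .htaccess protection is HTTP Basic authentication.

```python
import base64


def build_request(host: str, path: str, username: str, password: str) -> bytes:
    """Build an HTTP/1.0 GET request with a Basic auth header (assumed scheme)."""
    # Basic auth sends "username:password" base64-encoded.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.0",  # version 1.0, so the server must not send chunked
        f"Host: {host}",
        f"Authorization: Basic {token}",
        "Connection: close",
        "",  # the two empty strings produce the blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

The resulting bytes are what would be written to the socket once the SSL connection is open. The only change that matters for the chunking problem is the version on the first line.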

Posted: Thu Jun 30, 2005 6:44 am
by Roja
Please edit the topic to read "SOLVED". Thanks.