Send binary data via GET

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US


Post by Ambush Commander »

I am currently writing code to validate URIs. While working on it, I realized that percent-encoding would, theoretically speaking, let people insert binary data into a URI: for example, %00.
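To make the point concrete, here is a minimal Python sketch of what percent-decoding does to such a URI component. The sample string is a made-up value for illustration; `unquote_to_bytes` is the standard-library decoder that returns raw octets rather than text.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical query component: a NUL byte, a lone 0xFF byte,
# and the two-byte UTF-8 sequence for "é".
encoded = "name=%00%FF%C3%A9"

# Percent-decoding maps each %XX escape to one raw octet,
# so any byte value from 0x00 to 0xFF can appear in the result.
raw = unquote_to_bytes(encoded)
print(raw)
```

Nothing in the decoding step itself distinguishes text from arbitrary bytes; that distinction only appears when the octets are interpreted under some character encoding.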

W3C's document on this matter is ambiguous: it does not mention binary data specifically. It does comment on non-ASCII characters, but only in the context of actual character encodings.

This leads me to believe that binary data is not meant to be transferred via HTTP GET. This also implies that the URI should be well-formed UTF-8 after decoding everything. How interesting. Of course, it could be that none of this really matters.

What do you think?
aaronhall
DevNet Resident
Posts: 1040
Joined: Tue Aug 13, 2002 5:10 pm
Location: Back in Phoenix, missing the microbrews

Post by aaronhall »

feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

I know binary data is sent via URL all the time. Why? Take a look at the hacking attempts in your server request logs. :)
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Okay, so we've pretty well established that it is possible to send binary data through URLs. Is it desirable behaviour, though? Is it malicious enough to warrant implementing checks against?
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Ambush Commander wrote: Is it desirable behaviour, though?
Arbitrary binary, absolutely not. However, because character encodings can hit most, if not all, byte values, it certainly could be, technically.
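A rough Python illustration of that last point, using only built-in codecs. The helper name `is_well_formed_utf8` is made up for this sketch. It shows that a single-byte encoding such as Latin-1 assigns a character to every byte value, and that even under UTF-8 a NUL byte is perfectly well-formed, while a lone 0xFF is not.

```python
# Every one of the 256 byte values decodes under Latin-1,
# so "valid in some character encoding" excludes nothing by itself.
latin1_text = bytes(range(256)).decode("latin-1")

def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True

# NUL (from %00) is a legal one-byte UTF-8 sequence;
# 0xFF (from %FF) can never appear in well-formed UTF-8.
print(len(latin1_text), is_well_formed_utf8(b"\x00"), is_well_formed_utf8(b"\xff"))
```

So an encoding check narrows what gets through, but control characters such as NUL would need a separate rule if they are unwanted.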
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Well, as I said, I'd be checking for well-formedness of the character encoding. I guess this makes things pretty clear. ::sigh::
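For reference, a well-formedness check of the kind described could look roughly like the Python sketch below. The function name `decodes_to_utf8` and the `allow_controls` flag are assumptions made for illustration, not anything from an existing validator; the control-character rule is an optional extra on top of the encoding check.

```python
from urllib.parse import unquote_to_bytes

def decodes_to_utf8(component: str, allow_controls: bool = False) -> bool:
    """Percent-decode a URI component and test the octets as strict UTF-8.

    Hypothetical helper: rejects malformed UTF-8, and by default also
    rejects C0 control characters and DEL after decoding.
    """
    raw = unquote_to_bytes(component)
    try:
        text = raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    if not allow_controls:
        # Well-formed UTF-8 still permits NUL and other controls,
        # so they are screened out separately here.
        if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in text):
            return False
    return True

print(decodes_to_utf8("caf%C3%A9"))  # UTF-8 "é"
print(decodes_to_utf8("%FF"))        # lone 0xFF byte
print(decodes_to_utf8("%00"))        # NUL, rejected by the control rule
```

Python's strict decoder also refuses overlong forms such as %C0%80, which matters if the goal is to block alternate spellings of the same character.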