difficulty with binary files in general

cj5
Forum Commoner
Posts: 60
Joined: Tue Jan 17, 2006 3:38 pm
Location: Long Island, NY, USA

Post by cj5 »

feyd | This thread was split from here.


I have been having the same problem with SHP files for generating map images. I think my dilemma is how to determine the format string for unpacking certain data. For example, I know that "V" is the format for unpacking an "unsigned long (always 32 bit, little endian byte order)", but in one code example I was looking at:

Code:

unpack("V1V", substr($this->_content, 32, 4));
What is the "1V" for? Why is it needed to unpack this section of the data string? How did the developer arrive at that format string? Are there any applications out there that would help me visualize the binary data, so that I can unpack data strings successfully?

CJ...
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

In "V1V", the first V is the format code, 1 is the repeat count, and the second V is the name to give the entry in the returned array.

Code:

[feyd@home]>php -r "$a = ''; for($i = 0; $i < 8; $i++) $a .= chr(mt_rand(0,127)); var_dump(unpack('V', $a));"
array(1) {
  [1]=>
  int(1550713455)
}

[feyd@home]>php -r "$a = ''; for($i = 0; $i < 8; $i++) $a .= chr(mt_rand(0,127)); var_dump(unpack('V1V', $a));"
array(1) {
  ["V"]=>
  int(391264100)
}

[feyd@home]>php -r "$a = ''; for($i = 0; $i < 8; $i++) $a .= chr(mt_rand(0,127)); var_dump(unpack('V2V', $a));"
array(2) {
  ["V1"]=>
  int(1131479379)
  ["V2"]=>
  int(1310208373)
}
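
To make the naming rule easier to see, here is a small sketch that uses fixed bytes instead of random ones, so the results are repeatable. The variable names and the "first"/"second" labels are made up for illustration; the last call assumes you want a separate name per value and uses the "/" separator that unpack() accepts between format elements.

```php
<?php
// Build 8 known bytes: two 32-bit little-endian unsigned integers (1 and 2).
$data = pack('V2', 1, 2);

// No name given: the single value lands under the numeric key 1.
$plain = unpack('V', $data);                // [1 => 1]

// Code "V", repeat count 1, name "V": one value under the key "V".
$named = unpack('V1V', $data);              // ['V' => 1]

// Repeat count 2 with the name "V": a counter is appended to each key.
$pair = unpack('V2V', $data);               // ['V1' => 1, 'V2' => 2]

// Separate elements with "/" to give each value its own name.
$labeled = unpack('Vfirst/Vsecond', $data); // ['first' => 1, 'second' => 2]

var_dump($plain, $named, $pair, $labeled);
```

A descriptive name per field tends to be easier to read later than "V1" and "V2", though either form works.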

Post by cj5 »

feyd,

Thanks, that was very helpful. Cleared up a lot for me. Seems the manual doesn't go into much detail about the pack/unpack features. What's the best way to analyze a binary file like this, so I don't have to look at ASCII in the terminal?

Post by feyd »

I'd use a text editor that supports opening files in binary mode (often displays the hex alongside the character data).

The one I'm thinking of specifically is TextPad.

Post by cj5 »

So, I downloaded and opened a file with TextPad in binary format. How do I determine where I need to extract the information?

Post by feyd »

On the far left is the starting byte offset of each line. Each line is 16 bytes. Find the starting line, then count however many bytes in from the start of that line and add that to the line's offset. That'll be your starting offset.
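
As a rough sketch of that counting step in PHP: the offset is the number at the start of the line plus how many bytes you count in, with the first byte on the line counted as 0. The line label (20) and the count (10) are invented for illustration, and $content is stand-in data rather than a real file.

```php
<?php
// Hypothetical reading from the hex view: the line labelled 20 (hex), 10 bytes in.
$lineStart = hexdec('20');        // 32 in decimal
$bytesIn   = 10;                  // the first byte on the line counts as 0
$offset    = $lineStart + $bytesIn;

// Stand-in data: 64 bytes, each byte's value equal to its own position.
$content = implode('', array_map('chr', range(0, 63)));

// Read 4 bytes at that offset as a 32-bit little-endian unsigned integer.
$fields = unpack('Vvalue', substr($content, $offset, 4));

printf("offset %d, value 0x%08X\n", $offset, $fields['value']);
// prints: offset 42, value 0x2D2C2B2A
```

The bytes at positions 42 to 45 are 2A 2B 2C 2D, and "V" reads them lowest byte first, which is why the printed value shows them in reverse order.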

Post by cj5 »

Is there a good resource online to go to and find out how to do this? I'm still not understanding your description. I mean, how does 2A represent my byte offset? Is that the hexadecimal representation of a number of bytes? And what starting position is the offset measured from? Please be patient with me, I'm a bitwise noob.

Post by feyd »

Yes, the numbers are in hex. The leftmost column (offsets) is measured from the start of the file. The middle column is the binary data in hex (two hex digits for each byte). The right column is a printable version of the bytes. So 2A is 42 bytes from the beginning of the file, counting the first byte as the zeroth.
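
If it helps to see the same three-column layout produced from PHP itself, here is a minimal sketch. hexView() is a made-up helper name, not a built-in function, and the exact column spacing in a real editor may differ.

```php
<?php
// Hypothetical helper (not part of PHP): offset / hex bytes / printable text.
function hexView(string $bytes): string
{
    $out = '';
    foreach (str_split($bytes, 16) as $i => $chunk) {
        // Left column: offset of this line from the start of the data, in hex.
        $offset = sprintf('%08X', $i * 16);
        // Middle column: two hex digits for each byte.
        $hex = implode(' ', str_split(bin2hex($chunk), 2));
        // Right column: printable ASCII, with "." standing in for anything else.
        $text = preg_replace('/[^\x20-\x7E]/', '.', $chunk);
        $out .= sprintf("%s  %-47s  %s\n", $offset, $hex, $text);
    }
    return $out;
}

echo hexView("PHP unpack demo\x00\x01\x02 more bytes");
// For a real file, something like: echo hexView(file_get_contents($path));
```

Each output line covers 16 bytes, so the second line starts at offset 00000010 (16 in decimal), matching the layout described above.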