PHP Torrent Decoder

Posted: Wed Oct 06, 2010 12:11 am
by s.dot
This class will parse torrent files and decode them into an associative array of usable information.

torrent_decoder.class.php

Code:

<?php

/**
 * @Author: Scott Martin <sjm.dev1[at]gmail[dot]com>
 * @Filename: torrent_decoder.class.php
 * @Date: October 5th, 2010
 *
 * -- Description:
 * This is a torrent decoder class used to extract .torrent files into an
 * associative array of useable info.
 *
 * -- Usage:
 * require_once 'torrent_decoder.class.php';
 * $decoder = new torrent_decoder('path/to/file.torrent');
 * $torrent = $decoder->decode();
 * //print_r($torrent); //show all of the info provided by the torrent file
 *
 * -- Access Info:
 * $torrent now contains an array of useful info, for example
 * echo $torrent['announce']; //prints the announce URL
 *
 * @License: GNU GPL V3
 -   Copyright (C) <2010>  <Scott Martin>
 -
 - This program is free software: you can redistribute it and/or modify
 - it under the terms of the GNU General Public License as published by
 - the Free Software Foundation, either version 3 of the License, or
 - (at your option) any later version.
 -
 - This program is distributed in the hope that it will be useful,
 - but WITHOUT ANY WARRANTY; without even the implied warranty of
 - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 - GNU General Public License for more details.
 -
 - You should have received a copy of the GNU General Public License
 - along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
 
class torrent_decoder
{
    private $contents = '';
    private $pos = 0;
    
    /**
     * When instantiated, the raw contents of the .torrent file are held
     * in class member $contents.
     *
     * @access public
     * @param $file - filename of torrent
     * @return void
     */
    function __construct($file)
    {
        $this->contents = @file_get_contents($file);
    }
    
    /**
     * Starts the decoding method(s).
     * Throws exception if contents cannot be opened, is empty, or file cannot
     * be found.
     *
     * @access public
     * @param void
     * @return array
     */
    function decode()
    {
        if (empty($this->contents))
        {
            throw new Exception('Torrent file is empty, cannot be opened, or cannot be found.');
        }
        
        $ret = $this->doChar();
        return $ret;
    }
    
    /**
     * Processes character at internal pointer position to check for an identifier.
     * Possible identifiers are 'd', 'l', 'i', and 0-9
     * Throws exception if an unknown character identifier is found.
     *
     * @access private
     * @param void
     * @return mixed
     */
    private function doChar()
    {    
        while ($this->contents[$this->pos] != 'e')
        {
            if ($this->contents[$this->pos] == 'd')
            {
                return $this->doDict();
            }
            elseif ($this->contents[$this->pos] == 'l')
            {
                return $this->doList();
            }
            elseif ($this->contents[$this->pos] == 'i')
            {
                return $this->doInt();
            }
            else
            {
                if (is_numeric($this->contents[$this->pos]))
                {
                    return $this->doString();
                } else
                {
                    throw new Exception('Unknown character \'' . $this->contents[$this->pos] . '\' at position ' . $this->pos);
                }
            }
        }
    }
    
    /**
     * Processes dictionary 'd' identifier.
     *
     * @access private
     * @param void
     * @return array
     */
    private function doDict()
    {
        $ret = array();
        $this->pos++;

        while ($this->contents[$this->pos] != 'e')
        {
            $key = $this->doString();

            if ($this->contents[$this->pos] == 'd')
            {
                $ret[$key] = $this->doDict();
            }
            elseif ($this->contents[$this->pos] == 'l')
            {
                $ret[$key] = $this->doList();
            }
            elseif ($this->contents[$this->pos] == 'i')
            {
                $ret[$key] = $this->doInt();
            } else
            {
                if (is_numeric($this->contents[$this->pos]))
                {
                    $ret[$key] = $this->doString();
                } else
                {
                    throw new Exception('Unknown character \'' . $this->contents[$this->pos] . '\' at position ' . $this->pos);
                }
            }
        }
        
        $this->pos++;
        
        return $ret;
    }
    
    /**
     * Processes strings found.
     *
     * @access private
     * @param void
     * @return string
     */
    private function doString()
    {
        $strlen = '';
        
        while (is_numeric($this->contents[$this->pos]))
        {
            $strlen .= $this->contents[$this->pos];
            $this->pos++;
        }
        
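        // A length prefix must be followed by ':'; anything else is malformed.
        if (!isset($this->contents[$this->pos]) || $this->contents[$this->pos] != ':')
        {
            throw new Exception('Expected \':\' after string length at position ' . $this->pos);
        }
        
        $this->pos++;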
        
        $strlen = intval($strlen);
        $str = substr($this->contents, $this->pos, $strlen);
        $this->pos = $this->pos + $strlen;
        
        return $str;
    }
    
    /**
     * Processes list 'l' identifiers and returns an array of 
     * items found in the list.
     *
     * @access private
     * @param void
     * @return array
     */
    private function doList()
    {
        $ret = array();
        $this->pos++;
        
        while ($this->contents[$this->pos] != 'e')
        {
            $ret[] = $this->doChar();
        }
        
        $this->pos++;

        return $ret;
    }
    
    /**
     * Processes integer 'i' identifier.
     *
     * @access private
     * @param void
     * @return integer
     */
    private function doInt()
    {
        $this->pos++;
        $int = '';
        
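        // Stop at the end of the input too, so a truncated integer cannot loop forever.
        while (isset($this->contents[$this->pos]) && $this->contents[$this->pos] != 'e')
        {
            $int .= $this->contents[$this->pos];
            $this->pos++;
        }
        
        if (!isset($this->contents[$this->pos]))
        {
            throw new Exception('Unterminated integer at position ' . $this->pos);
        }
        
        $int = intval($int);
        $this->pos++;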
        
        return $int;
    }
}
Usage

Code:

<?php

require_once 'torrent_decoder.class.php';
$decoder = new torrent_decoder('path/to/file.torrent');
$torrent =  $decoder->decode();
Sample Output
I created a torrent file from a digital picture folder, and this is the output:

Code:

Array
(
    [announce] => http://w1.example.com/track/announce
    [announce-list] => Array
        (
            [0] => Array
                (
                    [0] => http://w1.example.com/track/announce
                    [1] => http://w2.example.com/track/announce
                )

        )

    [comment] => created by scott
    [created by] => uTorrent/2040
    [creation date] => 1286340941
    [encoding] => UTF-8
    [info] => Array
        (
            [files] => Array
                (
                    [0] => Array
                        (
                            [length] => 361660
                            [path] => Array
                                (
                                    [0] => 100_0806.JPG
                                )

                        )

                    [1] => Array
                        (
                            [length] => 380832
                            [path] => Array
                                (
                                    [0] => 100_0807.JPG
                                )

                        )

                    [2] => Array
                        (
                            [length] => 375960
                            [path] => Array
                                (
                                    [0] => 100_0808.JPG
                                )

                        )

                    [3] => Array
                        (
                            [length] => 390620
                            [path] => Array
                                (
                                    [0] => 100_0809.JPG
                                )

                        )

                    [4] => Array
                        (
                            [length] => 187995
                            [path] => Array
                                (
                                    [0] => 100_0811.JPG
                                )

                        )

                    [5] => Array
                        (
                            [length] => 337143
                            [path] => Array
                                (
                                    [0] => 100_0812.JPG
                                )

                        )

                    [6] => Array
                        (
                            [length] => 401504
                            [path] => Array
                                (
                                    [0] => 100_0813.JPG
                                )

                        )

                    [7] => Array
                        (
                            [length] => 403132
                            [path] => Array
                                (
                                    [0] => 100_0814.JPG
                                )

                        )

                    [8] => Array
                        (
                            [length] => 407540
                            [path] => Array
                                (
                                    [0] => 100_0815.JPG
                                )

                        )

                    [9] => Array
                        (
                            [length] => 394160
                            [path] => Array
                                (
                                    [0] => 100_0816.JPG
                                )

                        )

                    [10] => Array
                        (
                            [length] => 350944
                            [path] => Array
                                (
                                    [0] => 100_0817.JPG
                                )

                        )

                    [11] => Array
                        (
                            [length] => 362988
                            [path] => Array
                                (
                                    [0] => 100_0818.JPG
                                )

                        )

                    [12] => Array
                        (
                            [length] => 412444
                            [path] => Array
                                (
                                    [0] => 100_0819.JPG
                                )

                        )

                    [13] => Array
                        (
                            [length] => 395276
                            [path] => Array
                                (
                                    [0] => 100_0820.JPG
                                )

                        )

                    [14] => Array
                        (
                            [length] => 356812
                            [path] => Array
                                (
                                    [0] => 100_0821.JPG
                                )

                        )

                    [15] => Array
                        (
                            [length] => 371620
                            [path] => Array
                                (
                                    [0] => 100_0822.JPG
                                )

                        )

                    [16] => Array
                        (
                            [length] => 363076
                            [path] => Array
                                (
                                    [0] => 100_0823.JPG
                                )

                        )

                    [17] => Array
                        (
                            [length] => 315440
                            [path] => Array
                                (
                                    [0] => 100_0824.JPG
                                )

                        )

                    [18] => Array
                        (
                            [length] => 351548
                            [path] => Array
                                (
                                    [0] => 100_0825.JPG
                                )

                        )

                    [19] => Array
                        (
                            [length] => 337143
                            [path] => Array
                                (
                                    [0] => 100_0826.JPG
                                )

                        )

                    [20] => Array
                        (
                            [length] => 934372
                            [path] => Array
                                (
                                    [0] => 100_0827.JPG
                                )

                        )

                    [21] => Array
                        (
                            [length] => 949040
                            [path] => Array
                                (
                                    [0] => 100_0828.JPG
                                )

                        )

                    [22] => Array
                        (
                            [length] => 936032
                            [path] => Array
                                (
                                    [0] => 100_0829.JPG
                                )

                        )

                    [23] => Array
                        (
                            [length] => 963268
                            [path] => Array
                                (
                                    [0] => 100_0830.JPG
                                )

                        )

                    [24] => Array
                        (
                            [length] => 962300
                            [path] => Array
                                (
                                    [0] => 100_0831.JPG
                                )

                        )

                    [25] => Array
                        (
                            [length] => 935208
                            [path] => Array
                                (
                                    [0] => 100_0832.JPG
                                )

                        )

                    [26] => Array
                        (
                            [length] => 908672
                            [path] => Array
                                (
                                    [0] => 100_0833.JPG
                                )

                        )

                    [27] => Array
                        (
                            [length] => 366944
                            [path] => Array
                                (
                                    [0] => 100_0834.JPG
                                )

                        )

                    [28] => Array
                        (
                            [length] => 440992
                            [path] => Array
                                (
                                    [0] => 100_0835.JPG
                                )

                        )

                    [29] => Array
                        (
                            [length] => 379968
                            [path] => Array
                                (
                                    [0] => 100_0836.JPG
                                )

                        )

                    [30] => Array
                        (
                            [length] => 390580
                            [path] => Array
                                (
                                    [0] => 100_0837.JPG
                                )

                        )

                    [31] => Array
                        (
                            [length] => 415060
                            [path] => Array
                                (
                                    [0] => 100_0838.JPG
                                )

                        )

                    [32] => Array
                        (
                            [length] => 370880
                            [path] => Array
                                (
                                    [0] => 100_0839.JPG
                                )

                        )

                    [33] => Array
                        (
                            [length] => 452056
                            [path] => Array
                                (
                                    [0] => 100_0840.JPG
                                )

                        )

                    [34] => Array
                        (
                            [length] => 288100
                            [path] => Array
                                (
                                    [0] => 100_0841.JPG
                                )

                        )

                    [35] => Array
                        (
                            [length] => 458384
                            [path] => Array
                                (
                                    [0] => 100_0842.JPG
                                )

                        )

                    [36] => Array
                        (
                            [length] => 455640
                            [path] => Array
                                (
                                    [0] => 100_0843.JPG
                                )

                        )

                )

            [name] => DC
            [piece length] => 65536
            [pieces] => [really long string here]
            [private] => 1
        )

)
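As a rough illustration of working with the decoded array (this is not part of the class above), the sketch below totals the file sizes and joins each path list into a relative path. The $torrent array is written by hand here and trimmed to two entries from the output above; the helper names torrent_total_size() and torrent_file_paths() are made up for this example.

```php
<?php

// Hand-written stand-in for the array returned by decode(); only two of the
// files from the sample output are included.
$torrent = array(
    'info' => array(
        'files' => array(
            array('length' => 361660, 'path' => array('100_0806.JPG')),
            array('length' => 380832, 'path' => array('100_0807.JPG')),
        ),
        'name' => 'DC',
    ),
);

// Hypothetical helper: sum the 'length' of every file entry.
function torrent_total_size(array $torrent)
{
    $total = 0;
    foreach ($torrent['info']['files'] as $file) {
        $total += $file['length'];
    }
    return $total;
}

// Hypothetical helper: each 'path' is a list of path components, so join them
// under the torrent's top-level directory name.
function torrent_file_paths(array $torrent)
{
    $paths = array();
    foreach ($torrent['info']['files'] as $file) {
        $paths[] = $torrent['info']['name'] . '/' . implode('/', $file['path']);
    }
    return $paths;
}

echo torrent_total_size($torrent), "\n";
echo implode("\n", torrent_file_paths($torrent)), "\n";
```

Single-file torrents normally carry a 'length' directly under 'info' instead of a 'files' list, so a real helper would need to check which layout it was given.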
This was very difficult to do since I started by just analyzing the torrent file syntax by myself. But then I found a specification and that made it a lot easier. I'm sure there are probably some things I could do better though.

I'm intrigued by torrents (I'm late catching on, I know :P), and I plan to make a tracker and figure out all the guts of everything and how it works. This was the first step.

Re: PHP Torrent Decoder

Posted: Thu Oct 07, 2010 1:38 pm
by Jonah Bron
That's pretty cool. The code is short and sweet. Because I'm not familiar with the Torrent file syntax, I can't really critique the parsing. It would be an interesting challenge to create a PHP torrent download program...

I've only downloaded via torrent like, two times... after seeing your parser I just had to read the Wikipedia page on it :)

Re: PHP Torrent Decoder

Posted: Thu Oct 07, 2010 2:44 pm
by s.dot
Jonah Bron wrote:That's pretty cool. The code is short and sweet. Because I'm not familiar with the Torrent file syntax, I can't really critique the parsing.
Thank you!

I believe the format is called bencode, so my parser would be a bdecode class.
The syntax is pretty simple; I just found it a bit odd.
Specification is here: http://wiki.theory.org/BitTorrentSpecification
Jonah Bron wrote:It would be an interesting challenge to create a PHP torrent download program...
Probably impossible. It would be more suited to a desktop language like C. I'm not going to try, lol; it seems very complicated. However, I would like to make a tracker and then distribute a torrent amongst a few of my buddies, just to see that I could do it and watch it work.

I'm also working on a recursive function to do the same thing the code in the OP does.. but recursion is tricking me today.
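For reference, a recursive version might look roughly like the sketch below. It is not the class from the OP: bdecode_value() is a hypothetical function name, it takes the raw string plus an offset passed by reference, and apart from a few basic checks it assumes well-formed input.

```php
<?php

// Hypothetical recursive bencode decoder: decodes one value starting at $pos
// and advances $pos past it.
function bdecode_value($data, &$pos)
{
    if (!isset($data[$pos])) {
        throw new Exception('Unexpected end of data at position ' . $pos);
    }

    $char = $data[$pos];

    if ($char === 'i') {                       // integer: i<digits>e
        $end = strpos($data, 'e', $pos);
        if ($end === false) {
            throw new Exception('Unterminated integer at position ' . $pos);
        }
        $value = intval(substr($data, $pos + 1, $end - $pos - 1));
        $pos = $end + 1;
        return $value;
    }

    if ($char === 'l' || $char === 'd') {      // list: l...e, dictionary: d...e
        $pos++;
        $items = array();
        while (isset($data[$pos]) && $data[$pos] !== 'e') {
            if ($char === 'd') {
                $key = bdecode_value($data, $pos);   // dictionary keys are strings
                $items[$key] = bdecode_value($data, $pos);
            } else {
                $items[] = bdecode_value($data, $pos);
            }
        }
        if (!isset($data[$pos])) {
            throw new Exception('Unterminated list or dictionary');
        }
        $pos++;                                // skip the closing 'e'
        return $items;
    }

    if (ctype_digit($char)) {                  // string: <length>:<bytes>
        $colon = strpos($data, ':', $pos);
        if ($colon === false) {
            throw new Exception('Missing colon after string length at position ' . $pos);
        }
        $length = intval(substr($data, $pos, $colon - $pos));
        $value = substr($data, $colon + 1, $length);
        $pos = $colon + 1 + $length;
        return $value;
    }

    throw new Exception("Unknown character '" . $char . "' at position " . $pos);
}

$pos = 0;
$decoded = bdecode_value('d3:agei30e4:listl1:a1:bee', $pos);
print_r($decoded);
```

The recursion mirrors the format: lists and dictionaries simply call the same function for each element until they reach their closing 'e'.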

Re: PHP Torrent Decoder

Posted: Thu Oct 07, 2010 3:16 pm
by pickle
Not much to say about the parsing because, ya - no experience with it.

I would recommend a slight change in how the class is used though. Since it's a single-use class, why not make it static? Also, I'd use camel case for the class name:

Code:

require_once 'torrent_decoder.class.php';
$torrent = TorrentDecoder::decode('path/to/file.torrent');

Re: PHP Torrent Decoder

Posted: Thu Oct 07, 2010 3:22 pm
by Jonah Bron
Interesting idea. Another option would be, instead of being a (static) decoding class, turn it into a Torrent object. Maybe you could even add the functionality to encode a torrent?
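A minimal sketch of what the encoding side could look like, assuming the decoded data is plain PHP integers, strings and arrays. bencode_value() is a hypothetical name, and the rule used to tell lists from dictionaries (sequential 0..n-1 keys mean a list, an empty array is treated as a list) is an assumption of this example, not something from the OP's class.

```php
<?php

// Hypothetical encoder: turns PHP integers, strings and arrays back into bencode.
function bencode_value($value)
{
    if (is_int($value)) {
        return 'i' . $value . 'e';
    }

    if (is_string($value)) {
        return strlen($value) . ':' . $value;
    }

    if (is_array($value)) {
        // Assumption: keys 0..n-1 (or an empty array) mean a list,
        // anything else is treated as a dictionary.
        $isList = ($value === array())
            || array_keys($value) === range(0, count($value) - 1);

        if ($isList) {
            $out = 'l';
            foreach ($value as $item) {
                $out .= bencode_value($item);
            }
            return $out . 'e';
        }

        // Bencoded dictionaries store their keys in sorted order.
        ksort($value, SORT_STRING);
        $out = 'd';
        foreach ($value as $key => $item) {
            $out .= bencode_value((string) $key) . bencode_value($item);
        }
        return $out . 'e';
    }

    throw new InvalidArgumentException('Cannot bencode a value of type ' . gettype($value));
}

echo bencode_value(array('name' => 'DC', 'private' => 1, 'files' => array('a', 'b'))), "\n";
```

One caveat: PHP casts numeric string keys such as '0' to integers, so a dictionary whose keys happen to be '0', '1', ... would be encoded as a list by this sketch.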

Re: PHP Torrent Decoder

Posted: Fri Oct 08, 2010 1:17 pm
by josh
I'm against the static stuff. The class already has state. What if later it has more state? For example if a torrent has 50,000 files in it - why parse the whole thing just to get at the name? In this case, you'd be storing even more state and anything you did statically would hinder you or cause global state.

Also it should ideally have better defect localization. How about a way to validate it too?

Ideally it would work like this: instantiate the object, pass in data or a file stream, and be able to call isValidTorrent() and get a boolean. Only if I try to parse before calling isValidTorrent() should I get an exception. Otherwise you're making me use exceptions for non-exceptional situations.

You should also have different exception classes for different scenarios, file not found is a LOT different than file invalid, and people might want to catch only one or the other
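A minimal sketch of how the two failure cases could be kept apart. All of the class and function names below are hypothetical, and the format check is only a cheap sanity test (a .torrent file is one bencoded dictionary, so it should start with 'd' and end with 'e'); a real validity check would still need a full parse.

```php
<?php

// Hypothetical exception hierarchy: one base class plus one subclass per failure.
class TorrentException extends Exception {}
class TorrentFileNotFoundException extends TorrentException {}
class TorrentInvalidFormatException extends TorrentException {}

// Hypothetical loader: returns the raw file contents or throws a specific exception.
function load_torrent_contents($path)
{
    if (!is_file($path) || !is_readable($path)) {
        throw new TorrentFileNotFoundException('Cannot read ' . $path);
    }

    $contents = file_get_contents($path);

    // Cheap sanity check only, not a full validation.
    if ($contents === false || strlen($contents) < 2
        || $contents[0] !== 'd' || substr($contents, -1) !== 'e') {
        throw new TorrentInvalidFormatException($path . ' does not look like a bencoded dictionary');
    }

    return $contents;
}

try {
    load_torrent_contents('path/to/missing.torrent');
} catch (TorrentFileNotFoundException $e) {
    echo 'Not found: ', $e->getMessage(), "\n";
} catch (TorrentInvalidFormatException $e) {
    echo 'Invalid: ', $e->getMessage(), "\n";
}
```

Callers who do not care about the difference can still catch the shared TorrentException base class.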

Re: PHP Torrent Decoder

Posted: Fri Oct 08, 2010 1:35 pm
by s.dot
josh wrote:I'm against the static stuff. The class already has state. What if later it has more state? For example if a torrent has 50,000 files in it - why parse the whole thing just to get at the name? In this case, you'd be storing even more state and anything you did statically would hinder you or cause global state.
Actually, the whole thing would have to be parsed to be able to get the name.. correct? I don't see a way to get information without parsing and decoding the entire torrent?
josh wrote:Also it should ideally have better defect localization. How about a way to validate it too?
What does defect localization mean?
A way to validate is a good idea!
josh wrote:Ideally it would work like this: instantiate the object, pass in data or a file stream, and be able to call isValidTorrent() and get a boolean. Only if I try to parse before calling isValidTorrent() should I get an exception. Otherwise you're making me use exceptions for non-exceptional situations.
Very cool idea. However, I would have to try to parse the torrent data before I could tell if it was valid.
josh wrote:You should also have different exception classes for different scenarios, file not found is a LOT different than file invalid, and people might want to catch only one or the other
Can you show me an example or tutorial on how to do this? I am new to throwing exceptions.

EDIT: Also, my doList() method seems to be adding an extra array level, for example in the announce-list and [info][files][path] indices.

I do believe some updating is in order.
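One hedged observation on the extra array: as far as the BitTorrent specification goes, announce-list is a list of tiers where each tier is itself a list of URLs, and each path is a list of path components, so the nesting may already be in the file rather than coming from doList(). The sketch below builds the bencoded form of the announce-list from the sample output by hand to show the two nested 'l' markers; bencode_string() is a helper invented for this example.

```php
<?php

// Hypothetical helper: bencode a single string as <length>:<bytes>.
function bencode_string($s)
{
    return strlen($s) . ':' . $s;
}

// One tier holding the two tracker URLs from the sample output.
$tier = 'l'
    . bencode_string('http://w1.example.com/track/announce')
    . bencode_string('http://w2.example.com/track/announce')
    . 'e';

// announce-list is a list of tiers, so the tier is wrapped in another list.
$announceList = 'l' . $tier . 'e';

echo $announceList, "\n";
```

The printed string starts with "ll" and ends with "ee", which is why a decoder that follows the data ends up with an array inside an array.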

Re: PHP Torrent Decoder

Posted: Fri Oct 08, 2010 3:25 pm
by josh
s.dot wrote:Actually, the whole thing would have to be parsed to be able to get the name.. correct? I don't see a way to get information without parsing and decoding the entire torrent?
Well engineered binary file formats have ways to get at stuff quickly. There is typically information within the first few bytes that can be used to find out the start & end byte positions of the parts of the file you need. Then you can use fseek() and read just those bits out. Could be a performance thing some day, so why make it static now?
s.dot wrote:
josh wrote:Also it should ideally have better defect localization. How about a way to validate it too?
What does defect localization mean?
A way to validate is a good idea!
Defect localization means being specific within your errors.

s.dot wrote:
josh wrote:You should also have different exception classes for different scenarios, file not found is a LOT different than file invalid, and people might want to catch only one or the other
Can you show me an example or tutorial on how to do this? I am new to throwing exceptions.

Code:

// declare
class FileNotFoundException extends Exception {}
class InvalidFormatException extends Exception {}
class NoFilesInFileListException extends Exception {}
class TrackerIsDownException extends Exception {}
class FilePermissionException extends Exception {}

// throw
switch(rand(1,5))
{
case 1: throw new FileNotFoundException(); break;
case 2: throw new InvalidFormatException(); break;
case 3: throw new NoFilesInFileListException; break;
// etc ///
}

// catch

try
{
  try
  {
    $yourCode->importantBusiness();
  }
  catch( FileNotFoundException $e )
  {
    // oh no our script has a bug
  }
}
catch( InvalidFormatException $e )
{
  // user error
}


Re: PHP Torrent Decoder

Posted: Sat Oct 09, 2010 3:00 am
by VladSun
josh wrote:

Code:

try
{
  try
  {
    $yourCode->importantBusiness();
  }
  catch( FileNotFoundException $e )
  {
    // oh no our script has a bug
  }
}
catch( InvalidFormatException $e )
{
  // user error
}

I think it should be:

Code:

try
{
	switch(rand(1,3))
	{
		case 1: throw new FileNotFoundException(); break;
		case 2: throw new InvalidFormatException(); break;
		case 3: throw new NoFilesInFileListException; break;
		// etc ///
	}
}
catch( FileNotFoundException $e )
{
	echo "FileNotFoundException";
}
catch( InvalidFormatException $e )
{
	echo "InvalidFormatException";
}
catch (Exception $e)
{
	echo "Unknown exception thrown";
}
There is no need for multiple nested try blocks in this case (though I'm not absolutely sure what you wanted to show).

Re: PHP Torrent Decoder

Posted: Sat Oct 09, 2010 3:58 am
by josh
Right, I was showing (or tried to show & failed) how to catch multiple exceptions and have one piece of handling code: http://stackoverflow.com/questions/1360 ... ns-at-once

Or trying to...

From the PHP manual comments jazfresh at hotmail.com
08-Aug-2006 05:18:
Sometimes you want a single catch() to catch multiple types of Exception. In a language like Python, you can specify multiple types in a catch(), but in PHP you can only specify one. This can be annoying when you want to handle many different Exceptions with the same catch() block.

However, you can replicate the functionality somewhat, because catch(<classname> $var) will match the given <classname> *or any of its sub-classes*.

For example:

<?php
class DisplayException extends Exception {};
class FileException extends Exception {};
class AccessControl extends FileException {}; // Sub-class of FileException
class IOError extends FileException {}; // Sub-class of FileException

try {
    if(!is_readable($somefile))
        throw new IOError("File is not readable!");
    if(!user_has_access_to_file($someuser, $somefile))
        throw new AccessControl("Permission denied!");
    if(!display_file($somefile))
        throw new DisplayException("Couldn't display file!");

} catch (FileException $e) {
    // This block will catch FileException, AccessControl or IOError exceptions, but not Exceptions or DisplayExceptions.
    echo "File error: ".$e->getMessage();
    exit(1);
}
?>

Corollary: If you want to catch *any* exception, no matter what the type, just use "catch(Exception $var)", because all exceptions are sub-classes of the built-in Exception.
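The manual comment above dates from 2006. On newer PHP versions (7.1 and later) a single catch block can list several exception classes separated by |, which gives the "one piece of handling code" directly. A minimal sketch, reusing two exception names from earlier in the thread; risky() and handle() are made-up functions for the example.

```php
<?php

class FileNotFoundException extends Exception {}
class InvalidFormatException extends Exception {}

// Made-up function that fails in one of two ways.
function risky($kind)
{
    if ($kind === 'missing') {
        throw new FileNotFoundException('file not found');
    }
    throw new InvalidFormatException('file invalid');
}

// Returns a label describing how the failure was handled.
function handle($kind)
{
    try {
        risky($kind);
    } catch (FileNotFoundException | InvalidFormatException $e) {
        // One handler for both classes (PHP 7.1+ syntax).
        return 'handled ' . get_class($e);
    }
    return 'no error';
}

echo handle('missing'), "\n";
echo handle('invalid'), "\n";
```

On older PHP versions the shared-base-class approach from the quoted comment is still the way to get the same effect.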