PHP Developers Network

A community of PHP developers offering assistance, advice, discussion, and friendship.
 

All times are UTC - 5 hours




 Post subject: Word Count
Posted: Mon Apr 04, 2005 10:18 pm
Forum Contributor

Joined: Fri Jun 21, 2002 7:00 pm
Posts: 353
Location: Cleveland, OH
I was bored and wanted something that could give me some useful information about strings, files, etc. That's when the WordCount class was born. This is PHP 5-specific code, so don't try it if you only have PHP 4 installed. If you REALLY want one, I can give you a PHP 4 version, but the port is trivial and you can do it yourself if you are in the mood. Here it is:

<?php

/**
 * The purpose of this class is to provide a mechanism for counting the number of characters,
 * words, lines, and maximum line length in either the contents of a file or a string. The
 * behavior is modeled on that of the UNIX 'wc' program.
 *
 * Example usage:
 *
 * // Create the WordCount object
 * $wc = new WordCount();
 *
 * // Process a file on the filesystem
 * try {
 *     $wc->processFile($file_name);
 *     // Use the getter methods to retrieve the counts
 * } catch (Exception $e) {
 *     // Handle the file exception here
 * }
 *
 * // Process data read in from standard input (STDIN)
 * try {
 *     // Read from standard input.
 *     $wc->processFile('php://stdin');
 *     // Use the getter methods to retrieve the counts
 * } catch (Exception $e) {
 *     // Handle the file exception here
 * }
 *
 * // Process data from a string
 * $wc->processString($string);
 * // Use the getter methods to retrieve the counts
 *
 * @author Craig Slusher <cslusher@acm.org>
 * @version 1.0
 */


class WordCount
{
    private $character_count;
    private $word_count;
    private $line_count;
    private $max_line_length;

    /**
     * Create a new WordCount object with all counts reset to 0.
     *
     * @author Craig Slusher <cslusher@acm.org>
     * @version 1.0
     */
    public function __construct()
    {
        $this->resetCounts();
    }

    /**
     * Reset all of the counts.
     *
     * @author Craig Slusher <cslusher@acm.org>
     * @version 1.0
     */
    public function resetCounts()
    {
        $this->character_count = 0;
        $this->word_count = 0;
        $this->line_count = 0;
        $this->max_line_length = 0;
    }

   

    /**
     * Process the character, word, and line count for the contents of a file.
     *
     * @author Craig Slusher <cslusher@acm.org>
     * @version 1.0
     *
     * @param string $file_name The name of the file to process, or 'php://stdin' for standard input
     *
     * @throws Exception
     */
    public function processFile($file_name)
    {
        // Only check for file existence if we are NOT reading from STDIN
        if (strcasecmp($file_name, 'php://stdin') != 0) {
            if (!file_exists($file_name)) {
                throw new Exception($file_name.': No such file or directory');
            }
        }

        $file_handle = @fopen($file_name, 'r');
        if ($file_handle === false) {
            throw new Exception($file_name.': Unable to read file');
        }

        // Reset the counts in case they haven't been reset already
        $this->resetCounts();

        // Read in chunks, accumulating them until a chunk ends with \n.
        // fgets() returns false at end of file, which ends the loop.
        $whole_line = '';
        while (($data = fgets($file_handle, 4096)) !== false) {
            $whole_line .= $data;
            $this->character_count += strlen($data);

            // We found a whole line, so let's update our counts
            if (substr($data, -1) == "\n") {
                $this->line_count++;

                // Measure the whole line, not just the last chunk (the \n is not counted)
                $line_length = strlen($whole_line) - 1;
                if ($line_length > $this->max_line_length) {
                    $this->max_line_length = $line_length;
                }

                $this->word_count += str_word_count($whole_line);
                $whole_line = '';
            }
        }

        // Count a final line that does not end with \n
        if ($whole_line !== '') {
            if (strlen($whole_line) > $this->max_line_length) {
                $this->max_line_length = strlen($whole_line);
            }
            $this->word_count += str_word_count($whole_line);
        }

        fclose($file_handle);
    }

   

    /**
     * Process the character, word, and line count for the contents of a string.
     *
     * @author Craig Slusher <cslusher@acm.org>
     * @version 1.0
     *
     * @param string $string The string to process
     */
    public function processString($string)
    {
        $this->resetCounts();

        $this->character_count = strlen($string);
        $lines = explode("\n", $string);
        $this->line_count = count($lines) - 1;

        foreach ($lines as $line) {
            $strlen = strlen($line);

            // There is a new longest line
            if ($strlen > $this->max_line_length) {
                $this->max_line_length = $strlen;
            }

            $this->word_count += str_word_count($line);
        }
    }

   

    /**
     * Get the total number of bytes.
     *
     * @author Craig Slusher <cslusher@acm.org>
     * @version 1.0
     *
     * @return int The total number of bytes
     */
    public function getByteCount()
    {
        return $this->getCharacterCount();
    }

    /**
     * Get the total number of characters.
     *
     * @author Craig Slusher <cslusher@acm.org>
     * @version 1.0
     *
     * @return int The total number of characters
     */
    public function getCharacterCount()
    {
        return $this->character_count;
    }

    /**
     * Get the total number of words.
     *
     * @author Craig Slusher <cslusher@acm.org>
     * @version 1.0
     *
     * @return int The total number of words
     */
    public function getWordCount()
    {
        return $this->word_count;
    }

    /**
     * Get the total number of lines.
     *
     * @author Craig Slusher <cslusher@acm.org>
     * @version 1.0
     *
     * @return int The total number of lines
     */
    public function getLineCount()
    {
        return $this->line_count;
    }

    /**
     * Get the length of the longest line.
     *
     * @author Craig Slusher <cslusher@acm.org>
     * @version 1.0
     *
     * @return int The length of the longest line
     */
    public function getMaxLineLength()
    {
        return $this->max_line_length;
    }

}

?>


Last edited by protokol on Thu Apr 07, 2005 3:12 pm, edited 1 time in total.
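
The line-accumulation loop in processFile() can be tried without a file on disk. What follows is only a minimal sketch under a few assumptions: countStream() is a hypothetical standalone function (it is not part of the class), it reads from a php://memory stream, and it also counts the words and length of a final line that has no trailing newline.

```php
<?php
/**
 * Hypothetical standalone version of the counting loop, for experimentation.
 * Returns array(lines, words, bytes, max line length) for an open stream.
 */
function countStream($handle)
{
    $lines = $words = $bytes = $max = 0;
    $whole_line = '';

    // fgets() returns false at end of file, which ends the loop.
    while (($data = fgets($handle, 4096)) !== false) {
        $bytes += strlen($data);
        $whole_line .= $data;

        // A chunk ending in "\n" completes a line; otherwise keep accumulating.
        if (substr($data, -1) === "\n") {
            $lines++;
            $max = max($max, strlen($whole_line) - 1);
            $words += str_word_count($whole_line);
            $whole_line = '';
        }
    }

    // Trailing text without a newline adds words and length, but no line.
    if ($whole_line !== '') {
        $max = max($max, strlen($whole_line));
        $words += str_word_count($whole_line);
    }

    return array($lines, $words, $bytes, $max);
}

// Feed the function from an in-memory stream instead of a real file.
$handle = fopen('php://memory', 'r+');
fwrite($handle, "one two\nthree four five\nlast");
rewind($handle);
list($lines, $words, $bytes, $max) = countStream($handle);
fclose($handle);

printf("%d %d %d %d\n", $lines, $words, $bytes, $max); // prints "2 6 28 15"
```

Passing an already-open handle keeps the counting separate from the file-existence checks, so the same loop could be pointed at STDIN or a test fixture.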

 
 
Posted: Thu Apr 07, 2005 2:58 pm
Moderator

Joined: Mon Nov 03, 2003 7:13 pm
Posts: 5978
Location: Odessa, Ukraine
You might improve your class by using str_word_count() instead of:
//....
    $words = preg_split('/\s+/', trim($whole_line));
    $tot_words = count($words);
    $this->word_count += $tot_words;
//....
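
For comparison, the two counting approaches can be run side by side. This is a minimal sketch, not code from the class: countBySplit() and countByBuiltin() are hypothetical helper names, and the PREG_SPLIT_NO_EMPTY flag is an addition so that blank input yields zero words (the trim()-based snippet above yields one for an empty line, because preg_split() then returns a single empty string).

```php
<?php
// Hypothetical helpers for illustration; neither is part of the WordCount class.

/** Count words by splitting on runs of whitespace, dropping empty pieces. */
function countBySplit($text)
{
    return count(preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY));
}

/** Count words with PHP's built-in str_word_count(). */
function countByBuiltin($text)
{
    return str_word_count($text);
}

$samples = array('', 'hello world', "  padded \t text\n", 'version 2 released');
foreach ($samples as $sample) {
    // json_encode() is used only to display whitespace in escaped form.
    printf(
        "%-26s split=%d builtin=%d\n",
        json_encode($sample),
        countBySplit($sample),
        countByBuiltin($sample)
    );
}
```

The counts agree on plain text but can differ elsewhere: str_word_count() only treats letters, apostrophes, and hyphens as word characters, so a token such as "2" is counted by the whitespace split and skipped by the built-in.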


 
Posted: Thu Apr 07, 2005 3:06 pm
Forum Contributor

Joined: Fri Jun 21, 2002 7:00 pm
Posts: 353
Location: Cleveland, OH
Nice tip. Thanks!

The code has been updated to reflect that change.

