Page 1 of 1

Super Truncate

Posted: Wed Aug 09, 2006 3:02 pm
by Ollie Saunders
OK another string processing function for my framework.
This is a truncatation function that does just that little bit more than substr().

AC rightly cited that I wasn't using unit testing in my last post so this time I am! Its my first time. Thought it was pretty cool. This function turned out to be quite painful to get working properly but the unit testing helped quite a lot. I think I may of gone a little over the top, you know, one function, 53 tests; hmmmm, perhaps not.

Code: Select all

<?php
error_reporting(E_ALL);
require_once 'simpletest/unit_tester.php';
require_once 'simpletest/reporter.php';
/**
     * Returns a trimmed string never exceeding the maximum length.
     *
     * Specify a $truncMsg as a message to display to append to the output when
     * truncation has had to occur. If you specify a $truncMsg of true then
     * '...' will be used.
     *
     * Specify $avoidWordSplit as true and truncate will never split a word and
     * will instead truncate to the next available word.
     *
     * @param string $value
     * @param int $maxLength
     * @param bool|string $truncMsg
     * @param bool $avoidWordSplit
     * @return string
     */
    function truncate($value, $maxLength, $truncMsg = false, $avoidWordSplit = false)
    {
        if ($truncMsg === true) { // use default
            $truncMsg = '...';
        }
        $value = trim($value);
        if (strlen($value) <= $maxLength) { // no truncation necessary
            return $value;
        }
        if ($avoidWordSplit) {
            if (is_string($truncMsg)) {
                $truncMsgLen = strlen($truncMsg);
                if ($maxLength < $truncMsgLen) { // cannot fit in truncMsg
                    return '';
                }
                // take truncMsg into account to stay under maxLength
                $maxLength -= $truncMsgLen;
            }
            $lastChar = $value[$maxLength];
            $value = substr($value, 0, $maxLength);
            if (!ctype_space($lastChar)) {
                // we are not already at word break, have to find one further back
                $lastWord = max(array(strrpos($value, ' '),  strrpos($value, "\n"),
                                      strrpos($value, "\t"), strrpos($value, "\r")));
                $value = substr($value, 0, $lastWord);
            }
            if (is_string($truncMsg)) {
                return rtrim($value) . $truncMsg;
            }
            return rtrim($value);
        } else {
            if (is_string($truncMsg)) {
                $truncMsgLen = strlen($truncMsg);
                if ($maxLength < $truncMsgLen) { // cannot fit in truncMsg
                    return '';
                }
                // take truncMsg into account to stay under maxLength
                return rtrim(substr($value, 0, $maxLength - $truncMsgLen)) . $truncMsg;
            } else {
                return rtrim(substr($value, 0, $maxLength));
            }
        }
    }

class TestOfTruncate extends UnitTestCase
{
    public function test()
    {
        $test = array('Once upon a time in a land far fay away',
                      'Onceuponatimeinalandfarfaway');

        // test data 1
        $this->assertEqual(truncate($test[0], 29), 'Once upon a time in a land fa');
        $this->assertEqual(truncate($test[0], 30), 'Once upon a time in a land far');
        $this->assertEqual(truncate($test[0], 31), 'Once upon a time in a land far');
        $this->assertEqual(truncate($test[0], 3),  'Onc');
        $this->assertEqual(truncate($test[0], 2),  'On');
        $this->assertEqual(truncate($test[0], 6),  'Once u');
        $this->assertEqual(truncate($test[0], 29,  true), 'Once upon a time in a land...');
        $this->assertEqual(truncate($test[0], 30,  true), 'Once upon a time in a land...');
        $this->assertEqual(truncate($test[0], 31,  true), 'Once upon a time in a land f...');
        $this->assertEqual(truncate($test[0], 3,   true), '...');
        $this->assertEqual(truncate($test[0], 2,   true), '');
        $this->assertEqual(truncate($test[0], 6,   true), 'Onc...');
        $this->assertEqual(truncate($test[0], 29,  true,  true), 'Once upon a time in a land...');
        $this->assertEqual(truncate($test[0], 30,  true,  true), 'Once upon a time in a land...');
        $this->assertEqual(truncate($test[0], 31,  true,  true), 'Once upon a time in a land...');
        $this->assertEqual(truncate($test[0], 3,   true,  true), '...');
        $this->assertEqual(truncate($test[0], 2,   true,  true), '');
        $this->assertEqual(truncate($test[0], 6,   true,  true), '...');
        $this->assertEqual(truncate($test[0], 29,  false, true), 'Once upon a time in a land');
        $this->assertEqual(truncate($test[0], 30,  false, true), 'Once upon a time in a land far');
        $this->assertEqual(truncate($test[0], 31,  false, true), 'Once upon a time in a land far');
        $this->assertEqual(truncate($test[0], 3,   false, true), '');
        $this->assertEqual(truncate($test[0], 2,   false, true), '');
        $this->assertEqual(truncate($test[0], 6,   false, true), 'Once');

        // test data 2
        $this->assertEqual(truncate($test[1], 26), 'Onceuponatimeinalandfarfaw');
        $this->assertEqual(truncate($test[1], 27), 'Onceuponatimeinalandfarfawa');
        $this->assertEqual(truncate($test[1], 28), 'Onceuponatimeinalandfarfaway');
        $this->assertEqual(truncate($test[1], 3),  'Onc');
        $this->assertEqual(truncate($test[1], 2),  'On');
        $this->assertEqual(truncate($test[1], 6),  'Onceup');
        $this->assertEqual(truncate($test[1], 26,  true), 'Onceuponatimeinalandfar...');
        $this->assertEqual(truncate($test[1], 27,  true), 'Onceuponatimeinalandfarf...');
        $this->assertEqual(truncate($test[1], 28,  true), 'Onceuponatimeinalandfarfaway');
        $this->assertEqual(truncate($test[1], 3,   true), '...');
        $this->assertEqual(truncate($test[1], 2,   true), '');
        $this->assertEqual(truncate($test[1], 6,   true), 'Onc...');
        $this->assertEqual(truncate($test[1], 26,  true,  true), '...');
        $this->assertEqual(truncate($test[1], 27,  true,  true), '...');
        $this->assertEqual(truncate($test[1], 28,  true,  true), 'Onceuponatimeinalandfarfaway');
        $this->assertEqual(truncate($test[1], 3,   true,  true), '...');
        $this->assertEqual(truncate($test[1], 2,   true,  true), '');
        $this->assertEqual(truncate($test[1], 6,   true,  true), '...');
        $this->assertEqual(truncate($test[1], 26,  false, true), '');
        $this->assertEqual(truncate($test[1], 27,  false, true), '');
        $this->assertEqual(truncate($test[1], 28,  false, true), 'Onceuponatimeinalandfarfaway');
        $this->assertEqual(truncate($test[1], 3,   false, true), '');
        $this->assertEqual(truncate($test[1], 2,   false, true), '');
        $this->assertEqual(truncate($test[1], 6,   false, true), '');

        // custom $truncMsgs
        $sevenSevens = '7777777';
        $this->assertEqual(truncate($test[1], 30, $sevenSevens, false), 'Onceuponatimeinalandfarfaway');
        $this->assertEqual(truncate($test[0], 20, $sevenSevens, false), 'Once upon a t7777777');
        $this->assertEqual(truncate($test[0], 20, $sevenSevens, true),  'Once upon a7777777');
        $this->assertEqual(truncate($test[1], 20, $sevenSevens, true),  '7777777');
        $this->assertEqual(truncate($test[1], 6, $sevenSevens, true),   '');

    }
}

$test = new TestOfTruncate();
$test->run(new HtmlReporter());

Code: Select all

TestOfTruncate
1/1 test cases complete: 53 passes, 0 fails and 0 exceptions.

Posted: Wed Aug 09, 2006 3:18 pm
by Nathaniel
Wowee. 53 passes for one function. Hard to find fault with that. :)

You might try turning it into a class. Interface ideas:
$truncator =& new Truncator('a message!', 'a message to append to the truncated string');
$truncator->truncate(7);

Interface-wise, it's not all that great; but you could extract code chunks into methods. This would make the code a lot clearer.

Posted: Wed Aug 09, 2006 3:51 pm
by Ollie Saunders
Interface-wise, it's not all that great; but you could extract code chunks into methods. This would make the code a lot clearer.
Yeah I thought about separating it out into a couple of functions but when I thought of the names:
  • truncate
  • truncateToWord
  • truncateToWordMsg
  • truncateMsg
I decided I wouldn't bother it wouldn't be making anything any clearer. I think the interface is fine considering that as an alternative.

Posted: Wed Aug 09, 2006 4:08 pm
by Nathaniel
I agree, the interface is fine.

The code innards, however, could be improved by extracting methods. The code would do the same thing, but be easier to maintain. It is easiest to do this with classes.

If you want to keep your current interface, then I would recommend turning it into a class, and then making a function called truncate that behaves exactly as the one you have now. Instead of figuring the result itself, however, it would pass the work on to your class. The function would be a factory.

Posted: Wed Aug 09, 2006 4:18 pm
by Weirdan
ideas for additional tests:

Code: Select all

truncate('                   '); // truncation of string of spaces
truncate('ads           kjhdf'); // truncation does not touch the spaces inside the string (or does, since you didn't test for that, it's not clear from your test suit)
truncate('asd                '); // does it leave the trailing spaces
truncate('                asd'); // does it leave the leading spaces
truncate(''); // truncation of empty string
truncate(new stdClass); // truncation with invalid arguments

Posted: Wed Aug 09, 2006 4:54 pm
by Ollie Saunders
Weirdan, all of thoes would cause a parse error because you haven't specified the 2nd parameter but I've added them in and I can tell you exactly what it would do

Code: Select all

truncate('                   ', 100) == '';
truncate('                   ', 10) == '';
truncate('ads           kjhdf', 100) == 'ads           kjhdf';
truncate('ads           kjhdf', 10) == 'ads';
truncate('asd                ', 10) == 'asd';
truncate('                asd', 4) == 'asd';
truncate('') == '';
truncate(new stdClass, 20) == 'object id 1'; // or something like that

Posted: Wed Aug 09, 2006 9:02 pm
by Weirdan
I can tell you exactly what it would do
The question is not me not knowing what would that calls do, the question is 'Is it the behaviour you wanted?'
These were the edge cases you didn't test for.

Posted: Wed Aug 09, 2006 9:09 pm
by Ollie Saunders
The question is not me not knowing what would that calls do, the question is 'Is it the behaviour you wanted?
Yes
These were the edge cases you didn't test for.
That's becuase I know what trim() does.