Page 1 of 1

{variable} replacer

Posted: Tue Aug 08, 2006 6:13 pm
by Ollie Saunders
OK this is a method of one of my classes. It converts input like this:

Code: Select all

An error occured in {label}
To

Code: Select all

An error occured in $this->label

Code: Select all

<?php
class A {
    const TOKEN_OPEN = '{';
    const TOKEN_CLOSE  = '}';

    public $label = '!yay!';
    public $tooltip = '!yay!';
    public $_numErrors = '!yay!';
    public $_id = '!yay!';

/**
     * Converts {variables} into the text they represent.
     *
     * No regular expressions because they are slow (and they are for n00bs) ;P
     *
     * @param string $msg
     */
    public function _expandErrorVars($msg)
    {
        static $errorVars = array('{label}'   => 'label',   '{numErrors}' => '_numErrors',
                                  '{tooltip}' => 'tooltip', '{id}'        => '_id');
        $replacements = array();
        $token = true;
        $posClose = null;
        $posOpen = $i = strpos($msg, self::TOKEN_OPEN);

        for ($q=0; $i !== false; $q++) {
            if (!$token) {
                if ($posClose) {
                    // open and close both found
                    $replacements[] = array($posOpen, $posClose);
                }
                // find another open
                $i = $posOpen = strpos($msg, self::TOKEN_OPEN, $i);
            } else {
                if (@ctype_alpha($msg[$i+1])) {
                    // face valid, already open find a close
                    $i = $posClose = strpos($msg, self::TOKEN_CLOSE, $i);
                } else {
                    // not face valid, force look for new open
                    $posClose = null;
                    $i++;
                }
            }
            $token = !$token;
            if ($q > 200000) {
                throw new OF_Exception('Critical overflow', OF_Exception::MISC);
            }
        }
        $expandedMsg = $msg;
        foreach ($replacements as $r) {
            list($start, $end) = $r;
            $var = substr($msg, $start, $end - $start + 1);

            if (isset($errorVars[$var])) {
                $expandedMsg = str_replace($var, $this->{$errorVars[$var]}, $expandedMsg);
            }
        }

        return $expandedMsg;
    }
}
$a = new A();
$tests = array('These are the variables {label} {tooltip} {numErrors} {id}',
               'This doesn\'t {exist}', '{{bob monkeys}',   '{bob monkeys',
               '{poordog {label}',      '}{label}',         '{label}',
               '{ label}',              '{Label}',          '{{{{label}}}}',
               '{ { { {* {a {label} }', '{label,}',         '{label}{');
foreach ($tests as $testStr) {
    echo $a->_expandErrorVars($testStr), "\n";
}

Code: Select all

These are the variables !yay! !yay! !yay! !yay!
This doesn't {exist}
{{bob monkeys}
{bob monkeys
{poordog {label}
}!yay!
!yay!
{ label}
{Label}
{{{!yay!}}}
{ { { {* {a {label} }
{label,}
!yay!{
I don't think I want to attempt to make "{poordog {label}" and "{ { { {* {a {label} }" work but if anybody has any smart solutions I'd be interested.

Posted: Tue Aug 08, 2006 9:29 pm
by Ambush Commander
Hmm... I was just about to recommend Regexps. I actually read an interesting article on this, with optimization and stuff. They tested str_replace versus preg_replace_callback with a variable amount of tokens and found that str_replace beat out preg_replace_callback when there were about four tokens, then preg_replace_callback won out. In this case, I don't know if a stack based parser would be faster, we'd have to benchmark.

Since this is code critique, let's do some critique.

1. No way to have literals in the text (that is, escape the braces). The escaping is usually the reason why you'd want a stack based parser.
2. Lack of formal unit tests: this makes it difficult to determine if a change broke something, you have to scan the entire output to figure it out
3. It looks like the container for this code was hacked out real quick. If it's doing something interesting and performance is not a super-huge concern, I'd recommend factoring it out to its own class (you'd probably only be running this routine once anyway).
4. Loop variables are not defined in documentation: $q, for instance, has nothing to do with the program logic but instead is the infinite-loop protection. Make that explicit
5. Since the first iteration starts of with $token == true, I would put that code first in the conditional. Saves you an exclamation mark too.

Praise:
1. Use of ctype_alpha. Good for you.
2. Infinite loop protection. Good.

I would assume that if we had an opening brace followed by material that is not allowed in tokens, I would simply drop the brace declaration and look for the next open one. So a consuming parser would do this:

1. { { { {* {a {label} } - next char is a space, invalid, reset to not in token
2. { { {* {a {label} } - same
3. { {* {a {label} } - same
4. {* {a {label} } - next char is an asterisk, no good
5. {a {label} } - now, here's how I would handle it differently. After hitting a bracket, I would use strcspn to consume all alphabetic characters. Then, I would test if the next character was a proper closing bracket, if not, ignore what we just did and keep on going.
6. {label} } - substitute
7. } - errant closing brace, nothing to do

Hmm... that's strange... you're still using str_replace. Why not substr_replace()

Posted: Tue Aug 08, 2006 10:19 pm
by Ollie Saunders
Thanks for your critque AC, very good criticisms methinks.
1. No way to have literals in the text (that is, escape the braces). The escaping is usually the reason why you'd want a stack based parser.
I don't think this is particularly necessary. The idea was that only the exact {variable} names would be replaced and you really aren't likely to use that in an error message, that's what this is for incidently.
2. Lack of formal unit tests: this makes it difficult to determine if a change broke something, you have to scan the entire output to figure it out
True. I could add one but I probably wont :P
3. It looks like the container for this code was hacked out real quick. If it's doing something interesting and performance is not a super-huge concern, I'd recommend factoring it out to its own class (you'd probably only be running this routine once anyway).
Yes it was a quick hack. Performance isn't a super-huge concern but I am very much aware of it. _exandErrorVars() is part of another class which I can post if you like I just put it in A to demonstate.
4. Loop variables are not defined in documentation: $q, for instance, has nothing to do with the program logic but instead is the infinite-loop protection. Make that explicit
Quite right, I'll do that.
5. Since the first iteration starts of with $token == true, I would put that code first in the conditional. Saves you an exclamation mark too.
Yes right also, I had a reason for having !$token first at some point though, that's my excuse anyway. ;)
Praise:
1. Use of ctype_alpha. Good for you.
2. Infinite loop protection. Good.
Thanks, I like praise :)
now, here's how I would handle it differently. After hitting a bracket, I would use strcspn to consume all alphabetic characters. Then, I would test if the next character was a proper closing bracket, if not, ignore what we just did and keep on going.
I will look into that.
Hmm... that's strange... you're still using str_replace. Why not substr_replace()
I wanted to use concat operators really but I couldn't figure out how to maintain the bits in between and after the {variables}, I didn't try particularly hard mind.

Now I think about it I could have just used str_replace() without any of the strpos, token stuff. Oh <span style='color:blue' title='I&#39;m naughty, are you naughty?'>smurf</span>.

Posted: Tue Aug 08, 2006 10:21 pm
by Ambush Commander
Heh. But if you've got a lot of tokens, that's a lot of str_replace calls.

Posted: Tue Aug 08, 2006 10:26 pm
by Ollie Saunders
I do this. Once every couple of weeks I spend ages writing something and then realise there was an obscenely easy way of doing it only once I have finished... *sigh*
Anyway here's the new version :D:D

Code: Select all

<?php
class A {
    const TOKEN_OPEN = '{';
    const TOKEN_CLOSE  = '}';

    public $label = '!yay!';
    public $tooltip = '!yay!';
    public $_numErrors = '!yay!';
    public $_id = '!yay!';

/**
     * Converts {variables} into the text they represent.
     *
     * No regular expressions because they are slow (and they are for n00bs) ;P
     *
     * @param string $msg
     */
    public function _expandErrorVars($msg)
    {
        $errorVars = array('{label}'     => $this->label,
                           '{numErrors}' => $this->_numErrors,
                           '{tooltip}'   => $this->tooltip,
                           '{id}'        => $this->_id);
        return str_replace(array_keys($errorVars),
                           array_values($errorVars),
                           $msg);
    }
}
$a = new A();
$tests = array('These are the variables {label} {tooltip} {numErrors} {id}',
               'This doesn\'t {exist}', '{{bob monkeys}',   '{bob monkeys',
               '{poordog {label}',      '}{label}',         '{label}',
               '{ label}',              '{Label}',          '{{{{label}}}}',
               '{ { { {* {a {label} }', '{label,}',         '{label}{');
foreach ($tests as $testStr) {
    echo $a->_expandErrorVars($testStr), "\n";
}

Code: Select all

These are the variables !yay! !yay! !yay! !yay!
This doesn't {exist}
{{bob monkeys}
{bob monkeys
{poordog !yay!
}!yay!
!yay!
{ label}
{Label}
{{{!yay!}}}
{ { { {* {a !yay! }
{label,}
!yay!{

Posted: Wed Aug 09, 2006 1:05 am
by Christopher
The generic version of what you built is a str_replace template class.

Code: Select all

<?php
class Template {
    public $token_open = '{';
    public $token_close  = '}';
    public $tokens = array();
    public $values = array();

    public function set($token, $value)
    {
        if ($token) {
             $this->tokens[] = $this->token_open . $token . $this->token_close;
             $this->values[] = $value;
        }
    }

    public function render($str)
    {
        return str_replace($this->tags, $this->values, $str);
    }
}