Page 1 of 1
String Value Object for Security.
Posted: Sun Apr 29, 2007 4:01 pm
by Ollie Saunders
mods: this is not a duplicate post, I have deleted the original as code critique was the correct board.
I've been toying with the idea of using the
value object pattern to ensure my strings are always escaped when they should be but never twice. The problem I find is that escaping at the last possible moment e.g. view layer of MVC means you can't preprocess anything with HTML formatting and escaping as early as possible means length calculations and the like are out of wack. I don't particularly want to decode HTML entities because I consider this an expensive action that can be avoided so my solution is something where tracking is possible. I'm not really sure what the implications/limitations of this are so comment away.
Outputs:
Code:
Code: Select all
<?php
class Str
{
private $_string = '';
private $_destination;
public function __construct($stringData, Destination_Interface $destination)
{
$this->_string = $stringData;
$this->_destination = $destination;
}
public function __tostring()
{
return $this->_destination->clean($this);
}
public function length()
{
return strlen($this->_string);
}
public function getRaw()
{
return $this->_string;
}
public function setRaw($raw)
{
$this->_string = $raw;
}
public function append(self $str)
{
$new = clone $this;
$new->setRaw($this->getRaw() . $str->getRaw());
return $new;
}
}
Interface Destination_Interface
{
function clean(Str $str);
}
class HtmlDest implements Destination_Interface
{
public function clean(Str $str)
{
return htmlspecialchars($str->getRaw(), ENT_QUOTES);
}
}
$a = new Str('<a>', new HtmlDest());
$b = new Str('<b>k', new HtmlDest());
echo $a->append($b);
Posted: Sun Apr 29, 2007 6:15 pm
by Ambush Commander
Having a class that represents strings sets off warning bells in my head, I think the reason is because they offer abstraction at too fine a granularity, which causes performance problems and makes them clunky to use. While your length concerns are well-founded, I don't think creating a string class will solve that problem.
I will stew about this a little and get back to you on it.
Posted: Mon Apr 30, 2007 3:07 am
by Ollie Saunders
Those are precisely the reasons I wanted some opinion on this.
I will stew about this a little and get back to you on it.
Great.
Posted: Mon Apr 30, 2007 5:14 am
by Maugrim_The_Reaper
You need to remove the setRaw()/append() method for starters - a Value Object represents a specific value with a specific type (i.e. the class type). Since the Value is immutable, the class should be immutable also - i.e. you can't change the data once the object is constructed.
If you need that functionality, then the pattern isn't a Value Object - but more a generic Container for accumulating String values.
On the when of escaping - it always occurs as late as possible. Because of this the unescaped version should be available up to the last moment. I don't see any problem here - if the timing is correct then you apply escaping to the raw data if required easily enough. Using a container object to package all this just seems to be patching the symptom, not the underlying problem.
Posted: Mon Apr 30, 2007 12:06 pm
by Begby
I don't think that PHP is ready for a class like this. It would be incredibly useful if operator and function overloading were allowed, but I have a problem if the implementation of using strings within code is not standardized (i.e. you have to call a bunch methods on it).
If you could do this in php it would be great
Code: Select all
$str1 = new String() ;
$str2 = new String() ;
// This is not gay
// overloaded '=' operator
$str1 = 'My Name is' ;
$str2 = 'fred' ;
// overloaded .= operator
$str1 .= ' '.$str2 ;
// This is gay
$str1.setRaw('My Name is') ;
$str2.setRaw('fred') ;
$str1.append(' '.$str2->toString()) ;
Posted: Mon Apr 30, 2007 1:42 pm
by Ollie Saunders
Thanks for the comments guys.
You need to remove the setRaw()/append() method for starters - a Value Object represents a specific value with a specific type (i.e. the class type). Since the Value is immutable, the class should be immutable also - i.e. you can't change the data once the object is constructed.
What exactly is wrong with these? Esp. with append where I return a new object in order to avoid "side effects". And while we're at it what exactly what is so bad about side effects? Do you know any examples where there is one version that uses side-effects and another that does not
If you need that functionality, then the pattern isn't a Value Object - but more a generic Container for accumulating String values.
I tried to find some documentation on the value object pattern online but I couldn't find nothing of real use and I'd left my PHP Patterns Book at work, although I'm slowly beginning to lose faith and interest in that book anyway. So if you can point me to a source of information on the value object pattern I would appreciate it. That said, I have no self imposed rule about obeying the pattern so if it works I'll do it.
On the when of escaping - it always occurs as late as possible.
It does or it should? The problem with late escaping is that you cannot pre-process strings with HTML because they will end up escaped later and when most PHP applications depend on this very ability it becomes a bit of a problem.
I don't see any problem here - if the timing is correct then you apply escaping to the raw data if required easily enough.
Well there are more reasons for a class like this as well. I mean with a class like this + some type hinting you can be sure that things are secure by default, you won't ever forget to escape something. Also doesn't it make OO sense to group these things together/
Using a container object to package all this just seems to be patching the symptom, not the underlying problem.
I don't see what you mean. I'm going to have to ask you to qualify that in a lot more detail.
I don't think that PHP is ready for a class like this
Well I'm not going to wait until it is. This isn't PHP's style, so I don't think operating overloading will ever happen. C++ is the language for this sort of thing. In fact it seems like C++ was specifically designed for the value object pattern in places. Many people disagree with operator overloading in principle and I can see their point - it can definitely be abused.
Posted: Wed May 02, 2007 10:52 am
by Ollie Saunders
Anyone had any new thoughts?
Posted: Wed May 02, 2007 7:14 pm
by Z3RO21
I like the theory behind this class but I agree with out the ability to overload operators it might be more hassle in the end than it is worth.
Posted: Wed May 02, 2007 7:51 pm
by Weirdan
Anyone had any new thoughts?
It would be great if destination interface was not fixed on string value creation but was dependent on the context the value was used in. Though I don't see how it would be possible in php.
Posted: Wed May 02, 2007 9:09 pm
by Begby
If you are going to use a class for it, I don't think it should be setup with delegates. What if I wanted to take a string from a form input, then on the next page both display it in HTML and insert it into a db? I would have to either create two string objects, each with a different delegate, or switch the delegate.
Code: Select all
$str = new Str( $_POST['input'], new HTML_Destination() ) ;
$dbStr = new Str( $_POST['input'], new DB_Destination() ) ;
$this->view()->set('str', $str) ;
$model->str = $dbStr ;
$model->insert() ;
// Or do this
$str = new Str( $_POST['input'], new HTML_Destination() ) ;
$this->view()->set('str', $str) ;
$str->setDestination( new DB_Destination() ) ;
$model->str = $str ;
$model->insert() ;
To me this is just too much code to solve a relatively simple problem.
Here are some ideas I came up with...
Code: Select all
$str = new Str( $_POST['input'] ) ;
$this->view->set('str', $str->escape_html()) ;
$model->str = $str->escape_mysql() ;
$model->insert() ;
// maybe make it work like this, this could be accomplished by always storing a copy of the original string, then have a
// holder for an escaped string, toString() would return what was escaped, unescape() would restore the original string
$str = new Str(' ') ;
$str->escape_html() ;
echo $str ; //
$str->escape_mysql() ;
echo $str ; //  \;
$str->unescape() ;
echo $str ; //
// Maybe have it be pluggable by overloading __call(), you would need to set this up
// so you could register the plugins at load time instead of in your logic
interface IStr_Escape {....}
class Str_Escape_HTML implements IStr_Escape {...}
class Str_Escape_Custom implements IStr_Escape {...}
$str = new Str(' ') ;
// This would need to magically happen in some config somewhere
$str->register_escape('HTML', new Str_Escape_HTML() ) ;
$str->register_escape('Custom', new Str_Escape_Custom() ) ;
$str->escape_HTML() ;
$str->escape_Custom() ;
One reason that I would like to see a non-crappy string class though is mainly because of code insight in Zend and Eclipse. Its amazing what a time saver that is not having to remember a plethora of functions that are not named consistently (and god do I hate the inconsistency of php functions). I go out of my way to put things in methods just because of that. I even make registries have a method for each of the common items I put in the registry so that I can get the type inline and proper code hints.
A float or integer class would be killer too
Code: Select all
// I would use this, I don't care if its redundant or slow, it would just save me a lot of time and is very readable
$str = new Str('Fred') ;
$str->length() ;
$newStr = $str->substr(2,3) ;
$lower = $str->toLower() ;
$upper = $lower->toUpper() ;
$upper->escape_HTML() ;
// this might be handy if it was intuitive enough
$float = new float( 5.329 ) ;
$int = $float->ceil() ;
$int2 = $float->floor() ;
$float = $int->div($int2) ;
Posted: Thu May 03, 2007 4:06 am
by stereofrog
Posted: Thu May 03, 2007 6:18 am
by Ollie Saunders
Hmm a lot of people are quite against this. This doesn't surprise me, I was and am still cautious about the implications of this.
But I think I'm going to suck it and see. I've got a small(ish) project I need to do that involves a lot of string manipulation and we'll see how it fairs with that.
I may try those float and integer objects as well actually.
Posted: Fri May 04, 2007 4:44 am
by Ollie Saunders
I've had a few thoughts and actually I don't think this is going to work. Consider the following use case:
- Get some data from the database. It's tainted.
- We want to format it to make one half bold and one half normal. HTML has to be used to do this. If the data was "Foo&Bar" the result we are after would be:
- The tags must be maintained but the text between escaped.
The Str object is only going to be capable of giving you
or
never the desired data (with the ampersand escaped).
So what is the solution? DOM possibly?
Posted: Fri May 04, 2007 7:13 am
by kyberfabrikken
ole wrote:So what is the solution? DOM possibly?
I sound to me like you're formatting your data at the wrong point in your application. The model layer should return semantic data, while it's up to the view layer to format this into a representation (Usually HTML).
Posted: Fri May 04, 2007 8:02 am
by Ollie Saunders
Yeah, I suppose given that, this isn't really necessary.
btw your framework looks cool.