Chain Validation

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

User avatar
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Chain Validation

Post by John Cartwright »

As usual, I'm developing an application layer over the Zend Framework.. Now I've rewritten this a couple times and thought I would give my chain validation classes a clean slate. Now without getting into too much detail (I'd rather jump into code :D), I'm looking for both design input and code ideas. I'll outline specifically my concerns through the code.

So my first concern with chain validation is the syntax to which people use.

A couple things to keep in mind,
* Validator can be chained together or run independently
* Errors are hardcoded within each rule, although errors should be overridable
* Attachments can optionally accept paramaters

Code: Select all

$validator = new Northern_Validator('test1', $this->getInvokeArg('get'));
$validator->attach('foo', new Northern_Validator_Rule_NotNull(), 'field cannot be empty');
$validator->attach('foo', new Northern_Validator_Filter_Trim());		
$validator->attach('foo', new Northern_Validator_Flag_Optional());
$validator->dispatch();

if ($validator->isValid()) {

}

// or can be run as a chain

$validators[0] = new Northern_Validator('test1', $this->getInvokeArg('get'));
$validators[0]->attach('fieldname', new Northern_Validator_Rule_NotNull(), 'field cannot be empty');
$validators[0]->attach('fieldname', new Northern_Validator_Filter_Trim());		
$validators[0]->attach('fieldname', new Northern_Validator_Flag_Optional());

$validators[1] = new Northern_Validator('test2', $this->getInvokeArg('post'));
$validators[1]->attach('fieldname1', new Northern_Validator_Rule_NotNull(), 'field cannot be empty');
$validators[1]->attach('fieldname2', new Northern_Validator_Filter_Trim());		
$validators[1]->attach('fieldname2', new Northern_Validator_Flag_Optional());		
		
$chain = new Northern_Validator_Chain($validators);	

if ($chain->isValid()) {

}
A couple of concerns which come to mind.

*By having a single method attach() in the validator class, the class needs to use reflection to determine exactly what kind of object we are attaching. I don't really like this idea and usually avoid using reflection, but passing the object type to attach() seems even worse. Which do you prefer?

*If you'll notice, you are able to pass "flag" objects, such as Northern_Validator_Flag_Optional(), which would indicate to the validator to only perform the rule checking if certain requirements have been met. Such as only validate the field if the field has data in it. Now, I know the validator object needs to have some kind of optional setting, and I know I can create custom "flags", but again I'm kind of uncomfortable having a seperate object determining conditions.. Can anyone think of a better way to do this?

I'm only really concerned with the API, we'll worry about the internals in a bit. Mainly, I'm interested how others have tackled the API for (chain) validator classes
User avatar
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

*By having a single method attach() in the validator class, the class needs to use reflection to determine exactly what kind of object we are attaching. I don't really like this idea and usually avoid using reflection, but passing the object type to attach() seems even worse. Which do you prefer?
Isn't that a case of a loose interface? You seem to have three broad types: Filter, Flag and Rule. I'm not entirely sure how much Reflection this would need assuming they share an interface common to all Validator classes, or the attach() method differentiates between the three types to utilise each type's shared interface. Presumably you need to split types to manage run order unless it's dictated by the order in which attached.

Do you have a specific use case where Reflection becomes necessary? In general I find Reflection in PHP tends to be a sign of overcomplicating a solution in 90% of cases. It's also a common performance issue - though that's usually removed as a side effect of simplifying the design through refactoring, not as a premature optimisation.

You should check the Zend Framework mailing list and wiki for tips and ideas. Chains have been discussed in the past ever since they deprecated Zend_Filter_Input and made input filtering such a low level task.
User avatar
Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Post by Luke »

ever since they deprecated Zend_Filter_Input
When did this happen, and why? :?
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

The major difference is that I don't mix filters and rules in the same chain. I am not sure what your flags do? I use separate chains for validation and filtering. I prefer separate chains for exactly the reasons you mention -- their needs are slightly different and I can make them ultra simple if they are separate. To me validation inspects values and builds up error messages. Filtration modifies values. Those are two very different things to me.
The Ninja Space Goat wrote:
ever since they deprecated Zend_Filter_Input
When did this happen, and why? :?
Because it was a giant, steaming, monolithic, procedural piece of crap! :)
(#10850)
User avatar
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

Because it was a giant, steaming, monolithic, procedural piece of crap!
Seconded ;). It was basically a function library with two identical layers for no good reason.

The problem is that the replacement though more OO, has no overlaying structure - so there are going to be a lot of people working up chain validators and such to make it easier to use. It would have been a lot better if they went and added something concrete like that but I guess the rush to 1.0 means we'll need to wait for 1.0.x, or (I hope not!) 1.1 as someone suggested.

Back on topic - ;)

I think Flags should be simple constants - does having them as classes pose any particular advantage? Filters should be attached and run as a priority since they adjust input to minimise user errors. I'd also check exactly how many filters are required. It's possible you could manage this internally by knowing what type of data is being received (i.e. trim as part of validation where a "clean" value is extracted but leaving the raw value untouched). Is there any other type of filtering you intend adding?

After this you're into validation which is the backbone of the chain. At this stage the order of Validators is unimportant though obviously it's a good point to put NotNull/Optional at the start since presumably it's the fastest and simplest one. Once you have a interface for Validators (as opposed to filter/flags) you're at the point where a helper class should be capable of manufacturing a chain. This is usually my preference - I absolutely dislike defining chains inside the source code since its so repetitive, so a Factory and a set of config files is where I aim.
User avatar
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Post by John Cartwright »

Maugrim_The_Reaper wrote:Isn't that a case of a loose interface? You seem to have three broad types: Filter, Flag and Rule. I'm not entirely sure how much Reflection this would need assuming they share an interface common to all Validator classes, or the attach() method differentiates between the three types to utilise each type's shared interface. Presumably you need to split types to manage run order unless it's dictated by the order in which attached.
Hit the nail on the head. And yes I need to split types to manage run order.
Maugrim_The_Reaper wrote:Do you have a specific use case where Reflection becomes necessary? In general I find Reflection in PHP tends to be a sign of overcomplicating a solution in 90% of cases. It's also a common performance issue - though that's usually removed as a side effect of simplifying the design through refactoring, not as a premature optimisation.
I also consider reflection a failure on my part in most instances, which usually reflects poor design; I definantly would like to see this go, although I don't see how I can refactor it out as of yet.

This is where I need reflection (during the attachment process)

Code: Select all

class Northern_Validator
{
	/**
	 * Todo
	*/	
	public function attach($field, $attachment, $error = false)
	{
		if (!$this->_fieldExists($field)) {
			throw new Northern_Validator_Exception('Cannot find field name "'. $field);
		}
	
		$type = $this->_getReflectionType($attachment);
		$this->fields[$field][$type] = array();
		array_push($this->fields[$field][$type], $attachment);
	}

	/**
	 * Todo
	*/		
	protected function _getReflectionType($attachment)
	{
		$type = new ReflectionClass($attachment);
		$parts = explode('_', ucfirst($type->getName()));	

		if (!in_array($parts[2], $this->types)) {
			throw new Northern_Validator_Exception('Must pass valid attachment type');
		}

		return $parts[2];
	}
Obviously this sucks because a) you have to follow the naming convention (Northern_Validator_Rule_NotNull(), Northern_Validator_Filter_Trim())),
b) reflection is used on such a trivial task.
Maugrim_The_Reaper wrote:You should check the Zend Framework mailing list and wiki for tips and ideas. Chains have been discussed in the past ever since they deprecated Zend_Filter_Input and made input filtering such a low level task.
I have looked at it awhile ago when the proposal was fresh.. I guess its been awhile. Thanks for the tip Maugrim.]]
arborint wrote:The major difference is that I don't mix filters and rules in the same chain. I am not sure what your flags do? I use separate chains for validation and filtering. I prefer separate chains for exactly the reasons you mention -- their needs are slightly different and I can make them ultra simple if they are separate. To me validation inspects values and builds up error messages. Filtration modifies values. Those are two very different things to me.
The Ninja Space Goat wrote:
ever since they deprecated Zend_Filter_Input
When did this happen, and why? :?
Because it was a giant, steaming, monolithic, procedural piece of crap! :)
Indeed, your filter and validator classes initially got me started down this path many months ago. I see where your coming from but like they say, if your zebra don't try and change your stripes.. or something to that effect. Reguardless, your right this does complicate things.. and now I am suffering for it :lol:
Maugrim_The_Reaper wrote:I think Flags should be simple constants - does having them as classes pose any particular advantage? Filters should be attached and run as a priority since they adjust input to minimise user errors. I'd also check exactly how many filters are required. It's possible you could manage this internally by knowing what type of data is being received (i.e. trim as part of validation where a "clean" value is extracted but leaving the raw value untouched). Is there any other type of filtering you intend adding?
The only reason I made them a class was because of the unforseen circumstances. I know last chain validator I always ran into shortfalls of the design. Looking back on this now, I think that idea was silly.. I like the idea of a constant, but again I would like some input as to what you guys would like to see in the API (not that you will be using this code.. probably already have your own filtering megahogs already finished :P)

Code: Select all

$validators[0]->attach('foo', new Northern_Validator_Rule_NotNull(), 'error message', OPTIONAL);
Or something to this effect? Seems like were starting to have a lot of paramaters being passed, cluttering the code. Hrm..

The idea of checking running the filters is actually the opposite I had in mine. Filters to me have a lot more responsibilities than one would imagine. For instance, lets say I'm doing a file uploading form.. I'd have something like

Code: Select all

$validators[0] = new Northern_Validator('fileupload', $this->getInvokeArg('files'));
$validators[0]->attach('foo', new Northern_Validator_Rule_Files_NotNull(), 'field cannot be empty');
$validators[0]->attach('foo', new Northern_Validator_Filter_Files_Trim());
$validators[0]->attach('foo', new Northern_Validator_Filter_Files_MoveFile($path1, $path2));	
//etc etc ...
Notice the MoveFile. If this is run prior to the rules then the thing would explode if the rules failed. I think I remember the Zend_Proposal using constants to set the priority of the order of attachments.
Maugrim_The_Reaper wrote:Is there any other type of filtering you intend adding?
Yes, all kinds of sorts. There will be about a dozen or so "standard" filters, and of course others can be added on a per application basis. This also brings me back to the reflection part, which I'd like to avoid completely so I can include filter objects from anywhere in the filesystem, not the the just the default directory.
Maugrim_The_Reaper wrote: After this you're into validation which is the backbone of the chain. At this stage the order of Validators is unimportant though obviously it's a good point to put NotNull/Optional at the start since presumably it's the fastest and simplest one. Once you have a interface for Validators (as opposed to filter/flags) you're at the point where a helper class should be capable of manufacturing a chain. This is usually my preference - I absolutely dislike defining chains inside the source code since its so repetitive, so a Factory and a set of config files is where I aim.
I think your going to have to explain this a bit more :oops: Kind of wen't over my head. You mean have all your validations predefined through a configuration file, read that configuration file, and finally factory and manufactor the chain?

But thanks alot for the replys so far guys, much appreciated :) Keep em coming.
User avatar
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

Maugrim_The_Reaper wrote:After this you're into validation which is the backbone of the chain. At this stage the order of Validators is unimportant though obviously it's a good point to put NotNull/Optional at the start since presumably it's the fastest and simplest one. Once you have a interface for Validators (as opposed to filter/flags) you're at the point where a helper class should be capable of manufacturing a chain. This is usually my preference - I absolutely dislike defining chains inside the source code since its so repetitive, so a Factory and a set of config files is where I aim.
I guess my point was that if the Validation/Filter system is sufficiently cohesive it should be possible to delegate a chain's construction to a Factory or similar using only a configuration file. I'd rate this as one of the design objectives that point to a truly clean interface - since it's simple to the point that it can be automated (no developer manual coding required). Check out (with a pinch or two of salt) the Symfony framework which does something a bit like this using YAML configuration files.

In case you're not familiar with YAML, it's one of those formats PHP developers seem to pass over in favour of XML or INI. Yet it has a perfect fit to certain tasks which are better left in plain text with minimal markup. A sample YAML file might look like (plagiarising this from Symfony's form validation example ;)):

Code: Select all

fields:
  name:
    required:
      msg:       The name field cannot be left blank
    sfStringValidator:
      min:       2
      min_error: This name is too short (2 characters minimum)
      max:       100
      max_error: This name is too long. (100 characters maximum)
Simple enough to understand even if not familiar with YAML. You can do the exact same in XML, but YAML IMO is more readable for simple configuration files. Main reason I haven't used it yet is because I don't have a personal open source PHP5 parser yet.

After that detour...
Jenk wrote:Obviously this sucks because a) you have to follow the naming convention (Northern_Validator_Rule_NotNull(), Northern_Validator_Filter_Trim())),
b) reflection is used on such a trivial task.
Why not split the class into the three types? If each has a slightly varying interface - will filters filter() or validate() or...? This (at a stretch since a use case is lacking) might suggest each class type would have varying Interfaces you can match using "instanceof". One advantage here is the possibility of shortening the class name - though in the automated scenario that matters far less. Failing that you can fall back on get_class() instead of Reflection ;). I'd suggest the Interfaces initially until it became obvious or certain it provided no benefit.
I have looked at it awhile ago when the proposal was fresh.. I guess its been awhile. Thanks for the tip Maugrim.
As consequence would have it, it was posted today that some solution is actually on a roadmap somewhere for post-1.0. That could still be months away of course :(. I know the 1.0 focus is worth the delay but I can't see anyone relying on the ZF until the next iteration 1.1 is churned out - it's way too manual in nature.
Jenk wrote:Or something to this effect? Seems like were starting to have a lot of paramaters being passed, cluttering the code. Hrm..
I'm trying to think if "Foo" should be an object itself - removes the repetitive parameter and moves the attach() method to a slightly lower place (per field). Optional, in the absence of a constant, could be a simple:

Code: Select all

$validators[0]->optional('foo');
As a flag it isn't really "attaching" which suggest pushing to a separate method. This, I think, negates needing to apply a NotNull rule - it's defunct if the field is optional to start with. Null should an implicit check anyway where a failure sparks verification of the "optional" property.

Is MoveFile part of validation though? Granted it's necessary, but it seems more of a post-validation step that can quickly get bogged down in the various file storage scenarios - to a directory/database/other?
User avatar
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Post by John Cartwright »

Thanks for the comments thus far.. I'll see how creating the string parser goes..

P.S. I don't think it was Jenk you were quoting :wink:
User avatar
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

My brain was still on the detour ;).
User avatar
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Post by John Cartwright »

I'm really really enjoying the automation of having yalm configuration settings to determine the validation. Although I do see some shortfalls of having the ruleset as a parsed ruleset and not coding directly, but I think the trade off is worth it.

I have come across a technical issue and I think I need to retrace myself a bit and rethink the design. The problem is, I have this one Validator rule, called Compare. It essentially will compare two field to make sure they are identical.. Well how I have things setup is the function below will loop each field, and pass the parameters associated with it. Heres an example:

Code: Select all

method:         get

fields:
  foo:
    validator: 
      compare:
        comparefield: bar
        compare_error: Your passwords do not match
Now I'm not sure if even showing you this is helpful, but I'm not even sure how to tackle this problem effectively. To the point, I need to be able to pass a fields value into another fields ruleset to perform the comparison. So in the case of my ruleset, I want to compare the field 'foo' and 'bar'.

How I set it up right now I simply check for the 'comparefield' attribute and fetch prepend it's value to another attribute called 'comparevalue'. This functionality definantly does not belong in the Northern_Validator class (this class) which manages the rulesets.

So now my question, how do you recommend I have the $source (a data transfer object containing $_POST, $_REQUEST, etc) available to all of the individual rule objects. I don't neccesarily want to pass the entire data transfer object, because every rule will then have to respecificy which field it belongs to.

Perhaps some sort of static function dynamically returning the data transfer object?

Code: Select all

class Northern_Validator
{

      //snip

       /**
	 * Process the ruleset of each individual field.
	 * Provide simple checks to make sure the ruleset has appropriate values.
	 * Upon each rule failure, add error message to error queue. 
	 *
	 * @access protected
	 * @throws Northern_Validator_Exception
	*/	
	protected function _disptachRuleset()
	{
		/**
		 * Sanity check to assure a yalm ruleset has been set
		*/
		if (empty($this->yaml) || !count($this->yaml)) {
			throw new Northern_Validator_Exception('Ruleset is empty');
		}

		/**
		 * Sanity check to assure we infact have a request method was set
		*/
		if (!isset($this->yaml['method'])) { 
			throw new Northern_Validator_Exception('Source method is not defined in yaml ruleset');
		}
		
		foreach ($this->yaml['fields'] as $fieldname => $attributes) { 
			/** 
			 * Reload source incase it was overrided 
			*/
			$source = $this->getData($this->yaml['method']);
			
			/** 
			 * Allow override of globally assigned method 
			*/
			if (isset($attributes['method'])) {
				$source = $this->getData($attributes['method']);			
			}
			
			if (isset($attributes['validator']) && count($attributes['validator'])) {
				foreach ($attributes['validator'] as $validator => $parameters) {
					if (!$this->_validatorExists($validator)) {
						throw new Northern_Validator_Exception('Unknown validator "'. (string)$validator .'" was used');
					}
					
					/**
					 * If our validation requires an additional field value, then supply it
					*/
					if (isset($parameters['comparefield'])) {
						$parameters['comparevalue'] = $source->get($parameters['comparefield']);
					}

					/** 
					 * Factory and process the validator, if the validator fails supply
					 * the error message in the yaml docucement
					*/
					$object = $this->_factoryValidator($validator, $parameters);
					if (!$object->isValid()) {
						$this->addError($fieldname, $object->getErrors());
					}
				}
			}
		}
	}

        //snip

}
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

Jcart wrote:So now my question, how do you recommend I have the $source (a data transfer object containing $_POST, $_REQUEST, etc) available to all of the individual rule objects. I don't neccesarily want to pass the entire data transfer object, because every rule will then have to respecificy which field it belongs to.
One of the main reasons why I pass a container to the Validator, rather than a value, is for compare. You can do everything else but that. I recall that I figured out a way to do compare without it, but that idea got lost somewhere and I have never had the need to find it. ;)
(#10850)
User avatar
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

I'm really really enjoying the automation of having yalm configuration settings to determine the validation. Although I do see some shortfalls of having the ruleset as a parsed ruleset and not coding directly, but I think the trade off is worth it.
Glad you're finding it useful ;). Since it's a level of automation it's always going to be imperfect for some requirements so you'll find a few edge cases where some manual intervention is needed but hopefully not many or frequently. For most scalar values it works fine. I wonder if I should resurrect a proposal for a Zend_Config_Yaml...
Now I'm not sure if even showing you this is helpful, but I'm not even sure how to tackle this problem effectively. To the point, I need to be able to pass a fields value into another fields ruleset to perform the comparison. So in the case of my ruleset, I want to compare the field 'foo' and 'bar'.
The simplest solution is to give the ruleset access to the entire GET/POST container. Likely this introduces another (optional) parameter?, so you might need to work backwards to find a place to inject the container where it requires less fine grained passing so the default behaviour is that all rulesets have access to the full container, but accept a specific key when called. This does make rules aware of data they don't require, but that's a minor leakage of information which has little adverse impact when its bound to a strict interface.

I think that made sense - I may be in a vague mood today after the St. Paddys Day weekend ;).
User avatar
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Post by John Cartwright »

I forgot to ask in my last post, when loading each of the rules for each field, what do you think about having a simple check like this to determine whether to fetch another field. Simply adding the comparefield in the yaml document will trigger this.

Code: Select all

/**
                                         * If our validation requires an additional field value, then supply it
                                        */
                                        if (isset($parameters['comparefield'])) {
                                                $parameters['comparevalue'] = $source->get($parameters['comparefield']);
                                        }
And thanks for your comments thus far, I'll absorb everything and see what I can come up with.
And Maugrim, that Zend_Config_Yaml proposal would be fantastic.
User avatar
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

A simple check could work - so long as it's not manual intervention (automative) it should be fine if consistent.
And Maugrim, that Zend_Config_Yaml proposal would be fantastic.
I actually vaguely believed the ZF already had something for YAML so I did a little digging. I use YAML for quite a bit of configuration, and Ruby 1.8.0 has YAML via the Syck extension (there's a little known binding for PHP5 too). After checking on the ZF fw-general mailing list it looks like the closest the ZF came to YAML support was a suggestion back in the Summer that was never finished. I fired off a few emails to Matthew O'Phinney to see if a proposal would be worth making so I'll see what happens later in the week.

This is all your fault by the way...;). I can't deal with a non-configurable chain filter - I needs me YAML files!
User avatar
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Post by John Cartwright »

Well Maugrim, you have another loyal fan out there.
Post Reply