The Corner Stone of Security Data Validation [CLOSED]

rufio1717
Forum Newbie
Posts: 13
Joined: Fri Feb 12, 2010 3:56 pm

Post by rufio1717 »

My question: What are the downsides of using a data dictionary like the one I have built?
My question: What methods would you add to my validate class?

The validation of data is considered by many to be the cornerstone of security. My proposed solution follows the ideas behind the validation classes of Tony Marston and CodeIgniter. I liked Tony Marston's ideas (though his code is not up to date) and I like CodeIgniter's code, so I decided to combine them.

The idea is simple:
1. Have a file that stores information about every table and field in an array called schema, and assign attributes to each of the fields.
2. Have a class that validates data before inserts and updates to make sure the values satisfy the field attributes.
So you can do something like...

Code: Select all

$clean = $validate->pre_insert($data,'table_name');
if($clean !== FALSE) {
	//insert $clean into the database here
}else{
	echo $validate->errors();
}
**NOTE - I did not include an xss_clean method as it would have been too big. The best I have seen is in CodeIgniter: $this->input->xss_clean($data); To add another method like xss_clean, simply extend the validate class and add the method.**
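
The insert step in the snippet above is left open. One way it could be filled in, sketched under the assumption that the cleaned array is keyed by column name and that named placeholders are used; `build_insert()` is a hypothetical helper and not part of the validate class:

```php
<?php
// Hypothetical helper: turns a validated row into a parameterized INSERT.
// Table and column names are only accepted if they exist in the schema array,
// because identifiers cannot be bound as parameters.
function build_insert($table, array $clean, array $schema)
{
    if (!isset($schema[$table])) {
        throw new InvalidArgumentException("Unknown table: $table");
    }

    $columns = array();
    $params  = array();
    foreach ($clean as $field => $value) {
        // Skip anything the schema does not know about.
        if (!isset($schema[$table][$field])) {
            continue;
        }
        $columns[] = $field;
        $params[':' . $field] = $value;
    }

    if (count($columns) === 0) {
        throw new InvalidArgumentException('No known columns to insert');
    }

    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES (%s)',
        $table,
        implode(', ', $columns),
        implode(', ', array_keys($params))
    );

    return array($sql, $params);
}

$schema = array('users' => array('id' => 'whole_number|required', 'name' => 'trim|required'));
$row    = array('id' => '42', 'name' => 'rufio', 'evil' => 'x');

list($sql, $params) = build_insert('users', $row, $schema);
echo $sql, "\n"; // INSERT INTO users (id, name) VALUES (:id, :name)
```

The pair can then be handed to PDO, for example `$pdo->prepare($sql)->execute($params)`, so the values are never concatenated into the SQL string.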

Okay Here we Go...

filename: data_dictionary.php

Code: Select all

<?php
/*
* The schema array holds the table name, field name, and attributes
* attributes are separated by a pipe
* you can use any native php function that expects one parameter
*
* example 1: $schema['table_name']['field_name'] = "attribute_1|attribute_2|attribute_3";
*
* example 2: $schema['users']['id'] = "whole_number|required|min_length[2]";
* example 3: $schema['users']['status'] = "enum[active|inactive|deleted]|min_length[5]|max_length[8]";
* example 4: $schema['users']['name'] = "trim|strtolower|min_length[8]|max_length[14]|required";
*
* ATTRIBUTE LIST
* ------------------
* required
* valid_float
* valid_number
* max_length[n]
* min_length[n]
* valid_datetime
* valid_email
* whole_number
* white_list[pattern]
* enum[x1|x2|x3]  //do NOT encapsulate elements in quotes (' ") each element is separated by a pipe |
* blacklist[pattern] the pattern must be a regular expression. If a match is found false is returned.
*/

 //example tables

$schema['users']['id'] = "whole_number|required|min_length[2]";
$schema['users']['status'] = "enum[active|inactive|deleted]|min_length[5]|max_length[8]";
$schema['users']['name'] = "trim|strtolower|min_length[8]|max_length[14]|required";

$schema['file']['id'] = "trim|min_length[8]|max_length[14]|required|whole_number";
$schema['file']['name'] = "min_length[8]|max_length[14]|required";
$schema['file']['path'] = "trim|strtolower|min_length[8]|max_length[32]|required|md5";

//end of file data_dictionary
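
Because both the rule list and the enum elements use a pipe, a plain `explode('|', ...)` would cut `enum[x1|x2|x3]` into three pieces. A minimal sketch of a splitter that leaves pipes inside square brackets alone; `split_rules()` is a hypothetical name, and the sketch assumes brackets are never nested:

```php
<?php
// Hypothetical helper: split a rule string on pipes that are NOT inside [...].
// Assumes brackets are not nested and every "[" has a matching "]".
function split_rules($specs)
{
    // A pipe is a separator unless a "]" follows before any other bracket.
    $rules = preg_split('/\|(?![^\[\]]*\])/', $specs);

    $parsed = array();
    foreach ($rules as $rule) {
        $rule = trim($rule);
        if ($rule === '') {
            continue;
        }
        $param = null;
        // Separate "name[param]" into its two parts.
        if (preg_match('/^(\w+)\[(.*)\]$/', $rule, $m)) {
            $rule  = $m[1];
            $param = $m[2];
        }
        $parsed[] = array($rule, $param);
    }
    return $parsed;
}

$parsed = split_rules('enum[active|inactive|deleted]|min_length[5]|required');
print_r($parsed);
```

Each entry is a (rule, parameter) pair, so the enum elements arrive at the enum rule as a single `active|inactive|deleted` string.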

Code: Select all

<?php 
//include the data dictionary containing the schema array
include("data_dictionary.php");
/**
* CLASS VALIDATE
*
* METHODS
* ---------------
* __construct
* pre_insert
* pre_update
* valid_entry
* valid_column
* valid_table 
* max_length
* min_length
* valid_datetime
* valid_email
* whole_number
* enum
* blacklist[pattern] 
*/

class validate
{
	private $schema; 
	private $clean = array();
	private $errors = array(); 
	public $error_prefix = '<p>';
	public $error_suffix = '</p>';
	
public function __construct($schema)  
{
	//load the schema
	$this->schema = $schema;
}

//-------------------------------------------------------------


/** 
* pre_insert this function is called before inserting data into the database
* @access public 
* @param array 
* @param string
* @return array|FALSE
*/

//-------------------------------------------------------------

public function pre_insert($input = array(),$table = NULL)
{
	$schema = $this->schema;
	$this->errors = array(); //start fresh on every call
	$this->clean = $input; //don't worry we'll clean it
	
	$this->valid_table($table);

	foreach($schema[$table] as $field => $specs)
	{
		//dealing with isset rules
		if (!isset($input[$field]))
		{
			$rules = explode('|', $specs);
						
			if(in_array('isset',$rules,TRUE) || in_array('required',$rules,TRUE))
			{
				//the value is missing and this is not allowed
				$this->errors[] = "The $field field is required";
			}
			continue; // skip to the next iteration 
			//we skip to the next iteration to avoid the $input[$field] not defined error
			//and there is no need to check any rules since the value is missing
		}
			$this->valid_entry($field,$input[$field],$specs);
	}

	//reject input fields that are not part of the table
	foreach(array_keys($input) as $field)
	{
		if(!$this->valid_column($field,$table))
		{
			$this->errors[] = "The $field field does not exist in $table";
		}
	}
	
	//check to see if there were any errors
	$total_errors = count($this->errors);
	
	if($total_errors == 0) return $this->clean;
	
	return FALSE;
	
}
//-------------------------------------------------------------


/** 
* pre_update this function is called before updating data in the database
* notice the difference from pre_insert: only the fields present in the input are checked
* @access public 
* @param array 
* @param string
* @return array|FALSE
*/
//-------------------------------------------------------------



public function pre_update($input = array(),$table = NULL)
{
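	$schema = $this->schema;
	$this->errors = array(); //start fresh on every call
	$this->clean = $input; //don't worry we'll clean it
	
	$this->valid_table($table);

	foreach($input as $field => $value)
	{
			//reject fields that are not part of the table
			if(!$this->valid_column($field,$table))
			{
				$this->errors[] = "The $field field does not exist in $table";
				continue;
			}
			$this->valid_entry($field,$value,$schema[$table][$field]);
	}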
	
	//check to see if there were any errors
	$total_errors = count($this->errors);
	
	if($total_errors == 0) return $this->clean;
	
	return FALSE;
}
//-------------------------------------------------------------


/**
* checks to see if all schema attributes are met
* @access public
* @param string
* @param mixed
* @param string
* @return NULL
*/

public function valid_entry($key,$value,$fieldspecs)
{
	//split on pipes, but not on pipes inside [...] so enum[x1|x2|x3] stays in one piece
	$rules = preg_split('/\|(?![^\[\]]*\])/', $fieldspecs);
	//cycle through the rules
	foreach($rules as $rule):
		// Strip the parameter (if exists) from the rule
		// Rules can contain a parameter: max_length[5]
		$rule = trim($rule);
		$param = FALSE;
		if (preg_match("/(.*?)\[(.*?)\]/", $rule, $match))
		{
			$rule	= $match[1];
			$param	= $match[2];
		}

		
		if(!method_exists($this,$rule))
		{
			//validator doesn't contain the rule
			//check to see if there is a native php function
			if(function_exists($rule))
			{
				//keep the transformed value so the next rule sees it (trim|strtolower)
				$value = $rule($value);
				$this->clean[$key] = $value; 
			}
		
		}
		else
		{
			$result  = $this->$rule($value,$param);
			
			//the test failed so we build the error message
			//REPLACE - This should be replaced with a lang file
			if($result === FALSE)
			{
				$this->errors[] = "The \"$key\" field cannot have a value of \"$value\" because rule \"$rule\" disallows this";
			}
		}
			
	endforeach; // end rule cycle
	
}
// --------------------------------------------------------------------

/**
* checks to see if the column exists
* @access public
* @param string
* @return boolean
*/
public function valid_column($field,$table)
{
	$schema = $this->schema;
	if(isset($schema[$table][$field])) return TRUE;
	return FALSE;
}
//-------------------------------------------------------------



/**
* checks to see if the table exists
* @author Stefan Gehrig stackoverflow.com
* @access public
* @param string
* @return boolean
*/
public function valid_table($table)
{
	$schema = $this->schema;
	
	if(isset($schema[$table])) return TRUE;
	
		printf('Table "%s" was not found', $table);
		die();
}
//-------------------------------------------------------------


/**
* get errors
* @access public
* @return string|NULL
*/
public function errors()
{
	$errors = $this->errors;
	
	if(count($errors) > 0)
	{
		$error_message = '';
		foreach($errors as $error)
		{
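			//the messages contain user input, so escape them before they reach the page
			$error_message .= $this->error_prefix.htmlspecialchars($error, ENT_QUOTES, 'UTF-8').$this->error_suffix;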
		}		
		return $error_message;
	}

	return NULL;
}
//-------------------------------------------------------------
	
/**
* Set The Error Delimiter
*
* Permits a prefix/suffix to be added to each error message
* @author   http://codeigniter.com
* @access	public
* @param	string
* @param	string
* @return	void
*/	
function set_error_delimiters($prefix = '<p>', $suffix = '</p>')
{
	$this->error_prefix = $prefix;
	$this->error_suffix = $suffix;
}	


// --------------------------------------------------------------------


/**
* Begin validation helpers
* These functions are used internally to split up the code
*
*/


/** 
* Checks to see if it's larger than the maximum allowed length
* @access public 
* @param string
* @param string 
* @return boolean 
*/ 
public function max_length($str,$max_length)
{
	if(strlen($str) > $max_length) return FALSE;
	return TRUE;
}
//-------------------------------------------------------------


/** 
* Checks to see if it's shorter than the minimum allowed length
* @access public 
* @param string 
* @param string 
* @return boolean 
*/ 
public function min_length($str,$min_length)
{
	if(strlen($str) < $min_length) return FALSE;
	return TRUE;
}
//-------------------------------------------------------------


/** 
* Checks to see if in valid datetime format (YYYY-MM-DD HH:MM:SS)
* @author mk - stackoverflow.com member
* @access public 
* @param string 
* @return boolean 
*/ 

public function valid_datetime($date_time)
{
	if (preg_match("/^(\d{4})-(\d{2})-(\d{2}) ([01][0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])$/", $date_time, $matches)) {
		if (checkdate($matches[2], $matches[3], $matches[1])) {
			return true;
		}
	}
	
	return false;
}
//-------------------------------------------------------------


/** 
* Checks to see if it is a valid email (syntax plus a DNS lookup of the domain)
* @author UNKNOWN
* @access public 
* @param string 
* @return boolean 
*/ 
public function valid_email($email)
{
   $isValid = true;
   $atIndex = strrpos($email, "@");
   if (is_bool($atIndex) && !$atIndex)
   {
      $isValid = false;
   }
   else
   {
      $domain = substr($email, $atIndex+1);
      $local = substr($email, 0, $atIndex);
      $localLen = strlen($local);
      $domainLen = strlen($domain);
      if ($localLen < 1 || $localLen > 64)
      {
         // local part length exceeded
         $isValid = false;
      }
      else if ($domainLen < 1 || $domainLen > 255)
      {
         // domain part length exceeded
         $isValid = false;
      }
      else if ($local[0] == '.' || $local[$localLen-1] == '.')
      {
         // local part starts or ends with '.'
         $isValid = false;
      }
      else if (preg_match('/\\.\\./', $local))
      {
         // local part has two consecutive dots
         $isValid = false;
      }
      else if (!preg_match('/^[A-Za-z0-9\\-\\.]+$/', $domain))
      {
         // character not valid in domain part
         $isValid = false;
      }
      else if (preg_match('/\\.\\./', $domain))
      {
         // domain part has two consecutive dots
         $isValid = false;
      }
      else if (!preg_match('/^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$/',
                 str_replace("\\\\","",$local)))
      {
         // character not valid in local part unless 
         // local part is quoted
         if (!preg_match('/^"(\\\\"|[^"])+"$/',
             str_replace("\\\\","",$local)))
         {
            $isValid = false;
         }
      }
      if ($isValid && !(checkdnsrr($domain,"MX") || checkdnsrr($domain,"A")))
      {
         // domain not found in DNS
         $isValid = false;
      }
   }
   return $isValid;
}
//-------------------------------------------------------------



/** 
* checks that the value is numeric and has no fractional part
* @access public 
* @param string 
* @return boolean 
*/
public function whole_number($int)
{
	if(is_numeric($int)===TRUE && (int)$int == $int)
	{
		return TRUE;
	}
	return FALSE;
}
//-------------------------------------------------------------



/** 
* uses regular expressions to check data; the value must match the pattern
* @access public 
* @param string 
* @param string 
* @return boolean 
*/
public function white_list($str,$pattern)
{
	if(preg_match($pattern,$str) === 1) return TRUE;
	return FALSE;
}


//-------------------------------------------------------------

/** 
* check to see if the string is in the set of enums
* @access public 
* @param string 
* @param string 
* @return boolean 
*/

public function enum($str,$enum)
{
	$enum_array = explode('|',$enum);
	//strict comparison so that 0 or NULL never matches a string element
	if(in_array((string)$str,$enum_array,TRUE)) return TRUE;
	return FALSE;
}
//-------------------------------------------------------------

/** 
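* check to see if the string matches a blacklisted pattern
* @access public 
* @param string 
* @param string 
* @return boolean 
*/

public function blacklist($str,$pattern)
{
	//=== 0 so that a broken pattern (preg_match returns FALSE) does not pass
	if(preg_match($pattern,$str) === 0) return TRUE;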
	return FALSE;
}
//-------------------------------------------------------------
}/*end of class validate*/
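
Several of the helpers above overlap with checks that PHP ships with. As a rough sketch (standalone functions with hypothetical `native_` names, not a drop-in replacement for the class methods), the same ideas can be expressed with `filter_var()`, `ctype_digit()` and `DateTime::createFromFormat()`:

```php
<?php
// Hypothetical standalone variants of three helpers, built on PHP built-ins.

// Syntax check only; unlike valid_email() above it performs no DNS lookup.
function native_valid_email($email)
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

// Accepts only strings made of the digits 0-9 (no sign, decimals or exponent).
function native_whole_number($value)
{
    return is_string($value) && ctype_digit($value);
}

// Round-trips the string through DateTime so impossible dates are rejected.
function native_valid_datetime($value, $format = 'Y-m-d H:i:s')
{
    $dt = DateTime::createFromFormat($format, $value);
    return $dt !== false && $dt->format($format) === $value;
}

var_dump(native_valid_email('user@example.com'));        // bool(true)
var_dump(native_whole_number('12.5'));                   // bool(false)
var_dump(native_valid_datetime('2010-02-30 10:00:00'));  // bool(false)
```

Whether the DNS lookup in valid_email() is worth the network round trip is a design choice; the syntax-only check trades that certainty for speed and for not failing when DNS is unavailable.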

Edits:
pre_insert and pre_update now return arrays
Last edited by rufio1717 on Wed Jul 07, 2010 2:17 pm, edited 2 times in total.
kaisellgren
DevNet Resident
Posts: 1675
Joined: Sat Jan 07, 2006 5:52 am
Location: Lahti, Finland.

Re: The Corner Stone of Security Data Validation

Post by kaisellgren »

That looks complicated. You are better off with HTMLPurifier than xss_clean(), which is not very good for security.
rufio1717
Forum Newbie
Posts: 13
Joined: Fri Feb 12, 2010 3:56 pm

Re: The Corner Stone of Security Data Validation

Post by rufio1717 »

kaisellgren wrote:That looks complicated. You are better off with HTMLPurifier than xss_clean(), which is not very good for security.

HTMLPurifier is not a complete solution. I actually planned on incorporating HTMLPurifier into the Validate class, but decided xss_clean was just a lot simpler to implement. I will have to reevaluate that decision. A whitelist approach is the preferred strategy for data validation: http://www.owasp.org/index.php/Data_Validation. The point of the Validate class is to allow the addition of methods such as valid_areacode or alpha_numeric. Also, I am interested in what looks so complicated. The schema is just an array and the Validate class is pretty darn simple as well. The only method that does any real work is the valid_entry() method. I would really appreciate some feedback. Thanks
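
To make the extension point concrete, here is a minimal sketch of adding alpha_numeric and valid_areacode rules by subclassing. The `base_validate` stub stands in for the validate class so that the snippet runs on its own, and the three-digit area code format is an assumption made for illustration:

```php
<?php
// Stub standing in for the validate class; only the dispatch idea is kept.
class base_validate
{
    // Runs one named rule against a value; unknown rules count as failures here.
    public function check($rule, $value, $param = FALSE)
    {
        if (!method_exists($this, $rule)) {
            return FALSE;
        }
        return $this->$rule($value, $param) !== FALSE;
    }
}

// New rules are just extra methods that return TRUE or FALSE.
class my_validate extends base_validate
{
    // Whitelist: letters and digits only, at least one character.
    public function alpha_numeric($str)
    {
        return preg_match('/^[A-Za-z0-9]+$/D', $str) === 1;
    }

    // Assumed format: exactly three digits, e.g. "415".
    public function valid_areacode($str)
    {
        return preg_match('/^[0-9]{3}$/D', $str) === 1;
    }
}

$v = new my_validate();
var_dump($v->check('alpha_numeric', 'abc123'));   // bool(true)
var_dump($v->check('alpha_numeric', 'abc 123'));  // bool(false)
var_dump($v->check('valid_areacode', '415'));     // bool(true)
```

With rules added this way, a schema entry such as "trim|alpha_numeric|required" would pick up the new method by name.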
kaisellgren
DevNet Resident
Posts: 1675
Joined: Sat Jan 07, 2006 5:52 am
Location: Lahti, Finland.

Re: The Corner Stone of Security Data Validation

Post by kaisellgren »

rufio1717 wrote:HTMLPurifier is not a complete solution.
I am not sure what you mean, but it goes a lot further than any other library I have encountered.
rufio1717 wrote:A white list approach is the preferred strategy of data validation http://www.owasp.org/index.php/Data_Validation.
True, and HTMLPurifier uses the white listing approach whereas xss_clean of CodeIgniter's Input class uses the black listing approach.
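
The difference between the two approaches can be shown with a toy example. The blacklist filter below is deliberately naive and written only for illustration (it is not how xss_clean or HTMLPurifier is implemented); next to it is a whitelist check with an assumed username format:

```php
<?php
// Naive blacklist: remove a known-bad token in a single pass. Illustration only.
function naive_blacklist($input)
{
    return str_ireplace('<script>', '', $input);
}

// Whitelist: accept only what is explicitly allowed, reject everything else.
function whitelist_username($input)
{
    return preg_match('/^[A-Za-z0-9_]{3,16}$/D', $input) === 1;
}

$payload = '<scr<script>ipt>alert(1)</script>';

// Removing the inner token reassembles the outer one.
echo naive_blacklist($payload), "\n";        // <script>alert(1)</script>
var_dump(whitelist_username($payload));      // bool(false)
var_dump(whitelist_username('rufio1717'));   // bool(true)
```

The blacklist has to anticipate every hostile variation, while the whitelist only has to describe the valid input.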
rufio1717 wrote:Also, I am interested in what looks so complicated. The schema is just an array and the Validate class is pretty darn simple as well. The only method that does any real work is the valid_entry() method. I would really appreciate some feedback. Thanks
Well I guess complexity is relative, I just had an initial feeling of it being complicated when I read code like this:

Code: Select all

$schema['file']['path'] = "trim|strtolower|min_length[8]|max_length[32]|required|md5";
and trying to understand what it really does. For example, what does that md5 do? Will it hash input? But isn't it about validation... and looking at the email validation method with numerous if's in a row makes me suspicious about simplicity.

I suggest you create some unit tests for your project. I personally like PHPUnit the most. Bugs in a security class are not welcome, which you probably already knew.
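
As a starting point, here are a few edge cases for the enum and length rules, sketched with plain PHP checks so the snippet runs without any framework; in PHPUnit each case would become an assertion inside a test method. The two functions are simplified standalone copies of the class methods, made up for this example:

```php
<?php
// Simplified standalone copies of two rules from the validate class.
function rule_enum($str, $enum)
{
    return in_array((string) $str, explode('|', $enum), TRUE);
}

function rule_max_length($str, $max_length)
{
    return strlen($str) <= (int) $max_length;
}

// Each case: description, actual result, expected result.
$cases = array(
    array('enum accepts a listed value',      rule_enum('active', 'active|inactive'), TRUE),
    array('enum rejects an unlisted value',   rule_enum('banned', 'active|inactive'), FALSE),
    array('enum rejects the integer 0',       rule_enum(0, 'active|inactive'),        FALSE),
    array('max_length accepts the boundary',  rule_max_length('abcd', '4'),           TRUE),
    array('max_length rejects one char over', rule_max_length('abcde', '4'),          FALSE),
);

$failures = 0;
foreach ($cases as $case) {
    list($name, $actual, $expected) = $case;
    if ($actual !== $expected) {
        $failures++;
        echo "FAIL: $name\n";
    }
}
echo $failures === 0 ? "all cases passed\n" : "$failures case(s) failed\n";
```

Boundary values and type juggling (such as the integer 0 against a list of strings) are the kind of cases where validation code tends to go wrong first.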
rufio1717
Forum Newbie
Posts: 13
Joined: Fri Feb 12, 2010 3:56 pm

Re: The Corner Stone of Security Data Validation

Post by rufio1717 »

* The schema array holds the table name, field name, and attributes
* attributes are separated by a pipe
* you can use any native php function that expects one parameter
md5 was just in there to indicate that you can use a native PHP function.

Thank you for all your input. I'm going to close this thread. I will refine it until it's production ready and then write a wiki.