Page 1 of 1

Validating a filename

Posted: Sat Mar 03, 2007 8:53 pm
by Ambush Commander
Quite frankly, I'm scared of this function. Resting upon it is the entire integrity of every file on my filesystem. The last time I had to implement something like this, I went for a pure whitelist.

But, anyway, is this secure?

Code: Select all

/**
 * Validates a page name, making sure that is allowed and in proper format;
 * halts page execution if invalid
 * @param $page string page name to validate
 * @param $allowed_dirs lookup array of allowed directories, see config.default.php
 * @param $filename_chars PCRE style string of characters, see config.default.php
 */
function validate_page($page, $allowed_dirs, $filename_chars) {
    // validate the path, perhaps syntax could be more permissive
    $regex = "#(((?:[$filename_chars]+/)*)[$filename_chars]+).html#";
    $status = preg_match($regex, $page, $matches);
    if (!$status) display_error_and_quit(403);
    
    // validate directory
    if (!isset($allowed_dirs[$matches[2]])) {
        // maybe one of its parent directories had a recursive declaration
        $dir = $matches[2];
        $dirs = explode('/', $dir);
        $test_dir = '';
        $ok = false;
        foreach ($dirs as $name) {
            if ($name === '') break;
            if (
                isset($allowed_dirs[$test_dir]) &&
                $allowed_dirs[$test_dir] === 1
            ) {
                $ok = true;
                break;
            } else {
                $test_dir .= $name . '/';
            }
        }
        if (!$ok) display_error_and_quit(403);
    }
}
If this function passes, the resulting $file will be accessed and its contents output. Typical call looks like:

Code: Select all

validate_page($_GET['f'], array('' => 0, 'data/' => 1), 'a-zA-Z0-9\-_');

Posted: Sun Mar 04, 2007 4:38 am
by Mordred
I think you should consider another approach.
Use realpath() on the name, and then pathinfo() to parse it. Check if the $allowed_dirs exist.
Usually homebrewed solutions (when library variants exist) are not very successfull :)

Yours in particular is (theoretically) flawed: allowed_dir/../evil_dir/passwords.html
Of course you disallow the dot in $filename_chars, but if a user wants to have directories with . in them (which is a valid reason), then the above will suddenly begin to work. The way to deal with .. is realpath, not disallowing of dots.

Actually, if you have a symlink to another dir, the "exploit" will work even without dots. (with realpath, again, you're safe)

Posted: Sun Mar 04, 2007 12:07 pm
by Ambush Commander
Use realpath() on the name, and then pathinfo() to parse it. Check if the $allowed_dirs exist.
I previously tried implementing the function with realpath, but hit some snags. Namely, realpath fails when the file doesn't exist, and I cannot guarantee that $page exists (in fact, it probably won't exist). Any tricks around this? I suppose I could check for the corresponding source file, but that would require some major refactoring.
Yours in particular is (theoretically) flawed: allowed_dir/../evil_dir/passwords.html
Of course you disallow the dot in $filename_chars, but if a user wants to have directories with . in them (which is a valid reason), then the above will suddenly begin to work. The way to deal with .. is realpath, not disallowing of dots.
You're right. I think I'll add checks against double-dots in the directory check, that should fix it. Given the directory checks, theoretically speaking, a double dot attack would only be possible when you've recursively allowed a directory.
Actually, if you have a symlink to another dir, the "exploit" will work even without dots. (with realpath, again, you're safe)
I guess I could hope that the user was smart enough not to put any symlinks in the directory. :-/ Not good.

Posted: Sun Mar 04, 2007 1:27 pm
by WaldoMonster
Ambush Commander wrote:
Use realpath() on the name, and then pathinfo() to parse it. Check if the $allowed_dirs exist.
I previously tried implementing the function with realpath, but hit some snags. Namely, realpath fails when the file doesn't exist, and I cannot guarantee that $page exists (in fact, it probably won't exist). Any tricks around this? I suppose I could check for the corresponding source file, but that would require some major refactoring.

Code: Select all

$dir = @realpath($dir) or display_error_and_quit(403);

Posted: Sun Mar 04, 2007 1:55 pm
by Ambush Commander
WaldoMonster wrote:

Code: Select all

$dir = @realpath($dir) or display_error_and_quit(403);
Actually, that's pretty good. Don't realpath the filename: realpath the directory. If the directory doesn't exist, I have no qualms of scratching it. After I realpath, though, I still need to check it against the allowed dirs array, which is of relative paths. Maybe realpath those too?

Posted: Sun Mar 04, 2007 6:09 pm
by feyd
yep, realpath them too. :)

Posted: Sun Mar 04, 2007 6:27 pm
by Ambush Commander
Just discovered a really huge major security hole in the function. The regexp actually doesn't properly check the string: there's no beginning and end anchors in the expression!

Posted: Sun Mar 04, 2007 6:46 pm
by feyd
Quite true. :)

Posted: Mon Mar 05, 2007 2:49 am
by Mordred
Ambush Commander wrote:Just discovered a really huge major security hole in the function. The regexp actually doesn't properly check the string: there's no beginning and end anchors in the expression!
LOL, so much for our audit super-powers ;)

Posted: Mon Mar 05, 2007 2:04 pm
by Ambush Commander
Here's the revised code, all neatly OOPed out. A bit large though...

Code: Select all

public function __construct($path) {
        $xc = XHTMLCompiler::getInstance();
        // test file extension, must be html or xhtml
        $ext = strrchr($path, '.');
        if ($ext !== $this->sourceExt && $ext !== $this->cacheExt) {
            return display_error_and_quit(403);
        }
        $dir = dirname($path);
        // test for directory's existence, simultaneously resolve
        // to full path
        $dir = $xc->realpath($dir);
        if ($dir === false) return display_error_and_quit(403);
        $allowed_dirs = $xc->getConf('allowed_dirs');
        $ok = false;
        foreach ($allowed_dirs as $allowed_dir => $recursive) {
            $allowed_dir = $xc->realpath($allowed_dir);
            if ($allowed_dir === false) continue;
            if ($dir === $allowed_dir) {
                $ok = true;
                break;
            } elseif (strpos($dir, $allowed_dir) === 0 && $recursive) {
                $ok = true;
                break;
            }
        }
        if (!$ok) return display_error_and_quit(403);
        
        $this->pathStem = substr($path, 0, strrpos($path, '.'));
        
        if (!$xc->isFile($this->getSourcePath())) {
            // Apache may have redirected to an ErrorDocument which got directed
            // via mod_rewrite to us, in that case, output the corresponding
            // status code.  Otherwise, we can give the regular 404.
            $code = $xc->getRedirectStatus();
            if (!$code || $code == 200) $code = 404;
            display_error_and_quit($code);
        }
    }

Posted: Mon Mar 05, 2007 3:39 pm
by Mordred
You have to make sure that allowed dirs end in a /, and add a / to $dir
Otherwise recursive checks for /pages/in wil succeed for /pages/incognito

Code: Select all

$allowed_dir = $xc->realpath($allowed_dir); 
if ($allowed_dir === false) continue;
This looks like it belongs to the config init (i.e. you'll be sure that $allowed_dir contains only EXISTING allowed dirs, so no need to check again. But don't change this if this object will not be used often.)

You're still not using pathinfo, does it not work for nonexistent paths?

Posted: Mon Mar 05, 2007 7:05 pm
by Ambush Commander
You have to make sure that allowed dirs end in a /, and add a / to $dir
Otherwise recursive checks for /pages/in wil succeed for /pages/incognito
Righto. Will fix.
This looks like it belongs to the config init (i.e. you'll be sure that $allowed_dir contains only EXISTING allowed dirs, so no need to check again. But don't change this if this object will not be used often.)
That's a good point. I'm not sure the directory checking can be added to the initialization gracefully, but will try.
You're still not using pathinfo, does it not work for nonexistent paths?
No, it works for non-existent stuff.

Three hits right on the head! Thanks very much.