Validating a filename

Discussions of secure PHP coding. Security in software is important, so don't be afraid to ask. And when answering: be anal. Nitpick. No security vulnerability is too small.

Moderator: General Moderators

Post Reply
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Validating a filename

Post by Ambush Commander »

Quite frankly, I'm scared of this function. Resting upon it is the entire integrity of every file on my filesystem. The last time I had to implement something like this, I went for a pure whitelist.

But, anyway, is this secure?

Code: Select all

/**
 * Validates a page name, making sure that is allowed and in proper format;
 * halts page execution if invalid
 * @param $page string page name to validate
 * @param $allowed_dirs lookup array of allowed directories, see config.default.php
 * @param $filename_chars PCRE style string of characters, see config.default.php
 */
function validate_page($page, $allowed_dirs, $filename_chars) {
    // validate the path, perhaps syntax could be more permissive
    $regex = "#(((?:[$filename_chars]+/)*)[$filename_chars]+).html#";
    $status = preg_match($regex, $page, $matches);
    if (!$status) display_error_and_quit(403);
    
    // validate directory
    if (!isset($allowed_dirs[$matches[2]])) {
        // maybe one of its parent directories had a recursive declaration
        $dir = $matches[2];
        $dirs = explode('/', $dir);
        $test_dir = '';
        $ok = false;
        foreach ($dirs as $name) {
            if ($name === '') break;
            if (
                isset($allowed_dirs[$test_dir]) &&
                $allowed_dirs[$test_dir] === 1
            ) {
                $ok = true;
                break;
            } else {
                $test_dir .= $name . '/';
            }
        }
        if (!$ok) display_error_and_quit(403);
    }
}
If this function passes, the resulting $file will be accessed and its contents output. Typical call looks like:

Code: Select all

validate_page($_GET['f'], array('' => 0, 'data/' => 1), 'a-zA-Z0-9\-_');
User avatar
Mordred
DevNet Resident
Posts: 1579
Joined: Sun Sep 03, 2006 5:19 am
Location: Sofia, Bulgaria

Post by Mordred »

I think you should consider another approach.
Use realpath() on the name, and then pathinfo() to parse it. Check if the $allowed_dirs exist.
Usually homebrewed solutions (when library variants exist) are not very successfull :)

Yours in particular is (theoretically) flawed: allowed_dir/../evil_dir/passwords.html
Of course you disallow the dot in $filename_chars, but if a user wants to have directories with . in them (which is a valid reason), then the above will suddenly begin to work. The way to deal with .. is realpath, not disallowing of dots.

Actually, if you have a symlink to another dir, the "exploit" will work even without dots. (with realpath, again, you're safe)
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Use realpath() on the name, and then pathinfo() to parse it. Check if the $allowed_dirs exist.
I previously tried implementing the function with realpath, but hit some snags. Namely, realpath fails when the file doesn't exist, and I cannot guarantee that $page exists (in fact, it probably won't exist). Any tricks around this? I suppose I could check for the corresponding source file, but that would require some major refactoring.
Yours in particular is (theoretically) flawed: allowed_dir/../evil_dir/passwords.html
Of course you disallow the dot in $filename_chars, but if a user wants to have directories with . in them (which is a valid reason), then the above will suddenly begin to work. The way to deal with .. is realpath, not disallowing of dots.
You're right. I think I'll add checks against double-dots in the directory check, that should fix it. Given the directory checks, theoretically speaking, a double dot attack would only be possible when you've recursively allowed a directory.
Actually, if you have a symlink to another dir, the "exploit" will work even without dots. (with realpath, again, you're safe)
I guess I could hope that the user was smart enough not to put any symlinks in the directory. :-/ Not good.
User avatar
WaldoMonster
Forum Contributor
Posts: 225
Joined: Mon Apr 19, 2004 6:19 pm
Contact:

Post by WaldoMonster »

Ambush Commander wrote:
Use realpath() on the name, and then pathinfo() to parse it. Check if the $allowed_dirs exist.
I previously tried implementing the function with realpath, but hit some snags. Namely, realpath fails when the file doesn't exist, and I cannot guarantee that $page exists (in fact, it probably won't exist). Any tricks around this? I suppose I could check for the corresponding source file, but that would require some major refactoring.

Code: Select all

$dir = @realpath($dir) or display_error_and_quit(403);
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

WaldoMonster wrote:

Code: Select all

$dir = @realpath($dir) or display_error_and_quit(403);
Actually, that's pretty good. Don't realpath the filename: realpath the directory. If the directory doesn't exist, I have no qualms of scratching it. After I realpath, though, I still need to check it against the allowed dirs array, which is of relative paths. Maybe realpath those too?
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

yep, realpath them too. :)
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Just discovered a really huge major security hole in the function. The regexp actually doesn't properly check the string: there's no beginning and end anchors in the expression!
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Quite true. :)
User avatar
Mordred
DevNet Resident
Posts: 1579
Joined: Sun Sep 03, 2006 5:19 am
Location: Sofia, Bulgaria

Post by Mordred »

Ambush Commander wrote:Just discovered a really huge major security hole in the function. The regexp actually doesn't properly check the string: there's no beginning and end anchors in the expression!
LOL, so much for our audit super-powers ;)
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Here's the revised code, all neatly OOPed out. A bit large though...

Code: Select all

public function __construct($path) {
        $xc = XHTMLCompiler::getInstance();
        // test file extension, must be html or xhtml
        $ext = strrchr($path, '.');
        if ($ext !== $this->sourceExt && $ext !== $this->cacheExt) {
            return display_error_and_quit(403);
        }
        $dir = dirname($path);
        // test for directory's existence, simultaneously resolve
        // to full path
        $dir = $xc->realpath($dir);
        if ($dir === false) return display_error_and_quit(403);
        $allowed_dirs = $xc->getConf('allowed_dirs');
        $ok = false;
        foreach ($allowed_dirs as $allowed_dir => $recursive) {
            $allowed_dir = $xc->realpath($allowed_dir);
            if ($allowed_dir === false) continue;
            if ($dir === $allowed_dir) {
                $ok = true;
                break;
            } elseif (strpos($dir, $allowed_dir) === 0 && $recursive) {
                $ok = true;
                break;
            }
        }
        if (!$ok) return display_error_and_quit(403);
        
        $this->pathStem = substr($path, 0, strrpos($path, '.'));
        
        if (!$xc->isFile($this->getSourcePath())) {
            // Apache may have redirected to an ErrorDocument which got directed
            // via mod_rewrite to us, in that case, output the corresponding
            // status code.  Otherwise, we can give the regular 404.
            $code = $xc->getRedirectStatus();
            if (!$code || $code == 200) $code = 404;
            display_error_and_quit($code);
        }
    }
User avatar
Mordred
DevNet Resident
Posts: 1579
Joined: Sun Sep 03, 2006 5:19 am
Location: Sofia, Bulgaria

Post by Mordred »

You have to make sure that allowed dirs end in a /, and add a / to $dir
Otherwise recursive checks for /pages/in wil succeed for /pages/incognito

Code: Select all

$allowed_dir = $xc->realpath($allowed_dir); 
if ($allowed_dir === false) continue;
This looks like it belongs to the config init (i.e. you'll be sure that $allowed_dir contains only EXISTING allowed dirs, so no need to check again. But don't change this if this object will not be used often.)

You're still not using pathinfo, does it not work for nonexistent paths?
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

You have to make sure that allowed dirs end in a /, and add a / to $dir
Otherwise recursive checks for /pages/in wil succeed for /pages/incognito
Righto. Will fix.
This looks like it belongs to the config init (i.e. you'll be sure that $allowed_dir contains only EXISTING allowed dirs, so no need to check again. But don't change this if this object will not be used often.)
That's a good point. I'm not sure the directory checking can be added to the initialization gracefully, but will try.
You're still not using pathinfo, does it not work for nonexistent paths?
No, it works for non-existent stuff.

Three hits right on the head! Thanks very much.
Post Reply