Document Based System has Unacceptable Load Time

This forum is not for 'how-to' coding questions but for PHP theory; it is for those of us who wish to learn about the design aspects of programming with PHP.

Moderator: General Moderators

Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Document Based System has Unacceptable Load Time

Post by Eran »

Well yes, the home server does have mysql, so it's certainly an option. I just can't CREATE the data as mysql data. It has to be imported.
My suggestion was that the server create the XML files from the database on demand (that is, when a client requests a file). This way the client always receives the most up-to-date version of the file, and the data itself stays in the database. You can also keep the generated files as a cache that is invalidated when the data changes.
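A rough sketch of the idea (the table and column names are made up, and buildXml() stands in for whatever serializer you already have):

```php
<?php
// Sketch: decide whether a cached XML file needs regenerating, given the
// cache file's mtime and the row's last-modified time (both unix timestamps).
// false for $cacheMtime means "no cached copy exists yet".
function cacheIsStale($cacheMtime, $rowUpdatedAt) {
    return $cacheMtime === false || $cacheMtime < $rowUpdatedAt;
}

// On a request for order $id (hypothetical schema: orders(id, data, updated_at)):
//   $row   = db_array("SELECT * FROM `orders` WHERE `id`='$id' LIMIT 1;");
//   $mtime = file_exists($cacheFile) ? filemtime($cacheFile) : false;
//   if (cacheIsStale($mtime, strtotime($row[0]['updated_at']))) {
//       file_put_contents($cacheFile, buildXml($row[0])); // your own serializer
//   }
//   readfile($cacheFile);
```

The point is that the XML is only ever rebuilt when the database row is newer than the cached file, so repeat requests cost one stat call.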
Skara
Forum Regular
Posts: 703
Joined: Sat Mar 12, 2005 7:13 pm
Location: US

Re: Document Based System has Unacceptable Load Time

Post by Skara »

Well... I have bad news. It doesn't really speed anything up. O_o
Surely I must be doing something wrong.

Not the most elegant code, but I threw this together quickly because I wanted to see what the speed difference would be.
Without boring you with too much code, what am I missing?

Code: Select all

function checkmd5($file) {
    preg_match('/(\d{10})\.order(?:\.void)?$/', $file, $matches);
    if (!isset($matches[1])) kill('78963');
    $time = $matches[1];

    // if no matching row exists, db_array() returns an empty result
    $old = db_array("SELECT `md5` FROM `files` WHERE `time`='$time' LIMIT 1;");
    if (!$old) { // empty result: file isn't in the database yet
        return -1;
    }

    $new = md5(file_get_contents($file)); /////// is this where it's slowing down??
    $old = $old[0]['md5'];

    // strict comparison: with ==, two hex hashes like "0e123..." can compare equal as numeric strings
    return ($new !== $old);
}
 
function getfiledata($file,$location) {
    $chk = checkmd5($file);
    if ($chk) {
        debug('Updating '.$file.'...');
        updatefile($file,$location,$chk);
    }
    
    preg_match('/(\d{10})\.order(?:\.void)?$/',$file,$matches);
    if (!isset($matches[1])) kill('195735');
    $time = $matches[1];
    
    $data = db_array("SELECT * FROM `files` WHERE `time`='$time' LIMIT 1;");
    
    if ($data) return $data[0]; //one line of data, return the first. ^_^
    return 0;
}
 
// action = 1 or -1, update or insert
function updatefile($file,$location,$action=1) {
    
    preg_match('/(\d{10})\.order(?:\.void)?$/',$file,$matches);
    if (!isset($matches[1])) kill('739146');
    $time = $matches[1];
    
    $parser = new XMLParser($file, 'file', 1);
    $tree = $parser->getTree();
    
    //-------(get all the data), code removed---------
    
    $md5 = md5(file_get_contents($file));
    
    if ($action == 1) {
        mysql_query("UPDATE `files` SET `loc_id`='$location',`status`='$status',`name`='$name',`paid`='$paid',`reorder`='$reorder',`web`='$online',`mail`='$mail',`nodir`='$nodir',`noimg`='$nophotos',`thumb`='$thumb',`md5`='$md5' WHERE `time`='$time' LIMIT 1;");
    }
    else {
        mysql_query("INSERT INTO `files` VALUES('','$location','$time','$status','$name','$paid','$reorder','$online','$mail','$nodir','$nophotos','$thumb','$md5');");
    }
    
    $uhoh = mysql_error();
    if ($uhoh) {
        if (DEBUG) debug($uhoh);
        return 0;
    }
    
    return 1;
}
[edit] Yeah, creating them on-demand isn't really an option, as the files are also used as actual files... as in, people can drag and drop them around.
pickle
Briney Mod
Posts: 6445
Joined: Mon Jan 19, 2004 6:11 pm
Location: 53.01N x 112.48W
Contact:

Re: Document Based System has Unacceptable Load Time

Post by pickle »

The MD5 might be the bottleneck. I'd rethink your logic so you only have to hash the file contents once (you're doing it twice now).
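For example, checkmd5() could hand the hash back when the file has changed, so updatefile() can reuse it instead of reading and hashing the file a second time. A minimal sketch (checkFile() and its signature are my invention, not your code):

```php
<?php
// Sketch: compute the hash once and return it, rather than letting the
// caller call md5(file_get_contents($file)) a second time.
function checkFile($file, $storedMd5) {
    $new = md5(file_get_contents($file));
    // Return the fresh hash when the file changed, false when it matches.
    return ($new !== $storedMd5) ? $new : false;
}
```

The caller then stores the returned hash directly in its UPDATE/INSERT instead of recomputing it.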
Also, you don't really need to use preg_match on the filename, as you know the first 10 digits are going to be numbers - just use substr().
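Something like this, assuming the basename always starts with the 10-digit timestamp (extractTime() is just a name I made up):

```php
<?php
// Sketch: pull the 10-digit timestamp off the front of the filename
// without invoking the regex engine. Assumes names like
// "1234567890.order" or "1234567890.order.void".
function extractTime($file) {
    $time = substr(basename($file), 0, 10);
    return ctype_digit($time) ? $time : false;
}
```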

Keep in mind though - you're analyzing the contents of 1000+ files - that's never going to be a quick process.

If I were doing this, I'd use an OOP approach - treat each file as an object. That would help to organize the code & clean up access of (meta)data - such as the hash.

If you want to do some really simple "where's the bottleneck" testing, look into microtime().
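e.g. a throwaway timer along these lines (sketch; timeIt() is not a built-in, just a helper):

```php
<?php
// Sketch: crude wall-clock timing around a suspect section.
// microtime(true) returns the current unix time as a float.
function timeIt(callable $fn) {
    $start = microtime(true);
    $result = $fn();
    $elapsed = microtime(true) - $start;
    return array($result, $elapsed); // elapsed is seconds as a float
}

// Usage:
// list($hash, $secs) = timeIt(function () use ($file) {
//     return md5(file_get_contents($file));
// });
// debug(sprintf('md5 took %.4f s', $secs));
```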
Real programmers don't comment their code. If it was hard to write, it should be hard to understand.
Skara
Forum Regular
Posts: 703
Joined: Sat Mar 12, 2005 7:13 pm
Location: US

Re: Document Based System has Unacceptable Load Time

Post by Skara »

Well, reworking the md5 saved only about .02 seconds. ^^; Reworking my bubblesort to be more efficient took off .5 seconds. :oops:

After a few more little changes I got it down to ~.22 seconds or so. That may still not be great with 1000 records, but I think it'll at least do for now.

Thanks, all!
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: Document Based System has Unacceptable Load Time

Post by josh »

Bubblesort??? You know the DBMS could have done that sort for you, probably in a fraction of the time it took you to write it.
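i.e. just add ORDER BY to the query, or if you really must sort in PHP, use usort() instead of a hand-rolled bubblesort (sketch; the column name is assumed from your earlier queries):

```php
<?php
// Sketch: let MySQL do the sorting --
//   $rows = db_array("SELECT * FROM `files` ORDER BY `time` ASC;");
//
// Fallback if you must sort in PHP: usort() with a comparison callback.
// The timestamps are fixed-width digit strings, so strcmp() orders them correctly.
function sortByTime(array $rows) {
    usort($rows, function ($a, $b) {
        return strcmp($a['time'], $b['time']);
    });
    return $rows;
}
```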