
Catching System

Posted: Thu Nov 22, 2007 6:51 am
by shiznatix
I have one section on my website that is quite intensive if the user has a lot of data (which is true of quite a few users). Much of it is historical data, but what is viewed the most is the data that changes from day to day.

What I am looking for is some sort of catching system (custom or open source) that will allow me to catch the pages, maybe with the relevant data in a CSV file or something similar, so that changing templates stays easy and the data is quick and painless to retrieve. 8 seconds to load a page is too much, but I just can't see an easier way to do it. I have tried to make my queries as streamlined as possible, but it's just not helping, since the table has about 150,000 rows and 23 fields, all of which have to be taken into consideration.

I want to do the calculations once, save them for easy future use, and have them redone only when some of the relevant data has changed. I don't know how to do that, so can someone give me some advice or a product name to help me get started?

Posted: Thu Nov 22, 2007 6:58 am
by Ambush Commander
Do you mean caching?

It sounds like page-based caching using things like Squid is too coarse for you. I think you'll have to build a custom, computational reuse cache yourself. Is it possible/feasible for you to make a dependency list of pieces of data a calculation was derived from?
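
A rough sketch of what such a dependency list could look like, assuming each calculation can name the inputs it reads (for example the last-modified stamps of the relevant rows). The function names and the cache file layout are made up for illustration; nothing here comes from the original code.

```php
<?php
// Hypothetical sketch: a computational-reuse cache keyed by its dependencies.
// $deps identifies the inputs a calculation was derived from.

function dependency_fingerprint(array $deps): string
{
    ksort($deps);                       // key order must not change the result
    return md5(serialize($deps));
}

function cached_compute(string $cacheFile, array $deps, callable $compute)
{
    $fingerprint = dependency_fingerprint($deps);

    if (is_file($cacheFile)) {
        $entry = @unserialize(file_get_contents($cacheFile));
        if (is_array($entry)
            && isset($entry['fingerprint'])
            && $entry['fingerprint'] === $fingerprint
        ) {
            return $entry['value'];     // inputs unchanged: reuse the result
        }
    }

    $value = $compute();                // inputs changed, or nothing cached yet
    file_put_contents(
        $cacheFile,
        serialize(array('fingerprint' => $fingerprint, 'value' => $value)),
        LOCK_EX
    );

    return $value;
}
```

The expensive calculation only runs when the fingerprint of its inputs differs from the one stored with the cached value, so unchanged historical data is never recalculated.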

Posted: Thu Nov 22, 2007 7:23 am
by John Cartwright
It's as simple as storing a last-modified timestamp and comparing it against the current time to see whether the cache file needs to be recreated. Because of the nature of what you described, I would even go as far as to run a cron job every 5 minutes or so to update old cache files instead of checking on each request whether the cache file is still valid; that way you can always assume the cache file is valid.
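
A minimal sketch of the cron approach, assuming one `.cache` file per page in a single directory. `refresh_stale_cache()` and the `$regenerate` callback are hypothetical names; the callback stands in for whatever expensive calculation builds the page.

```php
<?php
// Hypothetical cron job body: rebuild every cache file older than $maxAge
// seconds, so page requests can read the file without checking its age.

function refresh_stale_cache(string $dir, int $maxAge, callable $regenerate): int
{
    $refreshed = 0;
    $files = glob(rtrim($dir, '/') . '/*.cache');

    foreach ($files ?: array() as $file) {
        if (time() - filemtime($file) < $maxAge) {
            continue;                   // still fresh, leave it alone
        }

        // Write to a temporary file and rename it into place, so a reader
        // never sees a half-written cache file.
        $tmp = $file . '.tmp';
        file_put_contents($tmp, $regenerate(basename($file, '.cache')));
        rename($tmp, $file);
        $refreshed++;
    }

    return $refreshed;                  // number of files rebuilt
}
```

Assuming the function lives in a script such as `refresh_cache.php` (a made-up name), a crontab entry like `*/5 * * * * php /path/to/refresh_cache.php` would run it every five minutes.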

Posted: Thu Nov 22, 2007 9:19 am
by Jenk

Code:

if (file_exists($cacheFileName) && (time() - 60 * 5) < filemtime($cacheFileName)) {
    // cache file exists and is younger than 5 mins, no need to regenerate.
    echo file_get_contents($cacheFileName);
} else {
    // regenerate the page here and store the output in $content
    file_put_contents($cacheFileName, $content);
    echo $content;
}
Simple example :)

Posted: Fri Nov 23, 2007 7:24 am
by shiznatix
Yes, I meant caching system, whoops :oops:

Jenk: good idea, simple and all that, but not quite what I need. I need something that updates the user's cache whenever new data is added or existing data is changed.

So what I did was write a crazy caching system that goes like this:

Code:

private function _update_cache($room_id, $year, $month, $data_date)
	{
		$usermap = $this->RAKETRACKING_USERMAP->get_list(array('fk_vb_user_id', 'room_username'), array('fk_room_id' => $room_id));
		
		foreach ($usermap as $val)
		{
			$data = get_raketracking_data($val->get('room_username'), $room_id, $data_date, $this->ROOMS->get('subtract_deductions_from_raketracking'), 1, $this->RAKETRACKING);
			
			if (false === $data['updated'])
			{
				$sql = '
					SELECT
						data_date, is_final
					FROM
						'.$this->RAKETRACKING->get('_table').'
					WHERE
						fk_room_id = "'.$room_id.'"
					AND
						data_date LIKE "'.$year.'-'.$month.'%"
					ORDER BY
						data_date
					DESC
					LIMIT 1
				';
				
				$query = $this->RAKETRACKING->_db->manual_query($sql);

				if ($this->RAKETRACKING->_db->manual_get_count($query))
				{
					$data_date_info = $this->RAKETRACKING->_db->manual_fetch_object($query);
					
					if ('0' == $data_date_info->is_final)
					{
						$date_pieces = explode('-', $data_date_info->data_date);
						$time_stamp = mktime(0, 0, 0, $date_pieces[1], $date_pieces[2], $date_pieces[0]);
						
						$data['updated'] = date('j M', $time_stamp);
					}
					else
					{
						$data['updated'] = '-1';
					}
				}
			}
			
			if (!is_dir(CACHE_PATH.$val->get('fk_vb_user_id')))
			{
				mkdir(CACHE_PATH.$val->get('fk_vb_user_id'));
				chmod(CACHE_PATH.$val->get('fk_vb_user_id'), 0777);
			}
			
			$file = CACHE_PATH.$val->get('fk_vb_user_id').'/'.$month.'-'.$year.'.csv';
			
			$contents = array();
			
			if (file_exists($file))
			{
				$contents = extract_data_from_cache($file);
				unset($contents[$this->ROOMS->get('id')]);
			}
			
			$contents[$this->ROOMS->get('id')] = array(
				'room_id' => $this->ROOMS->get('id'),
				'updated' => $data['updated'],
				'gross_rake' => $data['gross_rake'],
				'deductions' => $data['shown_deductions'],
				'net_rake' => $data['net_rake'],
				'rakeback_percent' => $data['rakeback_percent'],
				'rakeback' => $data['rakeback'],
			);
			
			$insert_data = '';
			
			foreach ($contents as $key => $room)
			{
$insert_data .= '==============
'.$room['room_id'].','.$room['updated'].','.$room['gross_rake'].','.$room['deductions'].',' .$room['net_rake'].','.$room['rakeback_percent'].','.$room['rakeback'].'
==============';
			}
			
			file_put_contents($file, $insert_data);
		}
	}
The problem is that get_raketracking_data() is the memory-intensive function that I am trying to avoid calling on every page load. Because it is so intensive, when we upload data for our biggest room I am updating the cache for thousands of users, and this takes 7 minutes to complete. 7 minutes is way too long. I need some way to fork this (sigh, I wish I could do threading), but I don't have a good idea right now. How can I make this go faster, or at least break it up into chunks that can each be handled separately?

Posted: Fri Nov 23, 2007 7:49 am
by Jenk
Sounds like a job for an observer; then whenever the data is changed, you can notify the observer. :)

Posted: Fri Nov 23, 2007 8:24 am
by shiznatix
Jenk wrote: Sounds like a job for an observer; then whenever the data is changed, you can notify the observer. :)
Could you elaborate some more?

Posted: Fri Nov 23, 2007 9:24 am
by superdezign
Observer is a type of design pattern. Try PHPPatterns.com for an example, and if they don't have one, look it up on Wikipedia.

Posted: Fri Nov 23, 2007 10:26 am
by shiznatix
OK, I get the observer pattern, but I don't see how it will help. It is still going to take just as much time to update the cache, which is essentially the problem.

Posted: Fri Nov 23, 2007 10:41 am
by superdezign
The observer's job is to be notified when something happens and then to notify other objects that have asked to be notified. In other words, you track the state of the cache somewhere (session, database, etc.) and have the observer tell other objects how to handle the cached data (or the lack of it).
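
One possible reading of this, sketched with PHP's built-in SplSubject and SplObserver interfaces. The class names, the `markChanged()` method, and the one-file-per-user cache layout are all assumptions made for illustration, not part of the code posted above. In this variant the observer only deletes the affected cache files, so each one would be rebuilt the next time that user views the page rather than all at once during the upload.

```php
<?php
// Hypothetical sketch of the observer idea. Class names and the cache
// layout (one "<user id>.cache" file per user) are assumptions.

class UploadedData implements SplSubject
{
    private $observers = array();
    public $changedUserIds = array();

    public function attach(SplObserver $observer): void
    {
        $this->observers[spl_object_hash($observer)] = $observer;
    }

    public function detach(SplObserver $observer): void
    {
        unset($this->observers[spl_object_hash($observer)]);
    }

    public function notify(): void
    {
        foreach ($this->observers as $observer) {
            $observer->update($this);
        }
    }

    // Call this after new data has been saved for the given users.
    public function markChanged(array $userIds): void
    {
        $this->changedUserIds = $userIds;
        $this->notify();
    }
}

class CacheInvalidator implements SplObserver
{
    private $cacheDir;

    public function __construct(string $cacheDir)
    {
        $this->cacheDir = rtrim($cacheDir, '/');
    }

    // Delete only the affected users' cache files.
    public function update(SplSubject $subject): void
    {
        foreach ($subject->changedUserIds as $userId) {
            $file = $this->cacheDir . '/' . (int) $userId . '.cache';
            if (is_file($file)) {
                unlink($file);
            }
        }
    }
}
```

This does not make the calculation itself any faster; it only spreads the cost over later page views, which may or may not be acceptable for the busiest users.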