Catching System

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

shiznatix
DevNet Master
Posts: 2745
Joined: Tue Dec 28, 2004 5:57 pm
Location: Tallinn, Estonia
Contact:

Catching System

Post by shiznatix »

I have one section on my website that is quite intensive if the user has a lot of data (which is true for quite a few users). Basically, a lot of it is historical data, but what is viewed the most is the data that changes day to day.

What I am looking for is some sort of catching system (custom or open source) that will allow me to catch the pages, maybe with the relevant data in a CSV file or something that would make changing templates easy and the data quick and painless to retrieve. When a page takes 8 seconds to load, that is too much, but I just can't see an easier way to do it. I have tried to make my queries as streamlined as possible, but it's just not helping, since the table has about 150,000 rows and 23 fields, all of which have to be taken into consideration.

I want to do the calculations once, save them for easy future use, and have the calculations redone if some of the relevant data has changed. I don't know how to do that, so can someone give me some advice or a product name to help me get started on this?
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Do you mean caching?

It sounds like page-based caching using something like Squid is too coarse for you. I think you'll have to build a custom computational-reuse cache yourself. Is it possible/feasible for you to make a dependency list of the pieces of data each calculation was derived from?
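A minimal sketch of that kind of computational-reuse cache: a computed value is reused only while none of the data it was derived from has changed. All names here (`dependency_key`, `cached_compute`, the table timestamps) are illustrative, not from any library.

```php
<?php
// Computational-reuse cache keyed by its dependencies: the stored value
// is reused only while the source data it was derived from is unchanged.

function dependency_key(array $tableTimestamps)
{
    // one cache key per unique combination of dependency states
    return md5(implode('|', $tableTimestamps));
}

function cached_compute($cacheDir, array $deps, $compute)
{
    $file = $cacheDir . '/' . dependency_key($deps) . '.cache';
    if (file_exists($file)) {
        // dependencies unchanged since last run: reuse the stored result
        return unserialize(file_get_contents($file));
    }
    $value = call_user_func($compute); // something changed: recompute
    file_put_contents($file, serialize($value));
    return $value;
}

// usage: pass the last-modified times of the tables the calculation reads
$deps  = array('raketracking' => '2007-06-01 12:00:00', 'rooms' => '2007-05-30 09:00:00');
$total = cached_compute('/tmp', $deps, function () {
    return 12345; // placeholder for the expensive aggregation
});
```

The point of keying by the dependency timestamps is that a recompute happens automatically the moment any source table changes, with no explicit invalidation step.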
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Post by John Cartwright »

It's as simple as keeping a last-modified timestamp and checking it against the current time to see if the cache file needs to be recreated. Given the nature of what you described, I would even go as far as running a cron job every 5 minutes or so to update old cache files, instead of checking on every request whether the cache file is still valid; that way you can always assume the cache file is valid.
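A minimal sketch of that cron-driven variant, assuming a script run every few minutes that rebuilds every cache file unconditionally (`build_page()`, the page list, and the paths are placeholders):

```php
<?php
// regenerate_cache.php — run from cron so page requests can always
// assume the cache is fresh, e.g.:
//   */5 * * * * php /path/to/regenerate_cache.php

function build_page($name)
{
    // placeholder: the expensive queries/calculations would go here
    return "content for $name generated at " . date('c') . "\n";
}

$cacheDir = '/tmp/page-cache';
if (!is_dir($cacheDir)) {
    mkdir($cacheDir, 0777, true);
}

foreach (array('index', 'history', 'daily') as $page) {
    // build to a temp file, then rename, so a reader never sees a
    // half-written cache file
    $tmp = $cacheDir . '/' . $page . '.tmp';
    file_put_contents($tmp, build_page($page));
    rename($tmp, $cacheDir . '/' . $page . '.html');
}
```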
Jenk
DevNet Master
Posts: 3587
Joined: Mon Sep 19, 2005 6:24 am
Location: London

Post by Jenk »

Code: Select all

if (file_exists($cacheFileName) && (time() - 60 * 5) < filemtime($cacheFileName)) {
    // cache file exists and is younger than 5 mins, no need to regenerate.
    echo file_get_contents($cacheFileName);
} else {
    // regenerate page here..
    file_put_contents($cacheFileName, $content);
    echo $content;
}
Simple example :)
shiznatix
DevNet Master
Posts: 2745
Joined: Tue Dec 28, 2004 5:57 pm
Location: Tallinn, Estonia
Contact:

Post by shiznatix »

Yes, I meant caching system, whoops :oops:

Jenk: good idea, simple and all that, but not quite what I need. I need something that updates the user's cache when new data is added or existing data is changed.

So what I did was write a crazy caching system that goes like this:

Code: Select all

private function _update_cache($room_id, $year, $month, $data_date)
	{
		$usermap = $this->RAKETRACKING_USERMAP->get_list(array('fk_vb_user_id', 'room_username'), array('fk_room_id' => $room_id));
		
		foreach ($usermap as $val)
		{
			$data = get_raketracking_data($val->get('room_username'), $room_id, $data_date, $this->ROOMS->get('subtract_deductions_from_raketracking'), 1, $this->RAKETRACKING);
			
			if (false === $data['updated'])
			{
				$sql = '
					SELECT
						data_date, is_final
					FROM
						'.$this->RAKETRACKING->get('_table').'
					WHERE
						fk_room_id = "'.$room_id.'"
					AND
						data_date LIKE "'.$year.'-'.$month.'%"
					ORDER BY
						data_date
					DESC
					LIMIT 1
				';
				
				$query = $this->RAKETRACKING->_db->manual_query($sql);

				if ($this->RAKETRACKING->_db->manual_get_count($query))
				{
					$data_date_info = $this->RAKETRACKING->_db->manual_fetch_object($query);
					
					if ('0' == $data_date_info->is_final)
					{
						$date_pieces = explode('-', $data_date_info->data_date);
						$time_stamp = mktime(0, 0, 0, $date_pieces[1], $date_pieces[2], $date_pieces[0]);
						
						$data['updated'] = date('j M', $time_stamp);
					}
					else
					{
						$data['updated'] = '-1';
					}
				}
			}
			
			if (!is_dir(CACHE_PATH.$val->get('fk_vb_user_id')))
			{
				mkdir(CACHE_PATH.$val->get('fk_vb_user_id'));
				chmod(CACHE_PATH.$val->get('fk_vb_user_id'), 0777);
			}
			
			$file = CACHE_PATH.$val->get('fk_vb_user_id').'/'.$month.'-'.$year.'.csv';
			
			$contents = array();
			
			if (file_exists($file))
			{
				$contents = extract_data_from_cache($file);
				unset($contents[$this->ROOMS->get('id')]);
			}
			
			$contents[$this->ROOMS->get('id')] = array(
				'room_id' => $this->ROOMS->get('id'),
				'updated' => $data['updated'],
				'gross_rake' => $data['gross_rake'],
				'deductions' => $data['shown_deductions'],
				'net_rake' => $data['net_rake'],
				'rakeback_percent' => $data['rakeback_percent'],
				'rakeback' => $data['rakeback'],
			);
			
			$insert_data = '';
			
			foreach ($contents as $key => $room)
			{
				$insert_data .= "==============\n"
					.$room['room_id'].','.$room['updated'].','.$room['gross_rake'].','.$room['deductions'].','
					.$room['net_rake'].','.$room['rakeback_percent'].','.$room['rakeback']
					."\n==============";
			}
			
			file_put_contents($file, $insert_data);
		}
	}
but the problem is that get_raketracking_data() is the memory-intensive function that I am trying to get away from calling on every load. Because it is so intensive, when we upload data for our biggest room I am updating the cache for thousands of users, and this takes 7 minutes to complete. 7 minutes is way too long. I need some way to fork this (sigh, I wish I could do threading), but I don't have a good idea right now. How can I break this up to make it go faster, or at least split it into chunks that can each take care of part of the work separately?
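One way to split the rebuild, sketched here under the assumption of CLI PHP with the pcntl extension: chunk the user list and hand each chunk to a forked worker process. `update_cache_for_user()` and the user list are placeholders for the real per-user rebuild.

```php
<?php
// Split the per-user cache rebuild across worker processes with
// pcntl_fork() (CLI PHP only). Falls back to sequential processing
// where pcntl is unavailable.

function update_cache_for_user($userId)
{
    // placeholder for the expensive get_raketracking_data() + file write
}

$userIds = range(1, 100);   // all users whose cache needs rebuilding
$workers = 4;               // number of parallel worker processes
$chunks  = array_chunk($userIds, (int) ceil(count($userIds) / $workers));

if (function_exists('pcntl_fork')) {
    $pids = array();
    foreach ($chunks as $chunk) {
        $pid = pcntl_fork();
        if ($pid === 0) {
            // child: rebuild its chunk, then exit
            foreach ($chunk as $userId) {
                update_cache_for_user($userId);
            }
            exit(0);
        }
        $pids[] = $pid; // parent keeps forking the next chunk
    }
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status); // wait for every worker to finish
    }
} else {
    // no pcntl (e.g. running under a web SAPI): do the work sequentially
    foreach ($userIds as $userId) {
        update_cache_for_user($userId);
    }
}
```

An alternative that avoids forking entirely is to have a frequent cron entry process only the next batch of users per run, so no single invocation takes 7 minutes.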
Jenk
DevNet Master
Posts: 3587
Joined: Mon Sep 19, 2005 6:24 am
Location: London

Post by Jenk »

Sounds like a job for an observer: then, whenever the data is changed, you can notify the observer. :)
shiznatix
DevNet Master
Posts: 2745
Joined: Tue Dec 28, 2004 5:57 pm
Location: Tallinn, Estonia
Contact:

Post by shiznatix »

Jenk wrote:Sounds like a job for an observer: then, whenever the data is changed, you can notify the observer. :)
Could you elaborate some more?
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

Observer is a type of design pattern. Try PHPPatterns.com for an example, and if they don't have one, look it up on Wikipedia.
shiznatix
DevNet Master
Posts: 2745
Joined: Tue Dec 28, 2004 5:57 pm
Location: Tallinn, Estonia
Contact:

Post by shiznatix »

OK, I get the observer idea, but I don't see how that will help. It is still going to take just as much time to update the cache, which is essentially the problem.
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

The observer's job is to be notified when something happens and then to notify other objects that have requested to be notified. In other words, you track the cache somewhere (session, database, etc.) and have the observer tell other objects how to handle the cached data (or the lack of it).
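In PHP terms, the pattern might look like this minimal sketch (class and method names are illustrative, not from any particular library). The key point for this thread is that the observer only *flags* a user's cache as stale, so the expensive rebuild can be deferred or batched instead of running on every write:

```php
<?php
// Minimal observer sketch: the data store notifies registered observers
// on every change, and a cache observer reacts by marking the affected
// entry stale rather than rebuilding it immediately.

interface Observer
{
    public function notify($event, array $payload);
}

class RakeData
{
    private $observers = array();

    public function attach(Observer $o)
    {
        $this->observers[] = $o;
    }

    public function update($userId, array $row)
    {
        // ... persist $row for $userId here ...
        foreach ($this->observers as $o) {
            $o->notify('data_changed', array('user_id' => $userId));
        }
    }
}

class CacheInvalidator implements Observer
{
    public $stale = array();

    public function notify($event, array $payload)
    {
        if ($event === 'data_changed') {
            // just flag the user's cache; a cron job or the next page
            // view can do the actual (expensive) rebuild later
            $this->stale[$payload['user_id']] = true;
        }
    }
}

$data  = new RakeData();
$cache = new CacheInvalidator();
$data->attach($cache);

$data->update(42, array('gross_rake' => 10.5));
// $cache->stale now records that user 42 needs a rebuild
```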