Background script (crontab) causing heavy load

PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!

Moderator: General Moderators

Yaniv
Forum Newbie
Posts: 13
Joined: Tue Jan 09, 2007 6:28 am

Background script (crontab) causing heavy load

Post by Yaniv »

Hi everybody,

I have a situation here. I have a script that runs as a crontab every half an hour that is meant to do some background website operations. The script runs fine and the crontab works as it should.
The problem is that the system admin reported to me that the script is taking quite a heavy load on the server (this is a shared host).

I tried "nice-ing" the script by lowering its priority but that doesn't seem to work. I have reports that the script takes about 60% of the server's processor which is really bad. These background operations must be made.

Is there another PHP way of making the script run slower? (other than making sleep commands manually). I do not mind the script running slower because these are background tasks.


Thanks a lot!
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

What sort of operations is it carrying out? Typically a 60% hit over a very short period of time isn't a problem, but if your script does this over say a second, it can cause issues with your hosting. What I'm getting at is that your script may benefit greatly from optimizations that not only help shorten the time requirement for it to run but also lighten the server load.

Have you considered dedicated hosting? I often view the need for crontab as a signal that one should probably move to a hosting solution where running heavy operations isn't an issue.
Yaniv
Forum Newbie
Posts: 13
Joined: Tue Jan 09, 2007 6:28 am

Post by Yaniv »

It's basically background statistics calculations that can't be made by a user triggering a page load because some calculations are user independent.
The problem is that no matter how efficient the script may get the more data there is on the database the longer it will take for the script to run. This is a behavior I am actually expecting. I am expecting the script to eventually get to a few minutes of operation. That is why this is an issue.

The script is not time critical that is why I would like to simply have the solution of slowing it down.

Dedicated hosting is not an option yet due to current low budget. In the future this switch will definitely be made.
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

You probably should seek dedicated hosting now, rather than later. Sleep() and usleep() are pretty much your only other solution until you can provide us with more information.

There may be alternate or more optimized solutions to these statistical operations. If you could tell us more about these statistics we may be able to offer help in fixing up the code.
Yaniv
Forum Newbie
Posts: 13
Joined: Tue Jan 09, 2007 6:28 am

Post by Yaniv »

It's kind of similar in concept to Amazon's "Other users who bought this book also bought these books" kind of script but not exactly the same. The script is also dependent on a certain profile each user has. So if the user updates the profile (the profile being something I can't say here, sorry) all objects the user is associated with must be updated as well. The profile is not just texts it's another table associated to other data.

So I can't simply update the statistics whenever a user does an operation on an object because if that user later decides to change his profile all associated objects statistics must be updated. That kind of operation will take too long for doing it on a page load.

I'm really sorry for not being able to say too much about the insides of the project because I am obligated for secrecy by contract. I will try to explain as much as I can.

Thanks
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

So I'm guessing it's the JOIN that's taking so long? Have you considered caching the information? For example, when an order is placed a record is created for each line item and the order id in this cache table. When you are pulling up an item you can join from this table to the order line item table to find other item id's using a distinct and order by rand() with a limit. You can also do the same with a self join on the cache table.
Yaniv
Forum Newbie
Posts: 13
Joined: Tue Jan 09, 2007 6:28 am

Post by Yaniv »

I think my previous explanation wasn't too great. Here's a better shot at it:
The update algorithm generally works like so:

Three main tables:
Objects
ObjectsUsers
Users
UsersChoices
Choices

Code: Select all

objects = (Retrieve list of Objects from DB)
foreach(objects)
{
    users = (retrieve list of users who are linked to this object)
    foreach(users)
   {
      profile = (retrieve list of user choices for this user)
      (count each choice to a large list of choices statistics; each choice +1 if a user chose it)
   }
}

(Send the choices statistics back to the database)
This is just one of the statistics the system must carry. But it's the main one.
Yaniv
Forum Newbie
Posts: 13
Joined: Tue Jan 09, 2007 6:28 am

Post by Yaniv »

A buddy of mine gave me the idea of using sub-queries to get this done faster on the MySQL server than on PHP. I'm gonna try and figure out how to do this as SQL. My biggest concern is doing the profile counting but I assume it's possible.
ykarmi
Forum Commoner
Posts: 35
Joined: Mon Oct 30, 2006 4:45 pm

Yaniv!

Post by ykarmi »

Ma kore gever gever?
Me eigo ata?
Yaniv
Forum Newbie
Posts: 13
Joined: Tue Jan 09, 2007 6:28 am

Post by Yaniv »

I'm having some trouble working out the SQL for this.
The thing is I have to calculate the number of appearances of each choice for each object.
As in get each user's "UserChoices" and calculate that with "ObjectsUsers". ObjectsUsers also contains a numeric value that says how the user rated this object (+# or -#)

This is how the tables connect to each other.

Objects <-> ObjectsUsers <-> Users <-> UsersChoices <-> Choices

And I want to do all that in pure SQL if possible.

Can anyone help? Or at least turn me to some information about such complex queries? Thanks!
Grim...
DevNet Resident
Posts: 1445
Joined: Tue May 18, 2004 5:32 am
Location: London, UK

Post by Grim... »

I'm ready to help you out but I'm still not sure what you want.

How about mocking up some fake SQL tables for us to work with, but changing the text and any give-away field names.
Grim...
DevNet Resident
Posts: 1445
Joined: Tue May 18, 2004 5:32 am
Location: London, UK

Post by Grim... »

If I understand you correctly, you might need something like this:

Code: Select all

SELECT SUM(IF(choices.id!='',1,0)) AS choice_count FROM objects
LEFT JOIN objectsusers ON objectsusers.object_id = objects.id
LEFT JOIN users ON users.objectsuser_id = objectsusers.id
LEFT JOIN userschoices ON userschoices.user_id = users.id
LEFT JOIN choices ON choices.userschoice_id = choices.id
GROUP BY objects.id
Yaniv
Forum Newbie
Posts: 13
Joined: Tue Jan 09, 2007 6:28 am

Post by Yaniv »

This is generally how the tables are laid out:

Code: Select all

CREATE TABLE `object`
(
	`id` INTEGER  NOT NULL AUTO_INCREMENT COMMENT '',

	PRIMARY KEY (`id`),
)Type=MyISAM;

CREATE TABLE `object_user`
(
	`user_id` INTEGER  NOT NULL COMMENT '',
	`object_id` INTEGER  NOT NULL COMMENT '',
	`rating` INTEGER  NOT NULL COMMENT 'Visit rating',

)Type=MyISAM;

CREATE TABLE `user`
(
	`id` INTEGER  NOT NULL AUTO_INCREMENT COMMENT '',

	PRIMARY KEY (`id`),
)Type=MyISAM;

CREATE TABLE `user_choice`
(
	`user_id` INTEGER  NOT NULL COMMENT '',
	`choice_id` INTEGER  NOT NULL,

	PRIMARY KEY (`user_id`,`choice_id`),
	CONSTRAINT `user_choice_FK_1`
		FOREIGN KEY (`user_id`)
		REFERENCES `user` (`id`),
	INDEX `user_choice_FI_2` (`choice_id`),
	CONSTRAINT `user_choice_FK_2`
		FOREIGN KEY (`choice_id`)
		REFERENCES `choice` (`id`)
)Type=MyISAM;

CREATE TABLE `choice`
(
	`id` INTEGER  NOT NULL AUTO_INCREMENT COMMENT '',
	PRIMARY KEY (`id`),
)Type=MyISAM;
Now I want to SUM the number of of choices each object has according to the `object_user` table, while considering the `rating` field in `object_user`.
Meaning that if I have this data:

Code: Select all

`object`
| id      |
-----------
| 1       |

`object_user`
| user_id | object_id | rating |
--------------------------------
| 1       | 1         | 1      |
| 1       | 1         | 1      |
| 1       | 1         | -1     |
| 1       | 1         | 1      |
| 2       | 1         | 1      |
| 2       | 1         | 1      |
| 2       | 1         | 1      |

`user`
| id      |
-----------
| 1       |
| 2       |

`user_choice`
| user_id | choice_id |
-----------------------
| 1       | 1         |
| 1       | 2         |
| 2       | 2         |
| 2       | 3         |

`choice`
| id      |
-----------
| 1       |
| 2       |
| 3       |
The result for object 1 should be

Code: Select all

| choice_id | result   |
------------------------
| 1         | 2        | (because only user 1 chose choice 1)
| 2         | 5        | (because both users chose choice 2)
| 3         | 3        | (because only user 2 chose choice 3)
I hope this helps.

(another thing)

It will be best to get the result for all tables together. Meaning something like this:

Code: Select all

|bubble_id | choice_id | result   |
------------------------
| 1         | 1         | 2        | 
| 1         | 2         | 5        |
| 1         | 3         | 3        | 
| 2         | X         | Y        |
It will increase the code speed by alot. But I can't figure out how to do this.
Grim...
DevNet Resident
Posts: 1445
Joined: Tue May 18, 2004 5:32 am
Location: London, UK

Post by Grim... »

So it gets one 'point' for each user that chose it, plus whatever ratings it has?
programmingjeff
Forum Commoner
Posts: 26
Joined: Fri Jan 05, 2007 10:56 am

Post by programmingjeff »

Have you considered using Stored Procedures? They are fairly new in MySQL...
Post Reply