All I'm doing is throwing out random ideas on what could be going wrong, and you dispute them with your own knowledge.
Let's get these facts straight:
1. This site is usually low-traffic; crawlers constitute its heavy usage
2. During heavy usage periods, the site slows down, but not to the point of unusability
3. There is a performance problem
So, we know there is a performance problem, but we don't know what it is. I've thrown out some suggestions, and actually, I should be scolded for that. Only *you* know where the issues are coming from.
But you don't. That's why I suggest you profile. I'm not going to offer any more possible slowdowns: this random stop 'n swop approach is probably incredibly wasteful, and I'm glad you didn't do that. I'm not sure you know exactly what profiling does, so let me paste a sample profile output from one of my sites, generated by APD and processed by pprofp. Note that I did these profiles for fun, so there were no real slowdowns on the site yet.
(you may need to widen your screen)
Code:
Trace for C:\Documents and Settings\Edward\My Documents\My Webs\wikistatus\index.php
Total Elapsed Time = 0.17
Total System Time = 0.03
Total User Time = 0.11
         Real        User       System            secs/    cumm
%Time (excl/cumm) (excl/cumm) (excl/cumm)  Calls   call   s/call  Name
--------------------------------------------------------------------------------------
 42.9  0.06 0.06   0.05 0.05   0.00 0.00     11  0.0043   0.0043  defined
 14.3  0.01 0.01   0.02 0.02   0.00 0.00     19  0.0008   0.0008  define
 14.3  0.01 0.01   0.02 0.02   0.00 0.00      5  0.0031   0.0031  Mapper_Service->_getObject
 14.3  0.04 0.04   0.02 0.02   0.00 0.00      1  0.0156   0.0156  mysql_connect
 14.3  0.00 0.00   0.02 0.02   0.00 0.00     46  0.0003   0.0003  is_int
  0.0  0.00 0.00   0.00 0.00   0.00 0.00      1  0.0000   0.0000  Plugin->getWebPath
  0.0  0.00 0.00   0.00 0.00   0.00 0.00      4  0.0000   0.0000  is_array
  0.0  0.00 0.01   0.00 0.02   0.00 0.00      1  0.0000   0.0156  Mapper_Service->findAll
  0.0  0.00 0.00   0.00 0.00   0.00 0.00     10  0.0000   0.0000  mysql_fetch_array
  0.0  0.00 0.01   0.00 0.02   0.00 0.00      1  0.0000   0.0156  Mapper_Service->_loadAll
  0.0  0.00 0.00   0.00 0.00   0.00 0.00      8  0.0000   0.0000  ADORecordSet_mysql->MoveNext
  0.0  0.00 0.00   0.00 0.00   0.00 0.00      5  0.0000   0.0000  Service->Service
  0.0  0.00 0.00   0.00 0.00   0.00 0.00      5  0.0000   0.0000  Mapper_Service->_doLoad
  0.0  0.00 0.01   0.00 0.02   0.00 0.00      5  0.0000   0.0031  Mapper_Service->_loadFromRow
  0.0  0.00 0.00   0.00 0.00   0.00 0.00      2  0.0000   0.0000  ADORecordSet_mysql->_fetch
  0.0  0.00 0.00   0.00 0.00   0.00 0.00      2  0.0000   0.0000  ADORecordSet_mysql->GetArray
  0.0  0.00 0.00   0.00 0.00   0.00 0.00      2  0.0000   0.0000  mysql_num_fields
  0.0  0.01 0.01   0.00 0.00   0.00 0.00      2  0.0000   0.0000  mysql_query
  0.0  0.00 0.01   0.00 0.00   0.00 0.00      2  0.0000   0.0000  ADODB_mysql->_query
  0.0  0.00 0.01   0.00 0.00   0.00 0.00      2  0.0000   0.0000  ADODB_mysql->_Execute
So it jumps out at me: defined() is taking up 42 percent of my execution time, and I don't even use it anywhere in my application! Further investigation (you can generate a call tree from the raw data too) reveals that these defined() calls come from ADOdb, where they ensure compatibility. If I wished to, I could remove them and hard-code my customizations in the ADOdb file. However, since there is no major speed problem (the 0.1 seconds should be taken in context: the profile was taken on my 512MB, heavily multitasking personal Windows machine), I chose not to. mysql_connect() also took a bit of time: calls to external resources always take a sizable chunk. Finally, the 46 calls to is_int() are a bit dubious, and I may want to look into those too.
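For context, the kind of guard ADOdb scatters through its files looks roughly like this (paraphrased from memory, not the literal ADOdb source), which is why a pile of defined() calls shows up in the profile:

```php
<?php
// Paraphrased sketch of the ADOdb guard pattern: each constant is only
// defined once, so the library can safely be included from several
// entry points. Every include pays for one defined() call per guard.
if (!defined('ADODB_DIR')) {
    define('ADODB_DIR', dirname(__FILE__));
}
```

Cheap individually, but at 11 calls it still topped my (otherwise idle) profile, which tells you how little real work this page does.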
By profiling, I find out a great deal about my application, and this effect is even magnified on sites where there are real performance problems (which manifest themselves when the site is under a heavier load).
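If you want to try it yourself, the setup is tiny, assuming the APD extension is installed and apd.dumpdir is set in php.ini (the dump filename below is just an example):

```php
<?php
// Put this at the very top of the page you want to profile.
// APD then writes a pprof.NNNNN dump file into apd.dumpdir.
apd_set_pprof_trace();

// ...the rest of index.php runs normally.
//
// Afterwards, process the dump from the shell:
//   pprofp -R /tmp/pprof.12345   # sort by cumulative real time, like my listing
//   pprofp -t /tmp/pprof.12345   # the call tree I mentioned
```

One run under realistic load usually points a finger far faster than guessing does.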
Now, to answer a bit of your implementation questions:
Maybe I worded it wrong. When I said I use sessions to detect whether they're logged in, I meant that I check for a specific session variable's existence, which would indicate they are logged in. Same thing as what you said: googlebots won't have user session variables. That was my point; the session check itself shouldn't be causing a bog-down.
If you always call session_start(), though, PHP still creates a session for every request, even if it's a googlebot.
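One easy fix is to only resume a session when the client actually sent a session cookie; a googlebot never does, so no session gets created for it. A minimal sketch (the 'user_id' variable name is made up, substitute your own):

```php
<?php
// Only call session_start() when the browser presented a session
// cookie. Crawlers send no cookie, so PHP never creates session
// storage for them.
function maybe_session_start()
{
    if (isset($_COOKIE[session_name()])) {
        session_start();
    }
}

maybe_session_start();

// The login check works the same as before: a visitor with no
// session simply has no 'user_id' (hypothetical name) set.
$loggedIn = isset($_SESSION['user_id']);
```

You'd still call session_start() unconditionally on the login page itself, so the cookie gets issued in the first place.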
probably spend more than $79 a month to host their site
Hmm... hosting for my sites costs $5 to $9 a month. Do you mean per year? Unless this is a colo...
I have links that perform searches. So googlebots are actually executing searches, and they constantly check the same pages over and over again.
Is the search a data-mining query that takes the longest amount of time? You may want to consider:
1. Hiding all searches behind forms, to prevent googlebots from executing them (most sites are like that)
2. Caching search results, at least for a few minutes or so (this helps pagination too)
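Point 2 can be as simple as a file cache keyed on the query; a rough sketch, with run_search_query() standing in for whatever your existing search code is:

```php
<?php
// Cache a search's result set for a few minutes so repeated crawler
// hits and pagination don't re-run the expensive query.
// run_search_query() is a placeholder for your real search function.
function cached_search($query, $ttl = 300)
{
    $file = '/tmp/search_' . md5($query) . '.cache';

    // Serve from cache while the file is younger than $ttl seconds.
    if (file_exists($file) && (time() - filemtime($file)) < $ttl) {
        return unserialize(file_get_contents($file));
    }

    // Cache miss: run the real query and store the result.
    $results = run_search_query($query);
    file_put_contents($file, serialize($results));
    return $results;
}
```

Even a short TTL takes the repeated-crawl pattern you describe from N queries down to roughly one per cache window.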