massive loop troubles

s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

massive loop troubles

Post by s.dot »

I'm transferring databases, and rather than just exporting and importing via MySQL, I need to write scripts that manipulate the data and then insert it row by row in a while loop.

I'm having trouble with the posts that contain [ img ] bbcode tags. When they are parsed, I automatically resize the image using the results from getimagesize() so it doesn't stretch my forum's layout. This is causing me a couple of problems.
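
For context, the resize step could look roughly like the sketch below. The helper name fit_width() and the 500-pixel limit are made up for illustration; the only part taken from PHP itself is that getimagesize() returns the width at index 0 and the height at index 1.

```php
<?php
// Hypothetical helper: scale (width, height) down to fit $maxWidth,
// preserving the aspect ratio. Images that already fit are left alone.
function fit_width(int $width, int $height, int $maxWidth): array
{
    if ($width <= 0 || $width <= $maxWidth) {
        return [$width, $height];
    }
    $scale = $maxWidth / $width;
    return [$maxWidth, max(1, (int) round($height * $scale))];
}

// In the parser, $size would come from getimagesize($url).
// A fixed value is used here so the example runs without network access.
$size = [1600, 1200];
[$w, $h] = fit_width($size[0], $size[1], 500);
echo "width=$w height=$h\n";
```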

#1, this is *really* slow. I have ~325,000 forum entries. I wouldn't mind it being slow except for problem #2.

#2, it seems that the loop will get "stuck". I presume it's trying to grab the size of an image that can't be found, or that errored, or something. It seems it will stick on this entry trying to get the size, then, for some odd reason, the loop will start over!

I know it's trying to start over because I get a duplicate key error on the first ID in my table.

Does anyone have any recommendations or suggestions on what I should do?
How long does getimagesize() wait for a response from a remote server? What happens if it can't find the image? And why is my loop starting over?

Roughly the first 100k entries are plain text, from before I added the bbcode functionality, and those all import fine and extremely quickly. So the problem occurs when I first start parsing the entries with the [ img ] tags.

:P 8O
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
aerodromoi
Forum Contributor
Posts: 230
Joined: Sun May 07, 2006 5:21 am

Re: massive loop troubles

Post by aerodromoi »

scottayy wrote: Does anyone have any recommendations or suggestions on what I should do?
How long does getimagesize() wait for a response from a remote server? What happens if it can't find the image? And why is my loop starting over?
If it can't find the image, getimagesize() returns false and raises an E_WARNING. So you might want to allow for that.
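
As for how long it waits: for remote URLs, getimagesize() goes through PHP's stream wrappers, so the default_socket_timeout ini setting applies (60 seconds by default). A minimal sketch of capping the wait and allowing for the false return might look like this. The wrapper name image_size_or_null() and the 5-second value are assumptions, not something from your script.

```php
<?php
// Cap how long URL-based stream calls may wait (the default is 60 seconds).
ini_set('default_socket_timeout', '5');

// Hypothetical wrapper: returns [width, height], or null when the image
// is missing, unreachable, or not a recognisable image.
function image_size_or_null(string $url): ?array
{
    // @ silences the warning/notice; failure is signalled by false.
    $info = @getimagesize($url);
    if ($info === false) {
        return null;
    }
    return [$info[0], $info[1]];
}
```

The caller can then fall back to an unsized image tag, or skip the image, whenever null comes back.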

aerodromoi

ps: Might not be too bad an idea to create a separate table containing all urls (and the id of the respective posts) which couldn't be found.
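
A rough sketch of that idea, assuming the tags are stored as plain [img]...[/img] in the post body. The function name find_broken_images() and the table name mentioned in the comments are hypothetical.

```php
<?php
// Hypothetical helper: find [img] URLs in a post body and return those
// for which $probe($url) reports failure (a falsy return value).
function find_broken_images(int $postId, string $body, callable $probe): array
{
    $broken = [];
    preg_match_all('#\[img\](.+?)\[/img\]#i', $body, $matches);
    foreach ($matches[1] as $url) {
        $url = trim($url);
        if (!$probe($url)) {
            $broken[] = ['post_id' => $postId, 'url' => $url];
        }
    }
    return $broken;
}

// Usage inside the import loop; the probe here is getimagesize() itself.
$probe = function (string $url) {
    return @getimagesize($url);
};
$rows = find_broken_images(42, 'Look: [img]/no/such/picture.jpg[/img]', $probe);
// Each row could then be inserted into a separate table, for example
// broken_images(post_id, url), and reviewed after the import finishes.
```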
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

That's fine if it returns false, as it will just show up as a red x in the forums, which is fine by me.

I saw this post in the php.net user comments:
For those that like to go the dynamic thumbnail route, I've found that you can get warnings with getimagesize() after your loop through more than 3 to 4 images. In my case I needed 12 images on each page.

Use usleep() in your loop just before you run getimagesize() otherwise you'll end up with warnings, big images and a broken page. Using usleep() lets the server recoup for X milliseconds so it will accept connections again for the image size.

I've found that usleep(1500) is the best for my situation. This barely slows the page down and allows for getimagesize() to work 100% of the time for me.
I doubt it will make a difference, but I will try it. :)
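
Roughly what I'm going to try, as a sketch only (the function name probe_with_pause() is just for illustration). Note that usleep() takes microseconds, so usleep(1500) is a pause of 1.5 ms per image.

```php
<?php
// Hypothetical helper: probe each URL in turn, pausing before each request.
// usleep() takes microseconds, so the default of 1500 is a 1.5 ms pause.
function probe_with_pause(array $urls, int $pauseMicroseconds = 1500): array
{
    $sizes = [];
    foreach ($urls as $url) {
        usleep($pauseMicroseconds);
        $info = @getimagesize($url);
        // null marks images that could not be read.
        $sizes[$url] = ($info === false) ? null : [$info[0], $info[1]];
    }
    return $sizes;
}
```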
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

Wow, I'm ~150,000 records through (about halfway!) and this appears to be working. I think this tip should be better documented.