PHP Developers Network

A community of PHP developers offering assistance, advice, discussion, and friendship.
 

All times are UTC - 5 hours




PostPosted: Sun Sep 28, 2014 9:06 am 
Forum Contributor

Joined: Fri Dec 26, 2008 10:43 pm
Posts: 159
I'm currently testing an application that I wrote which performs a long routine that extracts e-mails via an IMAP library. (It's an archival app.) What I'm running into with my tests is that after about 2 hours and 15 minutes, I receive disconnect errors in Firefox:

Quote:
The connection was reset
The connection to the server was reset while the page was loading.


I'm using WAMP on my Windows 7 machine, so I'm not sure if this error is coming from my server or from the server that my script is executing the POST against. There are no revealing errors in the Apache error logs, the PHP error logs, or even the Windows system logs, but the one consistent thing I am noticing is that the process ends almost exactly after 2 hours and 15 minutes.

Before I do anything in the script, I use PHP to raise both the memory_limit and max_execution_time ini values so that no resource problems arise, and I also set error_reporting to -1 (to ensure that I see *all* errors). I've not seen a single runtime error, and the memory that PHP needs is always around 95x.xx KB before bombing out. (So I'm not sure whether this is an issue with the resources I've allocated to my WAMP server, or a security facility on the server my script is extracting e-mails from that terminates the connection, given the consistent duration of 2 hours and 15 or so minutes of processing.)
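
For reference, the setup described above could look roughly like the sketch below. The '512M' value is an assumed example, not what the script actually uses. Note that these calls only lift PHP's own limits; the web server and anything between it and the browser keep their own timeouts.

```php
<?php
// Sketch of the setup described above. The '512M' value is an assumed example.
error_reporting(-1);                 // report every error level
ini_set('display_errors', '1');      // show errors in the response while testing
ini_set('log_errors', '1');          // also send them to the PHP error log
ini_set('memory_limit', '512M');     // assumed value; size it to the workload
set_time_limit(0);                   // 0 removes PHP's own execution time limit

// Log peak memory and the last recorded error when the script ends.
register_shutdown_function(function (): void {
    $last = error_get_last();
    error_log(sprintf(
        'shutdown: peak memory %.2f KB, last error: %s',
        memory_get_peak_usage(true) / 1024,
        $last === null ? 'none' : $last['message']
    ));
});
```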

I'm curious: is there any way to trap these "Connection was reset" errors? I'm using the php-imap library found here: https://github.com/barbushin/php-imap. So far, it's done exactly what I've needed but I'm not sure if this is causing any issues or not.
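
One way to at least record how the request ended: the browser's "connection was reset" page is produced on the client side, so the script cannot catch it directly, but PHP can keep running after the client goes away and log the connection state at shutdown. A minimal sketch, where describeConnectionStatus() is a hypothetical helper name introduced here for illustration:

```php
<?php
// Hypothetical helper: turn connection_status() bit flags into readable text.
function describeConnectionStatus(int $status): string
{
    if ($status === CONNECTION_NORMAL) {
        return 'normal';
    }
    $parts = [];
    if ($status & CONNECTION_ABORTED) {
        $parts[] = 'aborted';
    }
    if ($status & CONNECTION_TIMEOUT) {
        $parts[] = 'timeout';
    }
    return implode('+', $parts);
}

ignore_user_abort(true); // keep running even if the browser disconnects

// At shutdown, write the connection state and last error to the PHP error log.
register_shutdown_function(function (): void {
    $last = error_get_last();
    error_log(sprintf(
        '[%s] connection: %s, last error: %s',
        date('c'),
        describeConnectionStatus(connection_status()),
        $last === null ? 'none' : $last['message']
    ));
});
```

If the log shows "aborted" at the 2 hour 15 minute mark with no PHP error, that would point away from the script itself and toward something closing the HTTP connection.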

If I could just figure out how to trap this outcome, I could add some logic that could reconnect to the e-mail server and continue extracting e-mails from where it left off... Otherwise, I'll have to do something that allows the user to simply restart the process, first checking for the last e-mail ID that was extracted and then continuing to extract from there (which I don't want to have to do--I'd like to make it work continuously without needing manual intervention, etc.)
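
The "continue from where it left off" idea can be sketched independently of any particular IMAP library. Everything named here is hypothetical: archiveWithResume(), the $fetchIds and $archive callables (stand-ins for the real extraction calls), and the checkpoint file. It also assumes that the extraction call throws when the connection drops and that message IDs increase over time, as IMAP UIDs do within one mailbox.

```php
<?php
// Hypothetical sketch: $fetchIds and $archive stand in for the real IMAP calls.
function archiveWithResume(
    callable $fetchIds,        // fn(): int[]  all message IDs in the mailbox
    callable $archive,         // fn(int $id): void  assumed to throw on failure
    string $checkpointFile,    // stores the last ID that was archived
    int $maxRetries = 3,
    int $retryDelaySeconds = 5
): int {
    $lastDone = is_file($checkpointFile) ? (int) file_get_contents($checkpointFile) : 0;
    $archived = 0;

    $ids = $fetchIds();
    sort($ids);
    foreach ($ids as $id) {
        if ($id <= $lastDone) {
            continue; // already archived in an earlier run
        }
        for ($attempt = 1; ; $attempt++) {
            try {
                $archive($id);
                break;
            } catch (Throwable $e) {
                if ($attempt >= $maxRetries) {
                    throw $e; // give up; the checkpoint still marks our progress
                }
                sleep($retryDelaySeconds); // reconnect logic would go here
            }
        }
        file_put_contents($checkpointFile, (string) $id); // record progress
        $archived++;
    }
    return $archived;
}
```

Because the checkpoint is written after every message, a manual restart and an automatic retry both resume from the same place.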

Any insight into this would be appreciated.


PostPosted: Sun Sep 28, 2014 7:30 pm 
Spammer :|

Joined: Wed Oct 15, 2008 2:35 am
Posts: 6573
Location: WA, USA
The web is not meant for requests that take a long time to complete. Certainly not something that will take hours. Move this processing to a command-line script and you won't have to worry about this; in fact I'm running a PHP script at work that's been going for two days straight and won't stop until sometime later this week.
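
As a rough illustration of what a command-line entry point might look like (runArchive() and the file name archive.php are placeholders, not part of the original script):

```php
<?php
// archive.php (assumed file name): run with `php archive.php`

// Hypothetical stand-in for the long-running archival routine.
function runArchive(int $total, callable $report): int
{
    $done = 0;
    for ($i = 1; $i <= $total; $i++) {
        // ... extract and store one e-mail here ...
        $done++;
        if ($done % 100 === 0 || $done === $total) {
            $report("archived $done of $total");
        }
    }
    return $done;
}

if (PHP_SAPI === 'cli') {
    set_time_limit(0); // the CLI default is already unlimited; this makes it explicit
    $count = runArchive(250, function (string $msg): void {
        fwrite(STDERR, $msg . PHP_EOL); // progress goes to the terminal, not a browser
    });
    fwrite(STDOUT, "finished: $count messages" . PHP_EOL);
}
```

With no browser and no HTTP connection involved, there is nothing in between to reset the connection while the loop runs.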


PostPosted: Sun Sep 28, 2014 7:45 pm 
Forum Contributor

Joined: Fri Dec 26, 2008 10:43 pm
Posts: 159
Quote:
The web is not meant for requests that take a long time to complete.


Don't you think that there's a time and place for everything? I understand why you'd say something like that but I'm just not sure that it's true in every situation. Certain circumstances merit certain approaches (granted, what I've done may not be the best approach but I think most of this whole issue is due to using it for the first time and having thousands of e-mails in my inbox [instead of a properly-archived mailbox that may have only a few hundred]; I just think that "playing catch-up" is causing this more than the act of the request). I still think the disconnects, though, are from the IDSs in play or else the app server configurations (either my own or else the remote system that I'm connecting to / making requests to from within my script).

Quote:
...Certainly not something that will take hours. Move this processing to a command-line script and you won't have to worry about this; in fact I'm running a PHP script at work that's been going for two days straight and won't stop until sometime later this week.


Is that what you're going to do with your week-long script then or are you saying that you've already done this with something like PHP CLI? (I'm not sure how you'd otherwise archive Outlook e-mails [via port 80?] using some sort of command-line script, but feel free to shed some light on this for me... I've never used PHP CLI so I'm not sure of the capabilities it has. Is that what you're referring to?)

(Sorry for the confusion but thanks for the follow-up. :) )


PostPosted: Tue Sep 30, 2014 2:19 am 
Spammer :|

Joined: Wed Oct 15, 2008 2:35 am
Posts: 6573
Location: WA, USA
Wolf_22 wrote:
Don't you think that there's a time and place for everything? I understand why you'd say something like that but I'm just not sure that it's true in every situation.

Maybe I'm misreading you but you waaay over-generalized what I said. All I meant was that you shouldn't use the web for long-running processes because it wasn't built for that. I mean, half the reason AJAX exists is because the old method (a long-running connection that "streams" data) was technically laborious and broke easily. And long-running pages are a common attack vector for DDoSes, exhausting connections on both web and database servers.

Wolf_22 wrote:
Certain circumstances merit certain approaches

Yes...

Wolf_22 wrote:
(granted, what I've done may not be the best approach

Yes...

Wolf_22 wrote:
but I think most of this whole issue is due to using it for the first time and having thousands of e-mails in my inbox [instead of a properly-archived mailbox that may have only a few hundred];

You'd probably hit memory limits first. Then execution time, but I suspect it would have manifested in a way you'd notice immediately.

Wolf_22 wrote:
I just think that "playing catch-up" is causing this more than the act of the request).

By "the request" I mean mostly PHP. Not the literal request portion itself where the browser constructs the HTTP request and the server receives it.

Wolf_22 wrote:
I still think the disconnects, though, are from the IDSs in play or else the app server configurations (either my own or else the remote system that I'm connecting to / making requests to from within my script).

IDS being... intrusion detection system? Possible but not my first guess. Normally I'd think the operating system and/or your web server settings; if the OS then I'd expect errors, if Apache then, well, I'd still expect errors. If you're using up a lot of memory then it could even be the OS reacting, though I've only seen that happen in Linux and not Windows.

2 hours and 15 minutes is 135 minutes. That's a very... human number. I wouldn't be surprised to see that number in some configuration settings.

Wolf_22 wrote:
Is that what you're going to do with your week-long script then or are you saying that you've already done this with something like PHP CLI?

Yeah: it's entirely command-line. The nature of the script makes it much easier to do from the command line to begin with, but had the reverse been true I still wouldn't have done it over HTTP.

Wolf_22 wrote:
(I'm not sure how you'd otherwise archive Outlook e-mails [via port 80?] using some sort of command-line script, but feel free to shed some light on this for me... I've never used PHP CLI so I'm not sure of the capabilities it has. Is that what you're referring to?

CLI PHP, Apache PHP, CGI PHP... those are all basically the same PHP. This stuff with Outlook? Port 80 is HTTP so I'm not sure what it's doing... Unless you need to interact with the browser, like, I don't know, JavaScript or ActiveX or something, I bet you could do it just fine from the command line too. And if you don't use $_GET or $_POST then you may be able to just run it from the command line directly without changes; even if you do, it's pretty easy to get information into a command-line PHP script.
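
To illustrate that last point, here is a minimal sketch of one function that reads the same settings either from POST data or from command-line arguments. The option names (mailbox, start-id), their defaults, and readOptions() itself are made up for the example; PHP's built-in getopt() is another way to do the command-line half.

```php
<?php
// Hypothetical option names (mailbox, start-id) chosen for illustration.
function readOptions(string $sapi, array $argv, array $post): array
{
    $defaults = ['mailbox' => 'INBOX', 'start-id' => '0'];

    if ($sapi !== 'cli') {
        // Web request: take the same keys from the POST data.
        return array_intersect_key($post, $defaults) + $defaults;
    }

    // Command line: accept arguments of the form --name=value.
    $options = [];
    foreach (array_slice($argv, 1) as $arg) {
        if (preg_match('/^--([a-z-]+)=(.*)$/', $arg, $m) && isset($defaults[$m[1]])) {
            $options[$m[1]] = $m[2];
        }
    }
    return $options + $defaults;
}

// $argv is only populated on the command line, hence the fallback.
$options = readOptions(PHP_SAPI, $argv ?? [], $_POST);
```

Run as `php archive.php --mailbox=Sent --start-id=1200`, the rest of the script can use $options without caring where the values came from.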

Now I say all this, of course, without knowing exactly what this script is or what the code looks like, but if this 2 hour 15 minute thing is taking away a lot of your time then that time might be better invested in converting the script to work from the command line. So at the very least it's a numbers question: is it worth spending the time to fix it, or is it better to leave it as it is even with the problems?

