
Download emails from Imap to DB - help

Posted: Mon Jan 12, 2009 7:37 am
by paritycheck
Hi guys, I'm working on a project that uses Roundcube (from http://www.roundcube.net), and we've been tweaking it so that it now reads emails from a local database. The aim is to use the Roundcube interface to read from a database as though it were an IMAP server. To make that work, we need to build a script that downloads emails from the IMAP server and inserts them into the database on a periodic basis.

The idea I have is that a script would do the downloading on a periodic basis, while on the application end we would set up Roundcube's 'check recent' function to look for emails that have been downloaded and written to the database with a flags field marked as recent, and display them in the interface. We have built the code to download emails and it works quite well for single or multiple emails of small sizes. The issue is that when emails are quite large and have huge attachments (we're also downloading the attachments, btw), the script dies midway, which must be due to a lost connection or a timeout. Since we expect literally 100s of emails to come in every hour, we need to build the script so that it downloads without dying and we don't lose any emails. The idea is to download and then delete emails from our IMAP server, so our application functions something like a POP3 client.

How do we do this part? I mean, what kind of logic should we use to ensure that connections do not die, or, if they do die, that there is a way to resume from where we lost the connection? :(

Re: Download emails from Imap to DB - help

Posted: Mon Jan 12, 2009 1:30 pm
by pestaa
Make sure PHP has enough memory allocated to its process. I would advise investigating the causes of those dropped connections.

Furthermore, I'm not convinced your script could handle the volume of attachments you mentioned if it can't even download them. Maybe consider rethinking the overall design of this part of the project.

Re: Download emails from Imap to DB - help

Posted: Mon Jan 12, 2009 11:22 pm
by paritycheck
Well, the code snippet seems to work fine for smaller attachments. However, what would be a more suitable way to implement this? I'm open to ideas here :)

Re: Download emails from Imap to DB - help

Posted: Mon Jan 12, 2009 11:34 pm
by s.dot
Hmm, this could get interesting.

Surely you're not storing attachments in the database? I think I would write those to files and have some database flag to indicate if there is an attachment and then the location of the attachment.

When storing attachments, I'd keep a record of the total size of the attachment and then update this record as you're storing it. If the byte sizes don't match, you have an incomplete attachment file. Alternatively, you could use a checksum such as md5 on the file and compare the two checksums after transfer.
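The size-and-checksum idea above could be sketched roughly as follows. This is only an illustration in Python, not code from the project: the function names `save_attachment` and `is_complete`, the `.part` suffix for unfinished files, and the shape of the returned record are all assumptions made for the example.

```python
import hashlib
import os


def save_attachment(data: bytes, path: str, chunk_size: int = 64 * 1024) -> dict:
    """Write data to path in chunks; return the size and MD5 actually written."""
    md5 = hashlib.md5()
    written = 0
    tmp_path = path + ".part"  # hypothetical naming convention for incomplete files
    with open(tmp_path, "wb") as fh:
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            fh.write(chunk)
            md5.update(chunk)
            written += len(chunk)
    # Rename only after the whole write succeeded, so a crash leaves a ".part" file.
    os.replace(tmp_path, path)
    return {"path": path, "size": written, "md5": md5.hexdigest()}


def is_complete(record: dict) -> bool:
    """Re-check the file on disk against the recorded size and checksum."""
    path = record["path"]
    if not os.path.exists(path) or os.path.getsize(path) != record["size"]:
        return False
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(64 * 1024), b""):
            md5.update(chunk)
    return md5.hexdigest() == record["md5"]
```

The returned record is what would go into the database row next to the file location, so a later pass can find files whose size or checksum no longer matches and fetch them again.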

I think this type of handling would be much better as a daemon that you could easily stop/start/restart.

Re: Download emails from Imap to DB - help

Posted: Tue Jan 13, 2009 2:10 am
by paritycheck
scottayy wrote:Hmm, this could get interesting.

Surely you're not storing attachments in the database? I think I would write those to files and have some database flag to indicate if there is an attachment and then the location of the attachment.
Uuuuuh well, for now I am storing them in the database :( sorry about that. I can still rewire my code to write the contents to raw files, though. I just thought it seemed pretty cool to have your file contents included in the same place as your file info.. :roll:

The way it's set up is that I have one table which contains the information of the email, such as subject, date, attachment flag, message id, headers and message parts (all serialized, of course), a table for files which contains all the files, and one relationship table that links the two based upon message id, file id and attachment part.
scottayy wrote: I think this type of handling would be much better as a daemon that you could easily stop/start/restart.
Well, the thing is that this is going to be a multi-user application, so we would have different users with different email accounts, and each user would require downloading from their own email account. I do think that keeping the process of downloading from the IMAP server away from the actual client code is a good idea, as the whole point is that the users would use the webmail client as though it were accessing the emails from an IMAP server, when in fact it's really accessing a database which is periodically updated with new emails.

The question is how do I implement this downloading of new emails, considering that it is not for just one email but a set of emails? And likewise, how do I work out the logic so that:
In case the connection is lost, you don't end up with messy data.
You don't duplicate entries. I've read that the UIDs are not static so they can't function well as a unique ID, although I've also read about the message ID generated for each email. There is also the option of deleting each email once downloaded, which would pretty much resolve this issue.
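One way to approach both points is to store each message in its own transaction, with a unique constraint on the Message-ID, and to delete from the server only after the commit. The sketch below uses Python with SQLite purely for illustration. The table layout and the names `open_store`, `store_message`, `sync` and `delete_on_server` are hypothetical, and the messages are passed in as plain (message_id, raw) pairs rather than fetched from a real IMAP server. Two caveats: a Message-ID header is not guaranteed to be present or unique, and IMAP UIDs are meant to stay stable for a mailbox as long as its UIDVALIDITY value is unchanged, so a (UIDVALIDITY, UID) pair may be a workable alternative key.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) emails table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS emails ("
        " id INTEGER PRIMARY KEY,"
        " message_id TEXT NOT NULL UNIQUE,"
        " raw BLOB NOT NULL)"
    )
    return conn


def store_message(conn: sqlite3.Connection, message_id: str, raw: bytes) -> bool:
    """Insert one message atomically; return False if it was already stored."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO emails (message_id, raw) VALUES (?, ?)",
                (message_id, raw),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a message we already have.
        return False


def sync(messages, conn: sqlite3.Connection, delete_on_server) -> int:
    """Store each (message_id, raw) pair, deleting remotely only after it is saved."""
    stored = 0
    for message_id, raw in messages:
        if store_message(conn, message_id, raw):
            stored += 1
        # Reached only once the message is known to be in the database,
        # either from this run or from an earlier, interrupted one.
        delete_on_server(message_id)
    return stored
```

If the connection drops partway through, the messages already committed stay stored, the ones not yet deleted are still on the server, and the next run skips whatever it has already saved instead of duplicating it.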

One more point: I'm not familiar with cron or Windows scheduled tasks. How would we set up such a scheduled task? Let's say we set it up to run every 2 minutes, and at some point our script takes over 2 minutes to run. How do we set it up so that a new run doesn't interfere with the one still being processed? :?:
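A common way to handle the overlapping-run question is a lock file: each run tries to take an exclusive lock and simply exits if a previous run still holds it. Below is a minimal Python sketch, assuming a Unix-like host (the `fcntl` module is not available on Windows); the names `try_lock` and `run_once` and the lock path are hypothetical.

```python
import fcntl
import os


def try_lock(lock_path: str):
    """Return an open file holding an exclusive lock, or None if it is already held."""
    fh = open(lock_path, "a+")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting for the lock.
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fh.close()
        return None
    return fh  # keep this open for the whole run; closing it releases the lock


def run_once(lock_path: str, job) -> bool:
    """Run job() unless another run holds the lock; return True if it ran."""
    lock = try_lock(lock_path)
    if lock is None:
        return False  # a previous run is still busy, so skip this one
    try:
        job()
        return True
    finally:
        lock.close()
```

With this in place the scheduler can fire every 2 minutes regardless of how long a run takes: a run that finds the lock taken returns straight away, and because the operating system drops the lock when the process exits, a crashed run does not leave a stale lock behind.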