Extract Substring from complex text


Post by ed209 »

I am trying to receive emails and parse the raw source to extract the information I need: the attachments, subject, message body and To address. I have looked into how emails are structured, and it seems they are divided into sections.

Those sections are defined by:

Code: Select all

Content-Type: multipart/mixed; boundary="----=_NextPart_000_58b1_26fb_1faf"
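where the 'boundary' parameter defines the delimiter that marks the start of each new section (each delimiter line is the boundary value prefixed with two extra hyphens). A section might look like this (this would be the message section):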

Code: Select all

------=_NextPart_000_58b1_26fb_1faf
Content-Type: text/plain; format=flowed

this is the message

------=_NextPart_000_58b1_26fb_1faf
So I can find the relevant section by extracting the information between occurrences of the boundary.

My question is, how should I search through the source and extract (then work with) these sections? Emails are likely to be a few MB in size because of attachments.

Should I:

Code: Select all

explode($boundary, $email_source);
or

find the occurrences of 'boundary' and substr() them

or

use some sort of preg_match()?
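
For what it's worth, here is a minimal sketch of the explode() option, combined with a preg_match() to read the boundary out of the Content-Type header. The function name split_mime_sections() is made up for illustration, and the sketch assumes a single, non-nested multipart message held entirely in memory. It also assumes the delimiter string never appears inside a section body, which a stricter parser would not take for granted.

```php
<?php
// Hypothetical helper (not from the original post): split a raw multipart
// message into its sections using the boundary from the Content-Type header.
function split_mime_sections(string $email_source): array
{
    // Pull the boundary value out of the Content-Type header.
    if (!preg_match('/boundary="?([^";\r\n]+)"?/i', $email_source, $m)) {
        return []; // not a multipart message, or no boundary found
    }
    // In the body, each delimiter line is the boundary prefixed with "--".
    $delimiter = '--' . $m[1];

    // Everything before the first delimiter is headers/preamble; drop it.
    $chunks = explode($delimiter, $email_source);
    array_shift($chunks);

    $sections = [];
    foreach ($chunks as $chunk) {
        // The closing delimiter is "--boundary--", so the text after it starts with "--".
        if (strncmp($chunk, '--', 2) === 0) {
            break;
        }
        // Section headers and body are separated by the first blank line.
        $parts = preg_split('/\r?\n\r?\n/', $chunk, 2);
        $sections[] = [
            'headers' => trim($parts[0]),
            // Simplification: all trailing line breaks are trimmed from the body.
            'body'    => rtrim($parts[1] ?? '', "\r\n"),
        ];
    }
    return $sections;
}
```

Each returned entry holds the section's header text and its body. Note that explode() copies the data, so for a message of a few MB the script holds roughly two copies in memory at once.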

Post by ed209 »

In case anyone stumbles across this post in the future, it may be easier to use PHP's IMAP family of functions.

Code: Select all

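// imap_open() can also connect to a POP3 server, as it does here
$mbox = imap_open("{localhost:110/pop3}INBOX", "user", "password");

// message number to inspect (assumed here: 1, the first message in the mailbox)
$msgNum = 1;
$structure = imap_fetchstructure($mbox, $msgNum);

// this gives you a breakdown of everything in the email
print_r($structure);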
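
The print_r() output can be hard to read for a message with attachments, so here is a rough sketch of walking the structure object and listing each part with its IMAP part number. The function name list_parts() is hypothetical. It relies only on the type, ifsubtype, subtype and parts properties of the object returned by imap_fetchstructure(), and it does not treat attached messages (message/rfc822) specially.

```php
<?php
// Hypothetical helper: flatten the object returned by imap_fetchstructure()
// into a list of part numbers and MIME types.
function list_parts($structure, string $prefix = ''): array
{
    // Primary type codes 0-6 as used by the imap extension; anything else is "other".
    $types = ['text', 'multipart', 'message', 'application', 'audio', 'image', 'video'];

    if (empty($structure->parts)) {
        // A message without sub-parts has a single body, addressed as part "1".
        $number  = $prefix === '' ? '1' : $prefix;
        $type    = $types[$structure->type] ?? 'other';
        $subtype = !empty($structure->ifsubtype) ? strtolower($structure->subtype) : 'unknown';
        return [['number' => $number, 'mime' => $type . '/' . $subtype]];
    }

    $found = [];
    foreach ($structure->parts as $index => $part) {
        // Parts are numbered from 1; nested parts become "2.1", "2.2", and so on.
        $number = ($prefix === '' ? '' : $prefix . '.') . ($index + 1);
        $found  = array_merge($found, list_parts($part, $number));
    }
    return $found;
}
```

With a live connection, each 'number' from list_parts($structure) could then be passed to imap_fetchbody($mbox, $msgNum, $number) to fetch that section's content.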