Parsing Emails

aliasxneo
Forum Contributor
Posts: 136
Joined: Thu Aug 31, 2006 12:01 am

Parsing Emails

Post by aliasxneo »

Here is the current code I'm using for parsing emails:

<?php

class mail
{
	var $content;
	var $header;
	var $from;
	var $to;
	var $subject;
	var $message;
	
	var $sql_host = "localhost";
	var $sql_user = "";
	var $sql_pass = "";
	var $sql_db = "";
	
	function read()
	{
		$handle = fopen("php://stdin", "r");
		
		while (!feof($handle)) {
			$this->content .= fread($handle, 1024);
		}
		fclose($handle);
		
		$this->parse();
	}
	
	function parse()
	{
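		if (empty($this->content))
		{
			$this->new_error("No content received");
			return;
		}
		
		// split() was removed in PHP 7; explode() does the same job for a fixed delimiter.
		$lines = explode("\n", $this->content);
		$gmessage = FALSE; // becomes TRUE once the blank line that ends the headers is seen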
		
		foreach ($lines as $line)
		{
			if ($gmessage)
			{
				$this->message .= $line . "\n";
				continue;
			}
			
			if (preg_match("/^Subject: (.*)/", $line, $matches))
			{
				$this->subject = $matches[1];
			}
			
			if (preg_match("/^From: (.*)/", $line, $matches)) 
			{
				$this->from = $matches[1];
			}
			
			if (preg_match("/^To: (.*)/", $line, $matches)) 
			{
				$this->to = $matches[1];
			}
			
			if (trim($line) == "")
			{
				$gmessage = TRUE;
			} else {
				$this->header .= $line . "\n";
			}
		}
		
		// The mysql_* functions were removed in PHP 7, so use mysqli instead.
		// A prepared statement escapes every field, not only the message.
		mysqli_report(MYSQLI_REPORT_OFF);
		$db = @new mysqli($this->sql_host, $this->sql_user, $this->sql_pass, $this->sql_db);
		if ($db->connect_error)
		{
			$this->new_error("Error connecting to database");
			return;
		}
		
		$sql = "INSERT INTO `messages` (`id`, `from`, `to`, `subject`, `message`, `read`) VALUES (NULL, ?, ?, ?, ?, 0)";
		$stmt = $db->prepare($sql);
		if ($stmt)
		{
			$stmt->bind_param("ssss", $this->from, $this->to, $this->subject, $this->message);
			$stmt->execute();
			$stmt->close();
		}
		$db->close();
	}
	
	function new_error($message)
	{
		$fh = fopen("/home/coolmail/public_html/error.txt","w+");
		fwrite($fh,$message);
		fclose($fh);
	}
	
}

$mail = new mail();
$mail->read();

?>

It works fine and everything gets inserted, except that I have one problem with the message part. Here is what the data inserted into the database looks like for the message part:
------=_Part_24188_30459578.1166483533205
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

I am just testing this new mal script!

It's cool eh?

------=_Part_24188_30459578.1166483533205
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

I am just testing this new mal script!<br><br>It's cool eh?<br>

------=_Part_24188_30459578.1166483533205--

It shouldn't be like that; the tutorial I read said that the message starts after a blank line. Do all emails contain something like this? I sent the email using Gmail, so is this specific to Gmail? Also, does anyone have any good ideas on how I should parse this to get the HTML message out? Thanks.

Cheers,
- Josh
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

That's how emails look in reality. If you're not convinced, you can see the same in Gmail: while reading an email, there is a down arrow to the right of the reply button. Clicking it opens a menu that contains "show original", which opens the raw message in a new window/tab.

Post by aliasxneo »

So all emails contain a plain text version and an html version? I just want to be sure before I incorporate it into my script since I will be receiving emails from a variety of different systems.

Post by feyd »

No, not every email contains them. Email has been around far longer than web pages and HTML. ;)
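
Since not every message is multipart, it may help to check the headers before trying to split the body. The following is only a minimal sketch: `is_multipart()` is a hypothetical helper that is not part of the class in the first post, and it assumes `$header` holds the raw header block (for example `$this->header`). A message with no Content-Type header at all is treated as plain text, which is the default for email.

```php
<?php
// Hypothetical helper (an assumption for illustration, not from the original class):
// returns true when the raw header block declares a multipart body.
function is_multipart($header)
{
	// Long headers may be folded onto continuation lines that start with a
	// space or tab, so join them back together first.
	$header = preg_replace('/\r?\n[ \t]+/', ' ', $header);
	
	// Look for a Content-Type header whose value starts with "multipart/".
	return preg_match('/^Content-Type:\s*multipart\//mi', $header) === 1;
}
```

If this returns false, the body can be stored as it is; only multipart messages need to be taken apart.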

Post by aliasxneo »

So my question is: how can I ensure that the message will be displayed properly, without things like the Content-Type headers I showed in my first post?

Post by feyd »

If it's handed (with the full headers) to an email client, it should work, as the client will understand which parts to use. If this is for web-based email, the part you extract automatically would depend on the person's preference for text versus HTML emails. The code required to extract the relevant section(s) can be complicated, or on the basic level quite simple. In the headers a "boundary" will be defined, as a parameter of the Content-Type header of a multipart message. Each new part starts with a line consisting of "--" followed by that boundary, and the last part is followed by the same line with an extra "--" at the end.
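
The basic-level approach could look roughly like the sketch below. It is only an illustration under some assumptions: `split_parts()` and `pick_part()` are hypothetical helpers, `$header` and `$body` are assumed to be the raw header block and message body (for example `$this->header` and `$this->message` from the first post), and each content type is assumed to appear only once.

```php
<?php
// Hypothetical helper: splits a multipart body into its parts and returns
// them as an array keyed by lowercase content type, e.g. "text/plain".
function split_parts($header, $body)
{
	// Join folded header lines, then look for boundary="..." or boundary=...
	$header = preg_replace('/\r?\n[ \t]+/', ' ', $header);
	if (!preg_match('/boundary="?([^";\r\n]+)"?/i', $header, $m)) {
		return array(); // not multipart, or no boundary declared
	}
	$boundary = trim($m[1]);
	
	// Every part follows a line made of "--" plus the boundary.
	$chunks = explode("--" . $boundary, $body);
	array_shift($chunks); // anything before the first boundary is a preamble
	
	$parts = array();
	foreach ($chunks as $chunk) {
		if (substr($chunk, 0, 2) == "--") {
			break; // "--boundary--" marks the end of the last part
		}
		// Each part has its own headers, then a blank line, then its content.
		$pieces = preg_split('/\r?\n\r?\n/', $chunk, 2);
		if (count($pieces) < 2) {
			continue;
		}
		$type = "text/plain"; // default when a part has no Content-Type
		if (preg_match('/^Content-Type:\s*([^;\s]+)/mi', $pieces[0], $t)) {
			$type = strtolower($t[1]);
		}
		$parts[$type] = trim($pieces[1]);
	}
	return $parts;
}

// Hypothetical helper: returns the preferred part, falling back to plain text.
function pick_part($parts, $prefer = "text/html")
{
	if (isset($parts[$prefer])) {
		return $parts[$prefer];
	}
	return isset($parts["text/plain"]) ? $parts["text/plain"] : null;
}
```

Something like `pick_part(split_parts($this->header, $this->message))` would then give the HTML version when there is one. This sketch does not decode the Content-Transfer-Encoding (quoted-printable or base64), does not handle nested multipart parts or attachments, and would be confused by a boundary string that appears in the middle of a line, which is where the complicated version starts.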