extract part of an email body

aaaaaaaa
Forum Newbie
Posts: 13
Joined: Wed Aug 19, 2009 8:43 am

extract part of an email body

Post by aaaaaaaa »

Hello,

I would like to extract only the newest part of an e-mail's body, not the older messages. An e-mail looks like this:

Code: Select all

 
new part of the message 
 
>part of the message
>that comes from the previous email
>that we don't want
 
However, the e-mail can quote old messages in other ways, depending on the client and the user's configuration.

Is there any function from a library that can do that?

------------ ------------ ------------
You'll get extra points if you can tell me which IRC channel you use for PHP or web-related topics.
Besides, which tech news sites do you check?

Thanks in advance,
Bye,
Cedric
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Re: extract part of an email body

Post by Ollie Saunders »

Is there any function from a library that can do that ?
Probably yes. You may be able to write it from scratch quicker than you can find and integrate a library to do it. Difficult call.

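You could use preg_replace() with /^>.*$/m to remove the lines beginning with ">". The m modifier is needed so that ^ and $ match at every line rather than only at the start and end of the whole string.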
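A minimal sketch of that suggestion, assuming quoted lines always start with ">" in the first column. The helper name stripQuotedLines is made up for illustration and is not from any library.

```php
<?php
// Hypothetical helper: remove every line that starts with ">".
function stripQuotedLines(string $body): string
{
    // Normalise line endings so the pattern behaves the same for CRLF bodies.
    $body = str_replace(["\r\n", "\r"], "\n", $body);

    // "m" = multi-line mode; the optional \n also removes the line break.
    $stripped = preg_replace('/^>.*$\n?/m', '', $body);

    // preg_replace() returns null on error; fall back to the input then.
    return rtrim($stripped ?? $body);
}

$mail = "new part of the message\n\n"
      . ">part of the message\n"
      . ">that comes from the previous email\n"
      . ">that we don't want\n";

echo stripQuotedLines($mail); // prints: new part of the message
```

Because the pattern is anchored to the start of a line, a ">" in the middle of a sentence is left alone.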
lord_webby
Forum Commoner
Posts: 44
Joined: Wed Aug 19, 2009 9:01 am

Re: extract part of an email body

Post by lord_webby »

Code: Select all

 
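<?php
// Example input; in practice $message holds the raw e-mail body.
$message = "new part of the message\n\n>part of the message\n>that comes from the previous email\n";

// Get the position of the quoted part (false if it is not found).
$pos = strpos($message, ">part of the message");

function truncate($text, $limit = 25, $ending = '...') {
    if (strlen($text) > $limit) {
        $text = strip_tags($text);
        $text = substr($text, 0, $limit);
        // Back up to the last space so that a word is not cut in half.
        // Note: this can drop the last word before the cut.
        $lastSpace = strrpos($text, ' ');
        if ($lastSpace !== false) {
            $text = substr($text, 0, $lastSpace);
        }
        $text = $text . $ending;
    }

    return $text;
}

// Cut the end off, but only if the quoted part was actually found.
$text = ($pos === false) ? $message : truncate($message, $pos, "");

echo $text;

?>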
 
Think that'll work - got a feeling there's another php function to do it though.
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Re: extract part of an email body

Post by Ollie Saunders »

lord_webby, that won't work where:

Code: Select all

$message = "I can't believe he actually gave this massive argument why 4 is actually > 9 and then told me my maths degree was worthless. What a douche!";
Sorry!
lord_webby
Forum Commoner
Posts: 44
Joined: Wed Aug 19, 2009 9:01 am

Re: extract part of an email body

Post by lord_webby »

You need to have a common string at the start of every message to search for.
lord_webby
Forum Commoner
Posts: 44
Joined: Wed Aug 19, 2009 9:01 am

Re: extract part of an email body

Post by lord_webby »

You could try searching for three newline characters followed by a ">", for example.
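A rough sketch of that idea, assuming the quoted block is always preceded by exactly that marker. The function name cutAtFirstQuote and the default needle are illustrative assumptions; real mail bodies may use a different number of blank lines.

```php
<?php
// Hypothetical helper: keep everything before the first occurrence of $needle.
function cutAtFirstQuote(string $body, string $needle = "\n\n\n>"): string
{
    $pos = strpos($body, $needle);

    // strpos() returns false when the needle is missing. Position 0 is a
    // valid match, so the comparison has to be strict (===).
    return $pos === false ? $body : substr($body, 0, $pos);
}

echo cutAtFirstQuote("new text\n\n\n>old text\n>more old text");
```

A message without the marker, such as one that merely contains " > " inside a sentence, is returned unchanged.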
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Re: extract part of an email body

Post by Ollie Saunders »

lord_webby wrote:You need to have a common string at the start of every message to search for.
Why?
lord_webby
Forum Commoner
Posts: 44
Joined: Wed Aug 19, 2009 9:01 am

Re: extract part of an email body

Post by lord_webby »

Ollie Saunders wrote:
lord_webby wrote:You need to have a common string at the start of every message to search for.
Why?
To use the function above to chop off the old messages - you need some way of discerning what bits of the message are old. The strpos function requires a "needle".
aaaaaaaa
Forum Newbie
Posts: 13
Joined: Wed Aug 19, 2009 8:43 am

Re: extract part of an email body

Post by aaaaaaaa »

Thank you for your help Ollie Saunders and lord_webby.

However, it seems I haven't been clear enough, so here is another explanation with further examples.
What I would like to do is extract the different parts of an e-mail's body: the newest answer, the answer that it replies to, and so on. In other words, I would like to split an e-mail into the individual messages of a discussion between two people.

Sometimes, answers look like this:

Code: Select all

Hi
 
bla bla
 
----- Original Message -----
From: <mail@mail.com>
To: "Name" <name@ploc.co.uk>
Sent: Saturday, August 32, 2019 11:56 PM
Subject: Re: where is my mind ?
 
Hi again dear you,
 
BLA !
 
I would then like to extract:

Code: Select all

Hi
 
bla bla
But the problem is that, depending on the language, the client, and the user's specific configuration, answers can have other layouts, like:

Code: Select all

here is the answer
 
Le 6 août 2015 21:47, <mee@mail.fr> a écrit :
 
    Ah ben trop tard O_o
    And the first message
 

Sometimes the newest part of the message is at the bottom rather than at the top of the e-mail.

It might be useful to know that every message that is received is stored in a database.
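For the top-posted layouts shown above, one possible sketch is to keep a list of known separator patterns and cut at the earliest match. The constant REPLY_SEPARATORS, the function newestPart, and the English "On ... wrote:" pattern are assumptions added for illustration; the list would need extending for other clients and languages, and this does not handle replies written underneath the quoted text.

```php
<?php
// Hypothetical list of reply separators (not exhaustive).
const REPLY_SEPARATORS = [
    '/^-{2,}\s*Original Message\s*-{2,}\s*$/mi', // "----- Original Message -----"
    '/^Le .+ a écrit\s*:\s*$/mu',                // French attribution line
    '/^On .+ wrote\s*:\s*$/m',                   // English attribution line (assumed)
    '/^>/m',                                     // first ">"-quoted line
];

// Return the text that precedes the earliest separator found in $body.
function newestPart(string $body): string
{
    $body = str_replace(["\r\n", "\r"], "\n", $body);
    $cut  = strlen($body);

    foreach (REPLY_SEPARATORS as $pattern) {
        // PREG_OFFSET_CAPTURE yields the byte offset of the match.
        if (preg_match($pattern, $body, $m, PREG_OFFSET_CAPTURE)) {
            $cut = min($cut, $m[0][1]);
        }
    }

    return trim(substr($body, 0, $cut));
}

$mail = "Hi\n\nbla bla\n\n----- Original Message -----\n"
      . "From: <mail@mail.com>\nSubject: Re: where is my mind ?\n\nBLA !\n";

echo newestPart($mail);
```

Taking the earliest match across all patterns means a message that mixes several quoting styles is still cut at the first sign of older content.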
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Re: extract part of an email body

Post by Ollie Saunders »

It might be useful to know that every message that is received is stored in a database.
Yes, it is. Take the message you want to split up and retrieve from the database the message that it replies to. Save both messages as files, pass the two filenames as arguments to the UNIX diff command, and keep the output it produces. To get the two parts you're after, process that output twice: once selecting only the lines that begin with '>', and once selecting only the lines that begin with '<'. Then delete the files you created. There are standard PHP functions for creating and managing temporary files, splitting and searching strings, and executing shell commands.

If you don't like this method you could use a diff library for PHP. My preference is for UNIX diff because I know it works and it's already there.
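The steps described above could be sketched roughly as follows, assuming a diff binary is on the PATH and shell_exec() is enabled. The function name diffParts and the returned array keys are made up for illustration.

```php
<?php
// Hypothetical helper: run UNIX diff on two message bodies and collect the
// lines found only in the reply (">") and only in the parent ("<").
function diffParts(string $parent, string $reply): array
{
    // Temporary files for the two messages.
    $oldFile = tempnam(sys_get_temp_dir(), 'old');
    $newFile = tempnam(sys_get_temp_dir(), 'new');
    file_put_contents($oldFile, $parent);
    file_put_contents($newFile, $reply);

    // escapeshellarg() protects the paths; the cast covers a null result
    // (no output, i.e. identical files).
    $cmd    = 'diff ' . escapeshellarg($oldFile) . ' ' . escapeshellarg($newFile);
    $output = (string) shell_exec($cmd);

    // Delete the files we created.
    unlink($oldFile);
    unlink($newFile);

    $added = [];
    $removed = [];
    foreach (explode("\n", $output) as $line) {
        if (strncmp($line, '> ', 2) === 0) {
            $added[] = substr($line, 2);      // only in the reply
        } elseif (strncmp($line, '< ', 2) === 0) {
            $removed[] = substr($line, 2);    // only in the parent
        }
    }

    return ['added' => $added, 'removed' => $removed];
}

$parts = diffParts("line one\nline two\n", "my answer\nline one\nline two\n");
print_r($parts['added']);
```

One caveat: if the reply quotes the parent with a prefix such as ">", those lines no longer match the stored original and show up as added, so the prefix would probably have to be stripped before diffing.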
aaaaaaaa
Forum Newbie
Posts: 13
Joined: Wed Aug 19, 2009 8:43 am

Re: extract part of an email body

Post by aaaaaaaa »

Thank you Ollie Saunders.

I will indeed try this.
Just a practical question: if I use UNIX commands, won't that be slower than the PHP functions (assuming they do the same thing)?
lord_webby
Forum Commoner
Posts: 44
Joined: Wed Aug 19, 2009 9:01 am

Re: extract part of an email body

Post by lord_webby »

The difference should be negligible. Which is faster depends on the command, but I imagine the native Linux command is probably faster in general (PHP itself runs on top of Linux, so I'd expect it to be the slower of the two). Unless you're working with a hundred thousand files, I think you'll be all right. :D