Help parsing HTML

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

Airhead315
Forum Newbie
Posts: 1
Joined: Mon May 07, 2007 2:41 pm

Post by Airhead315 »

feyd | Please use [code] and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read: [url=http://forums.devnetwork.net/viewtopic.php?t=21171]Posting Code in the Forums[/url] to learn how to do it too.


I'm having probably the most ridiculous problem with the simplest regular expression ever. Here is the HTML I'm trying to parse:
[syntax="html"]
<B>Original Message:</B>
<P>
<b>Posted by:</b> someguys name
  (<a href="mailto:someguy@somesite.com
">someguy@somesite.com
</a>)<BR>
<b>Organization:</b><a href="http://www.somesite.com
">JAQUET Ltd
</a> <BR><b>Date posted:</b> Thu Jan 14  5:23:11 US/Eastern 2001
<br>
<b>Subject:</b> some subject of a forum
<br>
<b>Message:</b><br> some long message that has no html tags in it
no breaks and no other weird characters
[/syntax]

I tried getting just one part of the metadata I wanted with the following code:

Code: Select all

preg_match("/<b>Posted by:<\/b>(.*)<BR>/i", $parts[$i],$innerparts);
echo $innerparts[1];

As you can see, I'm trying to get the name/email of the user who posted the message. However, I'm not getting anything back; the only output is the notice "Undefined offset: 1".

$parts[$i] holds the content shown above.

I also tried the following lines, with the same result:

Code: Select all

preg_match("/\<b\>Posted by:\<\/b\>(.*)\<BR\>/i", $parts[$i],$innerparts);

preg_match("/<b>Posted by:<\/b>(.*?)<BR>/i", $parts[$i],$innerparts);

sentback
Forum Newbie
Posts: 24
Joined: Fri May 04, 2007 9:46 am

Post by sentback »

Description
int preg_match ( string $pattern, string $subject [, array &$matches [, int $flags [, int $offset]]] )

Searches subject for a match to the regular expression given in pattern.

Parameters

pattern

The pattern to search for, as a string.

subject

The input string.

matches

If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern,
$matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.

flags

flags can be the following flag:
PREG_OFFSET_CAPTURE
If this flag is passed, for every occurring match the appendant string offset will also be returned. Note that this changes the return value in
an array where every element is an array consisting of the matched string at offset 0 and its string offset into subject at offset 1.

offset

Normally, the search starts from the beginning of the subject string. The optional parameter offset can be used to specify the alternate
place from which to start the search.
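
To make the parameter description above concrete, here is a minimal sketch of how $matches, PREG_OFFSET_CAPTURE and offset behave. It assumes the subject contains real line breaks, as the HTML in the first post appears to; the $subject string and the variable names ($without, $with, $poster, $org, $next) are made up for illustration. One thing the description does not mention is that "." does not match a newline unless the "s" modifier is added to the pattern, which would explain an empty $matches array when the text between "Posted by:" and <BR> spans several lines.

```php
<?php
// Hypothetical subject modelled on the HTML in the first post; the "\n"
// characters stand for the line breaks visible in that sample.
$subject = "<b>Posted by:</b> someguys name\n"
         . "  (<a href=\"mailto:someguy@somesite.com\n"
         . "\">someguy@somesite.com\n"
         . "</a>)<BR>\n"
         . "<b>Organization:</b><a href=\"http://www.somesite.com\n"
         . "\">JAQUET Ltd\n"
         . "</a> <BR>";

// Without the "s" modifier the dot stops at the first newline, so the
// pattern never reaches <BR>: preg_match() returns 0 and $matches is empty.
$without = preg_match('/<b>Posted by:<\/b>(.*?)<BR>/i', $subject, $matches);

// With "s" the dot also matches newlines. PREG_OFFSET_CAPTURE turns every
// entry of $matches into array(matched text, offset into $subject).
$with = preg_match('/<b>Posted by:<\/b>(.*?)<BR>/is', $subject, $matches, PREG_OFFSET_CAPTURE);

if ($with === 1) {
    // $matches[1][0] is the text of the first parenthesized subpattern.
    $poster = trim($matches[1][0]);

    // Continue searching right after the end of the full match by passing
    // an explicit offset as the fifth argument.
    $next = $matches[0][1] + strlen($matches[0][0]);
    preg_match('/<b>Organization:<\/b>(.*?)<BR>/is', $subject, $org, 0, $next);

    echo $poster, "\n";
    echo trim($org[1]), "\n";
}
```

The lazy ".*?" is kept so that the capture ends at the first <BR> rather than the last one in the string. The captured text still contains the <a> markup, so further cleanup would be needed to separate the name from the email address.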