Quick description:
I need to pull certain values out of external forums, such as thread titles and post counts, and store them in a database for use with my application.
Right now I can easily build the complete framework for everything EXCEPT extracting those values from the forum's HTML.
I will use one forum as an example:
http://www.techpb.com/forum/index.php?showforum=986
I know that the repeating block (one per thread) is:
Code:
<tr class="row2" id="trow_82122">
<td class="short altrow"><img src="http://www.techpb.com/forum/public/style_images/master/t_hot_unread.png" alt="Hot Topic (New)"></td>
<td>
<a href="http://www.techpb.com/forum/index.php?showtopic=82122&view=getnewpost" title="Go to first unread post"><img src="http://www.techpb.com/forum/public/style_images/master/new_post.png" alt="Icon" title="Go to first unread post"></a>
<a id="tid-link-82122" href="http://www.techpb.com/forum/index.php?showtopic=82122" title="View topic, started 08 March 2010 - 04:52 PM" class="topic_title">Shocker SFT- Blue</a>
<br><span class="desc">Many new parts installed</span>
</td>
<td class="short altrow"><a href="http://www.techpb.com/forum/index.php?showuser=4047">MeliFelipe</a> <a href="http://www.techpb.com/forum/index.php?showuser=4047" class="__user __id4047" title="View Profile"><img src="http://www.techpb.com/forum/public/style_images/master/user_popup.png" alt="Icon"></a></td>
<td class="stats">
<ul>
<li><!-- SKINNOTE: This is the link for the "who posted" ajax popup -->
<a href="http://www.techpb.com/forum/index.php?app=forums&module=extras&section=stats&do=who&t=82122" onclick="return ipb.forums.retrieveWhoPosted( 82122 );">15</a> Replies</li>
<li class="views desc">399 Views</li>
</ul>
</td>
<td class="altrow">
<ul class="last_post">
<li>
<a href="http://www.techpb.com/forum/index.php?showtopic=82122&view=getlastpost" title="Go to last post"><img src="http://www.techpb.com/forum/public/style_images/master/last_post.png" alt="Icon" title="View last post"></a> <a href="http://www.techpb.com/forum/index.php?showtopic=82122&view=getlastpost" title="Go to last post">Today, 03:04 PM</a>
</li>
<li>By: <a href="http://www.techpb.com/forum/index.php?showuser=9406">Chace365</a> <a href="http://www.techpb.com/forum/index.php?showuser=9406" class="__user __id9406" title="View Profile"><img src="http://www.techpb.com/forum/public/style_images/master/user_popup.png" alt="Icon"></a></li>
</ul>
</td>
</tr>
Please note that I did not write the regex for the whole row, because I wanted to test a partial version first and see whether I could get it to work. I failed: I can't get even ONE variable to come out.
Code:
<html>
<head>
<title>Page Rip</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<?php
//Access Web Page for Gathering Data
$html = file_get_contents("http://www.techpb.com/forum/index.php?showforum=986");
preg_match_all(
'/<tr class=".*?" id=".*?"><td class="short altrow"><img src=".*?" alt=".*?"></td><td><a href=".*?" title=".*?"><img src=".*?" alt="Icon" title="Go to first unread post"><\/a><a id=".*?" href=".(.*?)" title="View topic, started .(.*?)" class="topic_title">.(.*?)<\/a><br><span class="desc">.(.*?)<\/span><\/td><td class="short altrow"><a href=".*?">.(.*?)<\/a> <a href=".*?" class=".*?" title="View Profile"><img src="http://www.techpb.com/forum/public/style_images/master/user_popup.png" alt="Icon"><\/a><\/td>.*?<\/tr>/s',
$html,
$posts, // Stores all variables
PREG_SET_ORDER // Formats data into an array of posts
);
foreach ($posts as $post) {
    $link = $post[1];
    $startdate = $post[2];
    $threadtitle = $post[3];
    $description = $post[4];
    $author = $post[5];
    // Print out the variables
    echo $link;
    echo $startdate;
    echo $threadtitle;
    echo $description;
    echo $author;
}
?>
</body>
</html>
Any help would be super appreciated.
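For what it's worth, a few things in the pattern above look likely to stop it from matching. The delimiter is "/", but "/" also appears unescaped in the first </td> and in the http:// image URL, so PHP will probably complain about an unknown modifier. Each ".(.*?)" makes the dot swallow the first character of the value before the capture starts. And the pattern expects the tags to be back to back, while the quoted row has line breaks between them. Below is a minimal sketch of one way around this, not a definitive fix: it assumes the live page matches the row quoted above, it runs against a trimmed copy of that row instead of fetching the page, and the $pattern and $threads names are made up for the example.

```php
<?php
// Trimmed copy of the row quoted in the post, used here in place of
// file_get_contents() so the sketch runs on its own.
$html = <<<'HTML'
<tr class="row2" id="trow_82122">
<td>
<a id="tid-link-82122" href="http://www.techpb.com/forum/index.php?showtopic=82122" title="View topic, started 08 March 2010 - 04:52 PM" class="topic_title">Shocker SFT- Blue</a>
<br><span class="desc">Many new parts installed</span>
</td>
<td class="short altrow"><a href="http://www.techpb.com/forum/index.php?showuser=4047">MeliFelipe</a></td>
</tr>
HTML;

// "~" is the delimiter, so "/" in closing tags and URLs needs no escaping.
// \s* allows line breaks between tags, and [^"]* stops at the closing quote.
// The pattern starts at the topic link rather than at <tr>, which keeps it short.
$pattern = '~<a id="tid-link-\d+" href="([^"]*)" title="View topic, started ([^"]*)" class="topic_title">(.*?)</a>\s*<br\s*/?>\s*<span class="desc">(.*?)</span>\s*</td>\s*<td class="short altrow">\s*<a href="[^"]*">(.*?)</a>~s';

preg_match_all($pattern, $html, $posts, PREG_SET_ORDER);

// Collect the captures into one array per thread (hypothetical key names).
$threads = [];
foreach ($posts as $post) {
    $threads[] = [
        'link'        => $post[1],
        'startdate'   => $post[2],
        'threadtitle' => $post[3],
        'description' => $post[4],
        'author'      => $post[5],
    ];
}

print_r($threads);
```

Against the real page you would swap the sample string for the file_get_contents() call and check that it did not return false. If the forum's markup changes, a pattern like this breaks quietly, so PHP's DOMDocument and DOMXPath classes may be a sturdier option than a regex for this kind of scraping.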