Explode Problem!

bncplix
Forum Newbie
Posts: 7
Joined: Sat Feb 07, 2009 10:14 am

Explode Problem!

Post by bncplix »

Ok, so I am making something that will rip a forum by parsing the thread IDs and the thread names.

So far it can pull the forum IDs and print them. I then use each forum ID to go back and parse the page again for its name. It only gets the first forum name and then stops.

Here is the script:

<?php
// Fetch the board index page.
$Site = file_get_contents('http://forum.tibia.com/forum/?subtopic=worldboards');
$boardArray[0] = "";
for ($i = 1; $i <= 74; $i += 1) {
    // Piece $i of this split starts with the ID of board number $i.
    $splitA = explode('boardid=', $Site);
    $splitB = explode('">', $splitA[$i]);
    $boardArray[$i] = $splitB[0];
    // Split again on the full link prefix to reach the board name.
    $splitC = explode('boardid='.$splitB[0].'">', $Site);
    $splitD = explode('</a><br>', $splitC[$i]);
    echo '<p> '.$i.'. '.$splitB[0].' - '.$splitD[0];
}
?>

Does anybody know why it only grabs the first forum name and stops? It works fine getting the IDs. Also, if I just do
echo 'boardid='.$boardArray[$i].'">'; it prints exactly the delimiter it should split on, so the mistake isn't there.
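
A likely cause, judging only from the script above: the delimiter 'boardid='.$splitB[0].'">' occurs once per board, so $splitC has just two elements and $splitC[$i] only exists when $i is 1. Reading $splitC[1] every time should fix that. The second split is not really needed, though, because each piece of $splitA already holds the name right after the ID. The sketch below shows this on a made-up fragment; $sampleHtml and its board names are assumptions, not the real page markup.

```php
<?php
// Hypothetical stand-in for the fetched page; the real markup may differ.
$sampleHtml = '<a href="?action=board&boardid=101">Alpha</a><br>'
            . '<a href="?action=board&boardid=102">Beta</a><br>'
            . '<a href="?action=board&boardid=103">Gamma</a><br>';

// Split once, outside the loop: element 0 is the text before the first link,
// elements 1..n each start with a board ID.
$chunks = explode('boardid=', $sampleHtml);
$boards = [];

for ($i = 1; $i < count($chunks); $i++) {
    // Everything before '">' is the ID; the rest begins with the link text.
    list($id, $rest) = explode('">', $chunks[$i], 2);
    // The name ends at the closing tag that follows it.
    $name = explode('</a><br>', $rest)[0];
    $boards[$id] = $name;
    echo $i . '. ' . $id . ' - ' . $name . "\n";
}
```

Looping up to count($chunks) instead of a fixed 74 also avoids undefined-index notices if the number of boards changes.
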
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Re: Explode Problem!

Post by John Cartwright »

Post some sample HTML of what you are trying to parse. This can be accomplished much more easily with a regular expression.

Maybe something like

$html = file_get_contents('http://forum.tibia.com/forum/?subtopic=worldboards');
 
// The ID follows "boardid=" directly and runs up to the closing quote of the href.
preg_match_all('#boardid=([^"]+)#', $html, $matches);

echo '<pre>';
print_r($matches);

I would also suggest you make sure you have permission to scrape their forum before doing so.
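
As a rough sketch of how the regular expression route might pick up both IDs and names in one pass: this assumes links shaped like boardid=123">Name</a>, which is inferred from the explode delimiters in the original script and not confirmed against the real page. $sampleHtml, $pattern and $boardNames are made-up names for illustration.

```php
<?php
// Hypothetical sample markup; the real page may be shaped differently.
$sampleHtml = '<a href="?action=board&boardid=101">Alpha</a><br>'
            . '<a href="?action=board&boardid=102">Beta</a><br>';

// (\d+) captures the numeric ID, ([^<]*) captures the link text up to '</a>'.
$pattern = '#boardid=(\d+)">([^<]*)</a>#';

// PREG_SET_ORDER groups the captures per match: [full match, id, name].
$count = preg_match_all($pattern, $sampleHtml, $matches, PREG_SET_ORDER);

$boardNames = [];
foreach ($matches as $m) {
    $boardNames[(int) $m[1]] = $m[2];
    echo $m[1] . ' - ' . $m[2] . "\n";
}
```

With PREG_SET_ORDER each element of $matches is one link, which tends to be easier to loop over than the default layout where all IDs sit in $matches[1] and all names in $matches[2].
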
bncplix
Forum Newbie
Posts: 7
Joined: Sat Feb 07, 2009 10:14 am

Re: Explode Problem!

Post by bncplix »

John Cartwright wrote: I would also suggest you make sure you have permission to scrape their forum before doing so.

Yep I do :)

And thanks, I looked at regular expressions before but they looked a bit confusing :/

Just one more thing I am having a problem with:
I am now trying to get the thread IDs and the thread names from the actual sections. Take this one for example:
http://forum.tibia.com/forum/?action=board&boardid=8527

But some of the threads have links that say Page 1, Page 2, etc., and it keeps printing out that part.

I am trying to make it print just the ID, but it also includes this stuff:
2446049&pagenumber=1

So I used strpos to check if the & was there, but it didn't seem to work (sometimes it said it was there when it wasn't, sometimes it said it wasn't there when it was).

Do you know how I can split the topics up easily?

Thanks, I'm new to this :o
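
One guess at the strpos trouble: strpos returns the position of the match, which can be 0, and false when nothing is found, so a loose check such as if (strpos(...)) or == false can misreport. Comparing with !== false avoids that. For the IDs themselves, a pattern that captures only digits stops before &pagenumber=1, and array_unique drops the repeats that come from the page links. The sketch below assumes the links contain threadid=<digits>; that parameter name and the sample markup are guesses, since the thread does not show the real HTML. It only covers the IDs, not the thread names.

```php
<?php
// Hypothetical markup: the "threadid" parameter name and link layout are
// assumptions, not taken from the real page.
$sampleHtml = '<a href="?action=thread&threadid=2446049">First topic</a> '
            . '<a href="?action=thread&threadid=2446049&pagenumber=1">Page 1</a> '
            . '<a href="?action=thread&threadid=2446049&pagenumber=2">Page 2</a> '
            . '<a href="?action=thread&threadid=2450001">Second topic</a>';

// strpos() returns an int position (possibly 0) or false, so compare strictly.
$raw = '2446049&pagenumber=1';
$hasAmpersand = strpos($raw, '&') !== false;

// strtok() returns the text before the first '&', or the whole string
// when there is no '&' at all.
$idOnly = strtok($raw, '&');

// (\d+) stops at the first non-digit, so '&pagenumber=1' is never captured.
preg_match_all('#threadid=(\d+)#', $sampleHtml, $matches);

// Page links repeat the same thread ID; keep each ID once, reindexed from 0.
$threadIds = array_values(array_unique($matches[1]));

print_r($threadIds);
```
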