Parsing an HTML site and using some content?
Hello, I've been looking for a sample for a couple of hours now (not to mention a few months ago as well) but never found what I needed. I always find more advanced things but can't find the basics, and I'm not too advanced in PHP yet, so hopefully you guys can help.
I'm trying to parse a server status page to tell if the server is up or down. I need something that gets the HTML code, finds some string in it, and uses that line and a few lines after it, or if possible selects the <tr> range. I've done similar things in batch scripts for HTML, so that's where the idea comes from. Maybe in PHP it's much different; if so, give me an idea at least.
Thank you, all ideas/suggestions welcome.
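Code: Select all
$doc = file_get_contents('http://foo.org');
// sscanf() matches from the start of the string, so skip ahead to "Status:" first
$pos = strpos($doc, 'Status:');
if ($pos !== false && sscanf(substr($doc, $pos), 'Status: %s', $status) === 1) {
    echo $status;
}
If the document was:
Code: Select all
<html>
<table>
<td> blah foo bar Status: online blah blah fooo
</html>
the above code will find the status, and echo $status will output "online".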
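A regular expression is another way to pull out the status word without caring where it sits in the page. This is only a sketch: the helper name extractStatus() and the inlined sample HTML are made up for illustration, and a real status page would need its own pattern.

```php
<?php
// Hypothetical helper: returns the word after "Status:" or null if absent.
function extractStatus(string $html): ?string
{
    // \s* allows any whitespace after the colon; (\w+) captures one word.
    if (preg_match('/Status:\s*(\w+)/i', $html, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Sample document from the post, inlined so no network access is needed.
$doc = "<html>\n<table>\n<td> blah foo bar Status: online blah blah fooo\n</html>";

$status = extractStatus($doc);
echo $status === null ? "status not found\n" : "status: $status\n";
```

Returning null when nothing matches lets the caller tell "page has no status" apart from an empty string.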
- John Cartwright
- Site Admin
- Posts: 11470
- Joined: Tue Dec 23, 2003 2:10 am
- Location: Toronto
- Contact:
I am looking to parse http://lobby.soldat.pl:13073/index.html
and get all the details from the table row that contains 66.17.183.250:65000 (number of players and everything like that). It's also just something I want to learn some PHP with, so if you have time to parse one thing in that format, it would be a great start for me. I've never really parsed anything with PHP so far.
Now, if you remove all the \r, \n and other whitespace (\s+), the matching should go quite easily.
to help you find the correct regular expression, you can use:
http://www.samuelfullman.com/team/php/t ... ster_p.php
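The whitespace-stripping step could be done with preg_replace(). A minimal sketch, assuming the page is already in a string; the helper name stripWhitespace() and the sample row are invented for illustration.

```php
<?php
// Hypothetical helper: remove every run of whitespace (\s covers spaces,
// tabs, \r and \n) so the markup becomes one continuous line.
function stripWhitespace(string $html): string
{
    return preg_replace('/\s+/', '', $html);
}

// Invented sample row with Windows line endings and indentation.
$html = "<tr>\r\n  <td width=\"8%\">CTF</td>\r\n  <td>0/12</td>\r\n</tr>";
echo stripWhitespace($html), "\n";
```

Note that spaces inside tags and inside text disappear too (<td width=...> becomes <tdwidth=...>), so any pattern run against the result has to be written without spaces as well.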
Sorry, but one more question. By the way, that expression tester is good, but I have
/<a href=\"soldat:\/\/66.17.183.250:65000\/\"><font color=\"#79E958\"><b>\|Optik's Server\|<\/b><\/font><\/a><\/td>/i
and I want to match \|Optik's Server\| without writing in the name, so it could be dynamic. I've been trying to find a specifier list or something I could use, and also tried [a-z''\|] but no luck. I'm kind of lost; is there a list somewhere of what I could be using?
Also, in your previous post you said to remove \r, \n and \s+. Would I be using preg_replace for that?
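One way to keep the name dynamic is a capture group with a negated character class: [^<]+ matches any run of characters up to the next "<", so the name may contain |, ' or spaces. A sketch only; the helper name serverName() is made up, and the address and colour are taken from the pattern in the post.

```php
<?php
// Hypothetical helper: find the link for one ip:port and return its label.
function serverName(string $html, string $address): ?string
{
    // preg_quote() escapes the dots in the address (and the "/" delimiter).
    $addr = preg_quote($address, '/');
    $pattern = '/<a href="soldat:\/\/' . $addr
             . '\/"><font color="#79E958"><b>([^<]+)<\/b>/i';
    // $m[1] holds whatever the parenthesised group matched.
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

// Link markup as quoted in the post.
$html = '<a href="soldat://66.17.183.250:65000/"><font color="#79E958">'
      . '<b>|Optik\'s Server|</b></font></a></td>';
echo serverName($html, '66.17.183.250:65000'), "\n";
```

The PCRE pattern syntax pages of the PHP manual list the character classes and quantifiers that can be used in place of [^<]+.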
rehfeld wrote:
I can't find "66.17.183.250:65000" anywhere on that page.
Is it only going to appear sometimes?
Could you pick the name of something that's actually on the page, and then give us examples of what you want to parse out of it?

That's in one of the links on the page, so it's only in the source code, not on the visible page. I need to look up the info by my server's ip:port and then get the details about it. That way it would also work if I run a few servers, and I could change the name etc. and it would still work. I'm kind of trying to make one for other users too, so it would be universal and you'd only need the ip:port of your server.
Untested (plus I don't know much about regex):
Code: Select all
/<a href="soldat:\/\/66.17.183.250:65000\/"><font color="#79E958"><b>([A-Za-z]+)<\/b><\/font><\/a><\/td>/i

Code: Select all
/<a href="soldat:\/\/66.17.183.250:65000\/"><font color="#79E958"><b>(.*?)<\/b><\/font><\/a><\/td>/i

Quote:
also for the before post you said remove \r \n \s+ would i be using preg_replace for that ?
I said that because it would allow you to match something like
Code: Select all
<tr><td>(.*?)</td><td>(.*?)</td>.....</tr>
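A row pattern of that shape can be tried on a small whitespace-stripped sample. This is a sketch with invented table data and only two cells per row; preg_match_all() with PREG_SET_ORDER returns one array per matched row.

```php
<?php
// Made-up, already whitespace-stripped table used only for illustration.
$html = '<table><tr><td>alpha</td><td>CTF</td></tr>'
      . '<tr><td>beta</td><td>DM</td></tr></table>';

// Each (.*?) is non-greedy, so it stops at the first following </td>.
$pattern = '/<tr><td>(.*?)<\/td><td>(.*?)<\/td><\/tr>/i';

// PREG_SET_ORDER groups the captures row by row.
$count = preg_match_all($pattern, $html, $rows, PREG_SET_ORDER);

foreach ($rows as $row) {
    // $row[0] is the whole match; $row[1] and $row[2] are the two cells.
    echo $row[1], ' => ', $row[2], "\n";
}
```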
Need some more help, hehe. I figured out how to get the string that contains all the info I need. I thought this would be easier than parsing everything separately: just get the table row I want and then parse that part (less CPU usage too, I guess). Anyway, I've got the string and now need to find out how to parse it, and I'm lost once more, so any help would be good.
Code: Select all
/<ahref="soldat:\/\/66.17.183.250:65000\/"><fontcolor="#79E958"><b>(.*?)<\/b><\/font><\/a><\/td><tdwidth="37\%">(.*?)<\/td><tdwidth="8\%">(.*?)<\/td><tdwidth="14\%">(.*?)<\/td><tdwidth="12\%">(.*?)\/(.*?)<\/td><tdwidth="8\%">(.*?)<\/td>/i
returns something like
Code: Select all
<ahref="soldat://66.17.183.250:65000/"><fontcolor="#79E958"><b>|Optik'sServer|</b></font></a></td><tdwidth="37%"></td><tdwidth="8%">CTF</td><tdwidth="14%">ctf_Dropdown</td><tdwidth="12%">0/12</td><tdwidth="8%">1.2.1*</td>
I'm wondering how I could take data out of it, like '|Optik'sServer|' or any other field. The way I tried before gets me the whole string, so when I echo it, it just prints the entire string, which isn't really good. I would like to parse out the exact data and then format it as wanted, but I don't really know how to specify where each piece should be, and I'm not sure how to use the result.
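The individual pieces are available through the third argument of preg_match(): index 0 is the whole match and indexes 1 and up are the parenthesised groups, left to right. A sketch based on the pattern and sample string from the post, with the dots in the address escaped; the array key names are guesses at what each column means.

```php
<?php
// The whitespace-stripped row from the post.
$row = '<ahref="soldat://66.17.183.250:65000/"><fontcolor="#79E958"><b>|Optik\'sServer|</b></font></a></td>'
     . '<tdwidth="37%"></td><tdwidth="8%">CTF</td><tdwidth="14%">ctf_Dropdown</td>'
     . '<tdwidth="12%">0/12</td><tdwidth="8%">1.2.1*</td>';

// Pattern from the post, split across lines for readability.
$pattern = '/<ahref="soldat:\/\/66\.17\.183\.250:65000\/"><fontcolor="#79E958"><b>(.*?)<\/b><\/font><\/a><\/td>'
         . '<tdwidth="37%">(.*?)<\/td><tdwidth="8%">(.*?)<\/td><tdwidth="14%">(.*?)<\/td>'
         . '<tdwidth="12%">(.*?)\/(.*?)<\/td><tdwidth="8%">(.*?)<\/td>/i';

$server = null;
if (preg_match($pattern, $row, $m) === 1) {
    // $m[0] is the full match; $m[1]..$m[7] are the groups, left to right.
    // The key names below are assumptions about each column's meaning.
    $server = [
        'name'       => $m[1],
        'info'       => $m[2],
        'mode'       => $m[3],
        'map'        => $m[4],
        'players'    => (int) $m[5],
        'maxPlayers' => (int) $m[6],
        'version'    => $m[7],
    ];
    printf("%s: %d/%d on %s\n", $server['name'], $server['players'], $server['maxPlayers'], $server['map']);
}
```

Once the fields are in an array, each one can be echoed or formatted on its own instead of printing the whole matched string.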