I would like to do the following:
1. Have a form that allows the user to specify a URL to process (this part I can do already)
2. Go get the specified URL, scan for certain tags (in this case, SPAN tags with specific IDs) that are located within a whole pile of other HTML
3. Take the content from inside those tags and output a pipe-separated list.
I know which IDs I am looking for on the target page, so I would think I can specify them within the script. Each ID appears only once, so there would be no multiple matches.
An example of the data:
<OTHERHTML>....
<SPAN ID="Country">Guyana</SPAN>
<OTHERHTML>....
<SPAN ID="FlagName">Soaring Cross</SPAN>
<OTHERHTML>....
<SPAN ID="FlagDate">Incorporated in June 1755</SPAN>
<OTHERHTML>....
An example of what I would like to output:
Guyana|Soaring Cross|Incorporated in June 1755
I would imagine I can use fopen to get the URL and then ereg to match the IDs I am looking for, but I am unsure how to get the text inside the SPAN tags into a string (or strings) so that I can print it to the screen.
Thanks,
acalder
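For what it's worth, steps 2 and 3 above might be sketched roughly as follows. This assumes PHP 7 or later and that each span is written as `<span id="...">` with no other tags nested inside it. The function name `extract_spans` and the inline sample HTML are made up for illustration. A regular expression is used because the post asks about one, though it is brittle against messy real-world HTML. The ereg functions were deprecated in PHP 5.3 and removed in PHP 7.0, so `preg_match` stands in for ereg here.

```php
<?php
// Hypothetical helper: return the text inside <span id="..."> for each
// known ID, joined with pipes in the order the IDs are given.
function extract_spans(string $html, array $ids): string
{
    $values = [];
    foreach ($ids as $id) {
        // i = case-insensitive, s = dot also matches newlines.
        // The lazy (.*?) stops at the first closing </span>.
        $pattern = '/<span\s+id="' . preg_quote($id, '/') . '"[^>]*>(.*?)<\/span>/is';
        if (preg_match($pattern, $html, $matches)) {
            $values[] = trim($matches[1]);
        } else {
            $values[] = ''; // keep the column position when an ID is missing
        }
    }
    return implode('|', $values);
}

// Sample data from the post, standing in for a fetched page.
$html = '<p>other html</p>
<SPAN ID="Country">Guyana</SPAN>
<p>other html</p>
<SPAN ID="FlagName">Soaring Cross</SPAN>
<p>other html</p>
<SPAN ID="FlagDate">Incorporated in June 1755</SPAN>';

echo extract_spans($html, ['Country', 'FlagName', 'FlagDate']), "\n";
// prints: Guyana|Soaring Cross|Incorporated in June 1755
```

For a live page, `$html` could come from `file_get_contents($url)` instead of the sample string. That call needs `allow_url_fopen` enabled and returns `false` on failure, so the result should be checked before matching.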
Parse HTML - ereg?
Okay, so how do you actually DO that?
Hey there,
Thanks for the reply, but I guess I should have been more clear:
How do I actually go about doing that? I have no idea how to acquire the file, how to iterate through the HTML that I retrieve, or how to dump everything into a string in order to print it to the screen.
I am a newbie at PHP, so when I look at this example:
preg_match('/string/', $variable);
It makes me think that for every span it encounters, $variable will be replaced...no?
acalder
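On the question of whether $variable gets replaced: it does not. `preg_match` only reads its second argument and stops at the first match. The matched text is written to an optional third argument instead. A small sketch, with a made-up two-span sample string and the variable names `$before`, `$found`, and `$matches` chosen just for illustration:

```php
<?php
$html = '<SPAN ID="Country">Guyana</SPAN><SPAN ID="FlagName">Soaring Cross</SPAN>';
$before = $html; // copy kept only to show the subject is left untouched

// The subject ($html) is read, never modified; results go into $matches.
$found = preg_match('/<span\s+id="FlagName"[^>]*>(.*?)<\/span>/is', $html, $matches);

// $found is 1 on a match and 0 when nothing matches.
// $matches[0] holds the whole matched tag, $matches[1] the first (...) group.
echo $found, "\n";                                        // prints: 1
echo $matches[1], "\n";                                   // prints: Soaring Cross
echo $html === $before ? 'unchanged' : 'changed', "\n";   // prints: unchanged
```

Because each ID appears only once on the page, one `preg_match` call per ID should be enough. `preg_match_all` would only be needed if the same pattern had to match several times.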