
Parse HTML - ereg?

Posted: Tue Jun 17, 2003 6:04 pm
by acalder
I would like to do the following:

1. Have a form that allows the user to specify a URL to process (this part I can do already)
2. Go get the specified URL, scan for certain tags (in this case, SPAN tags with specific IDs) that are located within a whole pile of other HTML
3. Take the content from inside those tags and output a pipe-separated list.

I know which IDs I am looking for on the target page, so I should be able to hard-code them in the script. Each ID appears only once, so there would be no multiple matches.

An example of the data:

<OTHERHTML>....
<SPAN ID="Country">Guyana</SPAN>
<OTHERHTML>....
<SPAN ID="FlagName">Soaring Cross</SPAN>
<OTHERHTML>....
<SPAN ID="FlagDate">Incorporated in June 1755</SPAN>
<OTHERHTML>....

An example of what I would like to output:

Guyana|Soaring Cross|Incorporated in June 1755

I would imagine I can use fopen to get the URL, and then ereg to match the IDs I am looking for, but I am unsure how to get the information inside the SPAN tags into a string (or strings) so that I can print them to the screen.

Thanks,
acalder

Posted: Tue Jun 17, 2003 7:29 pm
by m3rajk
If you know Perl regular expressions, which I'm a bit more familiar with, you can use them and forgo the PHP ones: preg_match('/string/', $variable);
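
To make that one-liner a bit more concrete, here is a minimal sketch of pulling the text out of a single SPAN with preg_match. The sample string is copied from the first post. The pattern and the variable names ($pattern, $matches, $country) are only illustrative assumptions, not something taken from the target page.

```php
<?php
// Sample input copied from the first post; in practice this would be the fetched page.
$html = '<SPAN ID="Country">Guyana</SPAN>';

// The parentheses form a capture group. The i modifier ignores case,
// and the s modifier lets . match newlines inside the span.
$pattern = '/<span\s+id="Country"[^>]*>(.*?)<\/span>/is';

$country = null;
if (preg_match($pattern, $html, $matches)) {
    // $matches[0] holds the whole match, $matches[1] the first capture group.
    $country = $matches[1];
    echo $country, "\n";
}
```

The captured text lands in the third argument, so the string being searched is left untouched.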

On that note, does anyone know whether you have to put a \ in front of - or @ in a preg_match pattern?
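
For what it's worth, a quick sketch of how this seems to behave in PCRE: @ is not a metacharacter, and - only has a special meaning inside a character class, where it denotes a range. The test strings below are made up for illustration. If @ were chosen as the pattern delimiter instead of /, it would need escaping inside the pattern.

```php
<?php
// Outside a character class, neither - nor @ needs a backslash.
$plain = preg_match('/user-name@example/', 'user-name@example.com');

// Inside a character class, - means "range" unless it comes first,
// comes last, or is escaped as done here.
$class = preg_match('/^[a-z\-@]+$/', 'user-name@example');

// When in doubt, preg_quote() escapes whatever needs escaping in a literal string.
$needle = 'a-b@c.d';
$found = preg_match('/' . preg_quote($needle, '/') . '/', 'xx a-b@c.d yy');

var_dump($plain, $class, $found);
```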

Posted: Tue Jun 17, 2003 7:48 pm
by patrikG
\

Okay, so how do you actually DO that?

Posted: Wed Jun 18, 2003 1:45 am
by acalder
Hey there,

Thanks for the reply, but I guess I should have been more clear:

How do I actually go about doing that? I have no idea how to acquire the file, how to iterate through the HTML that I retrieve, or how I dump everything into a string in order to throw it to the screen.

I am a newbie at PHP, so when I look at this example:

preg_match('/string/', $variable);

It makes me think that for every span it encounters, $variable will be replaced...no?

acalder
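
One possible way to put the three steps together, offered as a rough sketch rather than a finished script. The helper name extractSpans() is hypothetical, the sample HTML is adapted from the first post, and the regex assumes the SPAN tags look like the examples (double-quoted IDs, no nested SPANs). On the question above: preg_match() only reads the string it is given; whatever it finds goes into a separate array passed as the third argument, so nothing gets overwritten between spans.

```php
<?php
// Hypothetical helper: returns the inner text of each <span id="...">,
// in the same order as the IDs passed in.
function extractSpans($html, $ids)
{
    $values = array();
    foreach ($ids as $id) {
        // preg_quote() guards against regex metacharacters in the ID.
        $pattern = '/<span\s[^>]*\bid="' . preg_quote($id, '/') . '"[^>]*>(.*?)<\/span>/is';
        // preg_match() leaves $html alone; the captured text ends up in $m[1].
        if (preg_match($pattern, $html, $m)) {
            $values[] = trim($m[1]);
        } else {
            $values[] = '';   // keep the column position when an ID is missing
        }
    }
    return $values;
}

// In a real script the page would come from the submitted URL, for example:
// $html = file_get_contents($url);   // assumes allow_url_fopen is enabled
$html = <<<HTML
<SPAN ID="Country">Guyana</SPAN>
<p>other markup</p>
<SPAN ID="FlagName">Soaring Cross</SPAN>
<SPAN ID="FlagDate">Incorporated in June 1755</SPAN>
HTML;

$line = implode('|', extractSpans($html, array('Country', 'FlagName', 'FlagDate')));
echo $line, "\n";   // prints Guyana|Soaring Cross|Incorporated in June 1755
```

Looping over a known list of IDs keeps the output columns in a fixed order, which is why a missing ID yields an empty field instead of being skipped.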