Parse HTML - ereg?

acalder
Forum Newbie
Posts: 2
Joined: Tue Jun 17, 2003 6:04 pm

Parse HTML - ereg?

Post by acalder »

I would like to do the following:

1. Have a form that allows the user to specify a URL to process (this part I can do already)
2. Go get the specified URL, scan for certain tags (in this case, SPAN tags with specific IDs) that are located within a whole pile of other HTML
3. Take the content from inside those tags and output a pipe-separated list.

I know which IDs I am looking for on the target page, so I assume I can hard-code them in the script. Each ID appears only once, so there would be no multiple matches.

An example of the data:

<OTHERHTML>....
<SPAN ID="Country">Guyana</SPAN>
<OTHERHTML>....
<SPAN ID="FlagName">Soaring Cross</SPAN>
<OTHERHTML>....
<SPAN ID="FlagDate">Incorporated in June 1755</SPAN>
<OTHERHTML>....

An example of what I would like to output:

Guyana|Soaring Cross|Incorporated in June 1755

I would imagine I can use fopen to get the URL, and then ereg to match the IDs I am looking for, but I am unsure how to get the information inside the SPAN tags into a string (or strings) so that I can print them to the screen.
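The three steps above could be sketched roughly as follows. This is only one possible approach, not a definitive one: it uses preg_match rather than ereg, the helper name extract_span_fields is made up for illustration, and the sample HTML stands in for the fetched page. In a real script the HTML might be read with file_get_contents($url) instead of fopen, assuming allow_url_fopen is enabled on the server.

```php
<?php
// Hypothetical helper (not a built-in): returns the inner text of each SPAN
// whose ID is listed, joined with "|". An ID that is not found yields an
// empty field.
function extract_span_fields(string $html, array $ids): string
{
    $values = [];
    foreach ($ids as $id) {
        // Match <SPAN ... ID="..."> ... </SPAN>, case-insensitively (i flag),
        // with "." allowed to cross newlines (s flag).
        // preg_quote() escapes any regex metacharacters in the ID.
        $pattern = '/<span\b[^>]*\bid\s*=\s*"' . preg_quote($id, '/') . '"[^>]*>(.*?)<\/span>/is';
        $values[] = preg_match($pattern, $html, $m) ? trim($m[1]) : '';
    }
    return implode('|', $values);
}

// Sample data standing in for the page; in the real script this might be
// $html = file_get_contents($url);  (assumes allow_url_fopen is on)
$html = <<<HTML
<P>other html</P>
<SPAN ID="Country">Guyana</SPAN>
<P>other html</P>
<SPAN ID="FlagName">Soaring Cross</SPAN>
<SPAN ID="FlagDate">Incorporated in June 1755</SPAN>
HTML;

echo extract_span_fields($html, ['Country', 'FlagName', 'FlagDate']), "\n";
// prints: Guyana|Soaring Cross|Incorporated in June 1755
```

A regular expression is adequate here only because each ID appears once and the SPANs are not nested; for messier markup a real HTML parser would probably be the safer choice.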

Thanks,
acalder
m3rajk
DevNet Resident
Posts: 1191
Joined: Mon Jun 02, 2003 3:37 pm

Post by m3rajk »

if you know the perl regular expressions, something i'm a bit more familiar with, you can use them and forgo the php ones... preg_match('/string/', $variable);

on that note, anyone know if you have to put a \ in front of - or @ in a preg match?
patrikG
DevNet Master
Posts: 4235
Joined: Thu Aug 15, 2002 5:53 am
Location: Sussex, UK

Post by patrikG »

\
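To expand on that a little: as far as I know, in PCRE a backslash in front of a non-alphanumeric character is always harmless, so escaping never hurts. It only seems to be required for - inside a character class, where an unescaped hyphen between two characters means a range; @ has no special meaning. A small sketch, with made-up sample strings:

```php
<?php
// Outside a character class, "-" and "@" are literal with or without a backslash.
var_dump(preg_match('/user-name@example/', 'user-name@example.com'));   // int(1)
var_dump(preg_match('/user\-name\@example/', 'user-name@example.com')); // int(1)

// Inside [...], an unescaped "-" between two characters defines a range:
// [a-c] means a, b or c, so it does not match a literal hyphen.
var_dump(preg_match('/^[a-c]$/', '-'));  // int(0)

// Escaping it (or placing it first or last in the class) makes it literal.
var_dump(preg_match('/^[a\-c]$/', '-')); // int(1)
```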
acalder
Forum Newbie
Posts: 2
Joined: Tue Jun 17, 2003 6:04 pm

Okay, so how do you actually DO that?

Post by acalder »

Hey there,

Thanks for the reply, but I guess I should have been more clear:

How do I actually go about doing that? I have no idea how to acquire the file, how to iterate through the HTML that I retrieve, or how I dump everything into a string in order to throw it to the screen.

I am a newbie at php, so when I look at this example:

preg_match('/string/', $variable);

It makes me think that for every span it encounters, $variable will be replaced...no?
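For what it is worth, a quick sketch suggests the answer is no: $variable is only the text being searched and is not modified. The matched text lands in an optional third argument instead. preg_match stops at the first match, while preg_match_all collects all of them. The pattern and sample string below are illustrative assumptions, not taken from the real page:

```php
<?php
$variable = '<SPAN ID="Country">Guyana</SPAN> <SPAN ID="FlagName">Soaring Cross</SPAN>';
$before = $variable;

// preg_match() stops at the first match; captured text goes into $matches.
preg_match('/<span[^>]*>(.*?)<\/span>/i', $variable, $matches);
echo $matches[1], "\n"; // Guyana

// preg_match_all() collects every match; $all[1] holds the first capture group
// of each match.
preg_match_all('/<span[^>]*>(.*?)<\/span>/i', $variable, $all);
echo implode('|', $all[1]), "\n"; // Guyana|Soaring Cross

var_dump($variable === $before); // bool(true): the subject string is untouched
```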

acalder