Extracting Info from HTML and storing in variables

Posted: Wed Aug 31, 2005 3:49 pm
by jayshields
Hi guys,

I have taken on a job of adding some features to an already existing website. The website in question shows content pages by content.php?pid=xxx (with xxx being the page id number). This then uses GET to pull all the page info, such as title, content, footer, etc, from a MySQL Database, using the pid to select the correct row in the table.

My problem lies here:

The owner adds car details in the content field by writing out the HTML for each entry by hand, which he has to paste from a local file. This is too time-consuming, so he has asked me to make a page where he can add cars using an input box for each detail, so he doesn't have to mess with the HTML anymore. He also edits and removes cars, which is where my main problem lies.

So, to explain better, each content field for each page (row in the SQL table) now consists of something like this:

Code:

'<b><u>Fiat Seicento SX</b></u><br>
<a href="http://www.yourdomain.co.uk/images/cars/IMAG0028.jpg"><img src="http://www.yourdomain.co.uk/images/cars/IMAG0028.jpg" width="300 height="300"></a><br>
<sup>Click to enlarge</sup><br>
<b>Year:</b> N/A<br>
<b>Reg:</b> S<br>
<b>Details:</b><br>
899cc<br>
3 door<br>
27k miles<br>
Sunroof<br>
Metallic paintwork<br>
<br>
<b>Price: £1,995</b><br>
<br>
<br>
<b><u>Fiat Punto S 1.2</b></u><br>
<a href="http://www.yourdomain.co.uk/images/cars/24IMAG0006.jpg"><img src="http://www.yourdomain.co.uk/images/cars/24IMAG0006.jpg" width="300" height="225" border="0"></a><br>
<sup>Click to enlarge</sup><br>
<b>Year:</b> N/A<br>
<b>Reg:</b> R<br>
<b>Details:</b><br>
5 Door<br>
CD Player<br>
Sunroof<br>
<br>
<b>Price: £1,495</b>
So, adding a car's details to this wouldn't be so hard, as I could build a big string, adding the HTML as I go, but viewing/editing the details would be a problem.

I would need it to look at a full content field (which would look like the above), and then show me each detail in separate boxes (and separate variables).

So I'm thinking I would need some sort of code to select everything between the <'s and the >'s, plus the brackets themselves, and delete it, and as it deletes each tag, add what's before it to a variable.

I hope you understand what I mean and have some ideas for me.

Thanks a lot.

Posted: Wed Aug 31, 2005 4:21 pm
by feyd
If possible, I'd change the table structure (after extracting all existing pages) into a new table structure where each part is a separate field in the table. Then you build a rendering script that combines the HTML with the various fields returned from the database just before sending the page to the user.
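
As a rough illustration of that rendering step, here is a minimal sketch. It assumes one row per car with hypothetical fields `title`, `image`, `year`, `reg`, `details` (a list of lines) and `price`; none of these names come from the existing site, and the markup simply mirrors the sample entry posted above.

```php
<?php
// Sketch only: the field names below are hypothetical, not the site's real schema.
function render_car(array $car): string
{
    // Escape every value so stray quotes or angle brackets cannot break the page.
    $e = function ($s) {
        return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
    };

    $html  = '<b><u>' . $e($car['title']) . '</u></b><br>' . "\n";
    $html .= '<a href="' . $e($car['image']) . '"><img src="' . $e($car['image'])
           . '" width="300" border="0"></a><br>' . "\n";
    $html .= '<sup>Click to enlarge</sup><br>' . "\n";
    $html .= '<b>Year:</b> ' . $e($car['year']) . '<br>' . "\n";
    $html .= '<b>Reg:</b> ' . $e($car['reg']) . '<br>' . "\n";
    $html .= '<b>Details:</b><br>' . "\n";
    foreach ($car['details'] as $line) {
        $html .= $e($line) . '<br>' . "\n";
    }
    $html .= '<br>' . "\n" . '<b>Price: &pound;' . $e($car['price']) . '</b><br>' . "\n";

    return $html;
}

// Example row, as it might come back from the new table.
$car = [
    'title'   => 'Fiat Seicento SX',
    'image'   => 'http://www.yourdomain.co.uk/images/cars/IMAG0028.jpg',
    'year'    => 'N/A',
    'reg'     => 'S',
    'details' => ['899cc', '3 door', '27k miles'],
    'price'   => '1,995',
];
$rendered = render_car($car);
```

With the details stored as plain values, editing a car becomes an ordinary UPDATE on one row and the HTML never needs to be parsed again.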

Posted: Wed Aug 31, 2005 4:29 pm
by s.dot
If every one of the records in the database had the same layout, I'd store the whole thing in one variable:

Code:

$string = $database['record'];

// Replace <br>s with commas
$newstring = str_replace("<br>",",",$string);

// Strip the HTML tags
$nohtml = strip_tags($newstring);

// Explode at the comma
$final = explode(",",$nohtml);
Then if you do a print_r on $final, you'll see that all of the pieces of the string are in an array, ready for populating a form, perhaps.

Edit: I just realised this could pose a problem if there were commas in the original data, which isn't unlikely. Just replace the <br> with something less common, like a backslash (written "\\" in a PHP string), then explode at that.
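
Another way around the delimiter question, offered as a sketch rather than the method described above, is to split on the <br> tags themselves with preg_split() and only then strip the remaining tags from each piece. The helper name `car_lines` is made up for this example, and the sample string is a shortened copy of the entry from the first post.

```php
<?php
// Split one car's HTML into trimmed, non-empty lines of plain text.
function car_lines(string $html): array
{
    // Match <br>, <BR>, <br/> and <br />, so no placeholder character is needed.
    $parts = preg_split('/<br\s*\/?>/i', $html);

    $lines = [];
    foreach ($parts as $part) {
        // Remove the remaining tags, then the newline left over after each <br>.
        $text = trim(strip_tags($part));
        if ($text !== '') {
            $lines[] = $text;
        }
    }
    return $lines;
}

$html = "<b><u>Fiat Punto S 1.2</b></u><br>\n"
      . "<b>Year:</b> N/A<br>\n"
      . "<b>Reg:</b> R<br>\n"
      . "<b>Details:</b><br>\n"
      . "5 Door<br>\n"
      . "CD Player<br>\n"
      . "<br>\n"
      . "<b>Price: 1,495</b>";

$lines = car_lines($html);
print_r($lines);
```

Because nothing is substituted into the text, commas, slashes (as in "N/A") and backslashes in the data are left alone.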

Posted: Wed Aug 31, 2005 5:11 pm
by jayshields
The first idea is one I had myself, but there are around 200 cars in the database at present, and the same table structure is used for other pages, such as contact info and location details. It would mean leaving the existing pages in the current table, making a new table just for cars, then making a new page just to show cars. I would also have to change all of the existing templating system.

I will have a play around with what you said though scrotaye, that is a nice idea: replacing the <br>'s, then stripping the tags, then making an array from it using the char you replaced <br> with.

I haven't used strip_tags before, so I will have to look into what it does exactly and how to use it. If it does the obvious and just removes every HTML tag, your solution seems perfect.

I am currently working on the add car bit, which shouldn't be too hard, but I will have to add an extra column to the table, called counter, holding the number of car entries on the page. When adding a new car, I add one to that number, and if the result is a multiple of 3, I append <!--pagebreak--> to the end of the new car's HTML.

After I've done the add car bit, I will do the edit and delete car bit, which is when I will get to test what you said.

I'll get back to you on my progress...

Thanks a lot guys.

Edit: thinking about it in more detail, it will be harder to edit/delete a car, because the user will need to select which car to edit/delete, so I would have to code something that searches the string for a certain point and then deletes from there to another certain point. ARGH, but then how would it know where to move the <!--pagebreak--> to? This is going to be impossible...
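
A minimal sketch of that counter idea, assuming the page's current HTML and counter value have already been read from the row. The function name `append_car` and its parameters are invented for this example.

```php
<?php
// Append one car's HTML to a page and keep the per-page counter in step.
// $count is the number of cars on the page before this one is added.
function append_car(string $content, string $carHtml, int $count): array
{
    $count++;
    $content .= $carHtml;

    // Every third car is followed by a page break marker.
    if ($count % 3 === 0) {
        $content .= '<!--pagebreak-->';
    }

    // Return both values so the caller can write them back to the row.
    return [$content, $count];
}

// Usage: add four placeholder cars to an empty page.
$content = '';
$count = 0;
foreach (['A', 'B', 'C', 'D'] as $carHtml) {
    list($content, $count) = append_car($content, $carHtml, $count);
}
```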
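
One possible way to deal with the moving pagebreaks, sketched under a big assumption: every car entry starts with `<b><u>` and that sequence appears nowhere else in the content. The idea is to throw all the markers away, split the content into one string per car, drop the chosen car, and then put a marker back after every third car. The function names are invented for this example.

```php
<?php
const PAGEBREAK = '<!--pagebreak-->';

// Split a content field into one string per car, discarding page break markers.
// Assumes each car begins with "<b><u>" (true for the sample, not guaranteed).
function split_cars(string $content): array
{
    $content = str_replace(PAGEBREAK, '', $content);
    // The lookahead splits just before each "<b><u>" without consuming it.
    return preg_split('/(?=<b><u>)/', $content, -1, PREG_SPLIT_NO_EMPTY);
}

// Glue the cars back together, adding a marker after every third car.
function join_cars(array $cars): string
{
    $out = '';
    foreach (array_values($cars) as $i => $car) {
        $out .= $car;
        if (($i + 1) % 3 === 0) {
            $out .= PAGEBREAK;
        }
    }
    return $out;
}

// Remove the car at position $index (0-based) and rebuild the page breaks.
function delete_car(string $content, int $index): string
{
    $cars = split_cars($content);
    unset($cars[$index]);
    return join_cars($cars);
}

// Usage: four placeholder cars with a break after the third; delete the second.
$content = '<b><u>A<br><b><u>B<br><b><u>C<br>' . PAGEBREAK . '<b><u>D<br>';
$updated = delete_car($content, 1);
```

The same split_cars() output could also fill a "pick a car to edit" list, since each array index identifies one car.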

Posted: Wed Aug 31, 2005 5:44 pm
by s.dot
Doesn't sound impossible :P

Editing/deleting a car won't be a problem. I would have a page like 'editrecord.php?record=123' that will pull all the information about the car from the database. 123 could be replaced with whatever car you want to edit the record for (surely there's some kind of ID or unique field in the database for each car)

For the editing part, populating your form with pieces of the array you get should be easy, which would make modifying the record fairly simple.

Also, deletion of that record should be just as easy. With a button or form that deletes the record from the database.... Something along the lines of this...

Code:

<?php
// Page url is http://www.domain.com/editrecord.php?record=123

// Get all the info from the database for specified record (assuming mysql)

// Cast the id to an integer so raw user input never ends up inside the query
$result = mysql_query("SELECT * FROM table WHERE id = " . (int) $_GET['record']);
if(mysql_num_rows($result) < 1)
{
   die("This record does not exist.");
}
$array = mysql_fetch_assoc($result);

// Clarify which part of that record is the HTML formatted field
$htmlformattedfield = $array['info'];

// Replace <br> with a character that does not occur in the data ("/" would split "N/A")
$nobr = str_replace("<br>","|",$htmlformattedfield);

// Strip HTML tags (feyd has written a good regex for this) although this may be okay for your example
$nohtml = strip_tags($nobr);

// Explode at the | so you have all of your pieces in an array, trimming the leftover newlines
$pieces = array_map('trim', explode("|",$nohtml));

// You now have all the lines of the entry in $pieces[0], $pieces[1] etc

?>

<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post">
Car Title: <input type="text" name="title" size="20" value="<?php echo htmlspecialchars($pieces[0]); ?>">
Car Year: <input type="text" name="year" size="20" value="<?php echo htmlspecialchars($pieces[1]); ?>">
etc......

Posted: Thu Sep 01, 2005 11:06 am
by jayshields
Yeah, I know that wouldn't be too hard, but there is a <!--pagebreak--> after every 3 cars, so when you delete one, all the pagebreaks after it will need to be moved.

It gets too complicated, I'm just gunna remake the database and do it properly.

Thanks for the help though dude.

Posted: Thu Sep 01, 2005 11:45 am
by s.dot
Well you could do that...

or..

Code:

$i=0;
foreach($cars AS $car)
{
  // Check to see if a pagebreak is needed
  if($i == 3)
  {
    // page break here
    $i=0;
  }

  // Display car information
  $i++;
}
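
The same "break before every fourth car" behaviour can also be sketched with array_chunk(), assuming `$cars` is an array holding one HTML string per car. The name `render_pages` is made up for this example.

```php
<?php
// Join cars into one string with a page break marker between groups of $perPage.
function render_pages(array $cars, int $perPage = 3): string
{
    $pages = [];
    foreach (array_chunk($cars, $perPage) as $chunk) {
        // Each chunk holds up to $perPage cars and becomes one page.
        $pages[] = implode('', $chunk);
    }
    // Markers go only between pages, so there is never a trailing break.
    return implode('<!--pagebreak-->', $pages);
}

$output = render_pages(['A', 'B', 'C', 'D']);
```

Since the breaks are worked out at display time, deleting a car never requires moving a stored marker.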

Posted: Thu Sep 01, 2005 4:11 pm
by jayshields
Thanks again, but I have already started redesigning the database now :)