Posted: Fri May 21, 2004 10:43 am
by redmonkey
Can you post the unmodified contents of $string? (perhaps within code tags to retain any formatting).

Posted: Fri May 21, 2004 10:46 am
by JayBird
This is what is contained in $string.

This is all 1 line

Code: Select all

ext="#000000" leftmargin="0" topmargin="0" rightmargin="0" bottommargin="0" marginwidth="0" marginheight="0"> <center> <?php print "it works"; ?> <table width="100%" border="0" cellspacing="0" cellpadding="0"> <tr valign="top"> <td rowspan="2" width="65"><img src="logo.gif" width="65" height="52"></td> <td align="center"> <table width="100%" border="0" cellspacing="0" cellpadding="0"> <tr> <td background="head_back.gif"><img src="TreeBlank.gif" width="45" height="45"></td> <td width="100%" align="center" valign="middle" background="head_back.gif" nowrap> <span class="ReportTitle">Report for ipd-Eurochem: </span> <span class="CategoryTitle">General Statistics</span> </td> <td width="112"><a href="http://www.weblogexpert.com/" target="_blank"><img src="powered.gif" width="112" height="45" border="0"></a></td> </tr> </table> </td> </tr> <tr> <td> <table width="100%" border="0" cellspacing="0" cellpadding="0" height="7"> <tr><td background="top_line.gif"></td></tr> </table> </td> </tr> </table> <table width="90%" border=0 cellpadding=1 cellspacing=1> <tr> <td valign="top" align="left">Time range: 13/05/2004 09:31:44 - 20/05/2004 21:03:29</td> <td valign="top" align="right">Generated on Wed Apr 21, 2004 - 10:38:15</td> </tr> </table> <table width="100%" border="0" cellspacing="0" cellpadding="0"> <tr><td height="7"></td></tr> <tr><td height="7" background="top_line.gif"></td></tr> </table> <br> <a name="Summary"></a> <table cellpadding="0" border="0" cellspacing="0" width="90%"> <tr> <td width="10"><img src="section_left.gif" width="10" height="20" border="0"></td> <td class="SectionTitle" nowrap>Summary</td> <td width="10"><img src="section_right.gif" width="10" height="20" border="0"></td> </tr> </table> <p></p> <span class="TableTitle">Summary</span><br> <table border=0 cellspacing=0 cellpadding=0 height=6><tr><td></td></tr></table> <table border=0 bgcolor="#000000" cellspacing=0 cellpadding=0 width="90%"> <tr> <td> <table border=0 cellspacing=1 cellpadding=2 
width="100%"> <tr class="TableSolidRow"> <td colspan="2" class="TableCell">Hits</td> </tr> <tr class="TableRow1"> <td width="100%" class="TableCell">Total Hits</td> <td width="0%" class="TableCell">3,304</td> </tr> <tr class="TableRow2"> <td class="TableCell">Average Hits per Day</td> <td class="TableCell">413</td> </tr> <tr class="TableRow1"> <td class="TableCell">Average Hits per Visitor</td> <td class="TableCell">34.42</td> </tr> <tr class="TableRow2"> <td class="TableCell">Cached Requests</td> <td class="TableCell">517</td> </tr> <tr class="TableRow1"> <td class="TableCell">Failed Requests</td> <td class="TableCell">0</td> </tr> <tr class="TableSolidRow"> <td colspan="2" class="TableCell">Page Views</td> </tr> <tr class="TableRow1"> <td class="TableCell">Total Page Views</td> <td class="TableCell">60</td> </tr> <tr class="TableRow2"> <td class="TableCell">Average Page Views per Day</td> <td class="TableCell">7</td> </tr> <tr class="TableRow1"> <td class="TableCell">Average Page Views per Visitor</td> <td class="TableCell">0.63</td> </tr> <tr class="TableSolidRow"> <td colspan="2" class="TableCell">Visitors</td> </tr> <tr class="TableRow1"> <td class="TableCell">Total Visitors</td> <td class="TableCell">96</td> </tr> <tr class="TableRow2"> <td class="TableCell">Average Visitors per Day</td> <td class="TableCell">12</td> </tr> <tr class="TableRow1"> <td class="TableCell">Total Unique IPs</td> <td class="TableCell">85</td> </tr> <tr class="TableSolidRow"> <td colspan="2" class="TableCell">Bandwidth</td> </tr> <tr class="TableRow1"> <td class="TableCell">Total Bandwidth</td> <td class="TableCell">5.56&nbsp;MB</td> </tr> <tr class="TableRow2"> <td class="TableCell">Average Bandwidth per Day</td> <td class="TableCell">711.23&nbsp;KB</td> </tr> <tr class="TableRow1"> <td class="TableCell">Average Bandwidth per Hit</td> <td class="TableCell">1.72&nbsp;KB</td> </tr> <tr class="TableRow2"> <td class="TableCell">Average Bandwidth per Visitor</td> <td 
class="TableCell">59.27&nbsp;KB</td> </tr> </table> </td> </tr> </table> <p> <br> <p>&nbsp</p> </center> </body> </html>
Mark

Posted: Fri May 21, 2004 10:53 am
by redmonkey
Bech100 wrote: This is all 1 line
Christ!

Try this...

Code: Select all

if (preg_match('/<td>\s+(<table b.*<\/table>)\s+<\/td>/is', $string, $matches)) {
    echo $matches[1] . "\x0a"; // print the captured <table>...</table> block
}

Posted: Fri May 21, 2004 10:54 am
by JayBird
Gimme a sec, I'll try that. You can answer this in the meantime.

What is the "\x0a" in this line for?

Code: Select all

echo $matches[1] . "\x0a";

Posted: Fri May 21, 2004 10:57 am
by redmonkey
Is there a reason why that string starts with 'ext="#000000"' and doesn't seem to contain the whole HTML file?

\x0a is actually just the same as \n (inside a double-quoted string both are a line feed). I just use it to output a new line after the HTML code in case you want to add anything else after it. It makes for better formatting of the source, so everything is not on one line.
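
To illustrate the \x0a point, here is a minimal sketch (the variable names are made up for the example). It assumes nothing beyond standard PHP string escapes: in double quotes, "\x0a" is the hex escape for byte 10, which is the same line feed "\n" gives you, while single quotes leave the escape uninterpreted.

```php
<?php
// In double quotes, "\x0a" (hex escape) and "\n" are the same single byte:
// a line feed, ASCII 10.
$hex     = "\x0a";
$newline = "\n";

var_dump($hex === $newline);  // bool(true)
var_dump(ord($hex));          // int(10)

// Single quotes do not interpret the escape, so this is four literal
// characters: backslash, x, 0, a.
$literal = '\x0a';
var_dump(strlen($literal));   // int(4)
```

So the echo only works as intended because the "\x0a" is in double quotes.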

Posted: Fri May 21, 2004 10:59 am
by JayBird
redmonkey wrote: Is there a reason why that string starts with 'ext="#000000"' and doesn't seem to contain the whole HTML file?
I think I just missed a bit whilst copying and pasting.

So, what changed to make the new preg_match pattern work?

Mark

Posted: Fri May 21, 2004 11:06 am
by redmonkey
Bech100 wrote: So, what changed to make the new preg_match pattern work?
I take it it works OK then?

The main difference is that the original source you posted seemed to span multiple lines, whereas this one is a single line. My original regex was looking for a specific piece of the string at the start of a line, so it would not find anything in the one-line version.

Also, the original source you had did not have any whitespace between the <td> and <table> tags, but your one-line source does, hence the \s+ in the new pattern.
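
A rough sketch of both differences, using made-up sample strings rather than the real report. The $anchored pattern is only a hypothetical stand-in for a line-anchored regex (it is not the one posted earlier), and $relaxed is an assumed variant using \s* in case the whitespace may or may not be there.

```php
<?php
// Hypothetical samples, not the real report file.
$multiLine = "<tr>\n<td>\n<table border=0>\n<tr><td>Hits</td></tr>\n</table>\n</td>\n</tr>";
$oneLine   = '<tr> <td> <table border=0><tr><td>Hits</td></tr></table> </td> </tr>';
$noSpaces  = '<tr><td><table border=0><tr><td>Hits</td></tr></table></td></tr>';

// Hypothetical line-anchored pattern: with the m modifier, ^ matches at the
// start of any line, so "<table b" must begin a line.
$anchored = '/^<table b.*<\/table>/ims';

// The pattern from this thread: needs whitespace between <td> and <table>.
$spaced = '/<td>\s+(<table b.*<\/table>)\s+<\/td>/is';

// Relaxed variant (assumption): \s* also accepts no whitespace at all.
$relaxed = '/<td>\s*(<table b.*<\/table>)\s*<\/td>/is';

var_dump(preg_match($anchored, $multiLine)); // int(1): <table starts a line
var_dump(preg_match($anchored, $oneLine));   // int(0): <table is mid-line
var_dump(preg_match($spaced, $oneLine));     // int(1)
var_dump(preg_match($spaced, $noSpaces));    // int(0): nothing for \s+ to match
var_dump(preg_match($relaxed, $noSpaces));   // int(1)
```

If your other files mix both layouts, something like the \s* version may save you maintaining two patterns.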

Posted: Fri May 21, 2004 11:08 am
by JayBird
Ah, I see.

Thanks a lot buddy.

I have got a load of other files I need parsing, so working from your example, hopefully I will start to learn regex in more detail.

You know what, I think you deserve another

Image

Mark

Posted: Fri May 21, 2004 11:24 am
by redmonkey
LOL two in one thread! Thanks.

For what it's worth, I do a lot of file analysis with regex, and I find it better to start with quite restrictive/specific regex patterns and then relax the pattern if/as needed. .* comes in handy, but many people overuse it and pull in all sorts of unwanted stuff. Using ^ and $ can also help a lot in narrowing down your search.
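
A small sketch of what I mean, on made-up data (the sample strings and variable names are hypothetical, loosely modelled on the report rows above). The first pair shows a loose .* pattern grabbing the wrong cell where a specific pattern gets the right one; the second pair shows ^ and $ (with the m modifier) pinning a match to a whole line.

```php
<?php
// Hypothetical one-line HTML fragment.
$html = '<td class="TableCell">Total Hits</td> <td class="TableCell">3,304</td> '
      . '<td class="TableCell">Total Visitors</td> <td class="TableCell">96</td>';

// Loose: the greedy .* runs as far as it can, so the capture is the LAST cell.
preg_match('/Total Hits.*>(.*)<\/td>/', $html, $loose);
var_dump($loose[1]);    // string(2) "96"

// Specific: spell out what sits between the label and the value.
preg_match('/Total Hits<\/td>\s*<td[^>]*>([\d,.]+)<\/td>/', $html, $strict);
var_dump($strict[1]);   // string(5) "3,304"

// Hypothetical plain-text lines.
$text = "Average Hits: 413\nHits: 3,304";

// Unanchored: the first "Hits: " found is inside "Average Hits: 413".
preg_match('/Hits: ([\d,]+)/', $text, $unanchored);
var_dump($unanchored[1]);   // string(3) "413"

// Anchored: with m, ^ and $ match at line boundaries, so only the line that
// is exactly "Hits: <number>" qualifies.
preg_match('/^Hits: ([\d,]+)$/m', $text, $anchoredMatch);
var_dump($anchoredMatch[1]); // string(5) "3,304"
```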

If you have any further questions/difficulties, feel free to ask (either in a thread or PM).

Posted: Fri May 21, 2004 11:26 am
by JayBird
redmonkey wrote:If you have any further questions/difficulties, feel free to ask (either in a thread or PM).
Cheerz, I may just take you up on that. I'll have a bash myself, but WHEN I get stuck, I'll holler at ya :)

Mark