A very hard one, but almost there!

Posted: Fri Nov 23, 2007 7:49 am
by Frexuz
feyd | Please use code tags and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read: [url=http://forums.devnetwork.net/viewtopic.php?t=21171]Posting Code in the Forums[/url] to learn how to do it too.


Here's the HTML code I want to grab values from (this block repeats several times on the page):

[syntax="html"]
      <tr height="30">
        <td colspan="3" class="tier1" align="center" style="border-right: none 1px #000;">Starting Out Small (<a class="tierlink" href="rankings.php?game=5&diff=4&song=-36&page=16&highlight=778">Rank: 778th</a>) </td>
        <td colspan="7" class="tier2" style="border-left: none 1px #000;">&nbsp;</td>
      </tr>
      <tr height="25">
        <td>
          <table cellspacing="0" cellpadding="0" align="center">
            <tr><td align="center"><a href="javascript:openWindow('view_scores.php?user=22553&song=1412')"><span style="font-size: 0.7em">View All</span></a></td></tr>
          </table>
        </td>
        <td align="center">3</td><td align="center">Slow Ride</td><td align="center"><a href="rankings.php?game=5&diff=4&song=1412&page=18&highlight=892" class="rank3">892nd</a></td>
        <td align="center">192,601</td>
        <td align="center"><img src="/images/rating_6.gif" /> (6.7)</td>
        <td align="center"><span class="percent2">99%</span></td>
        <td align="center">267</td>
        <td align="center">Nov. 18, 2007, 3:19PM</td>
        <td align="center"><span class="gray">N/A</span>
    </tr>
      <tr height="25">
        <td>
          <table cellspacing="0" cellpadding="0" align="center">
            <tr><td align="center"><a href="javascript:openWindow('view_scores.php?user=22553&song=1428')"><span style="font-size: 0.7em">View All</span></a></td></tr>
          </table>
        </td>
        <td align="center">3</td><td align="center">Talk Dirty to Me</td><td align="center"><a href="rankings.php?game=5&diff=4&song=1428&page=15&highlight=709" class="rank3">709th</a></td>
        <td align="center">298,732</td>
        <td align="center"><img src="/images/rating_6.gif" /> (6.7)</td>
        <td align="center"><span class="percent2">99%</span></td>
        <td align="center">358</td>
        <td align="center">Nov. 18, 2007, 3:32PM</td>
        <td align="center"><span class="gray">N/A</span>
    </tr>
      <tr height="25">
        <td>
          <table cellspacing="0" cellpadding="0" align="center">
            <tr><td align="center"><a href="javascript:openWindow('view_scores.php?user=22553&song=1444')"><span style="font-size: 0.7em">View All</span></a></td></tr>
          </table>
        </td>
        <td align="center">3</td><td align="center">Hit Me With Your Best Shot</td><td align="center"><a href="rankings.php?game=5&diff=4&song=1444&page=18&highlight=868" class="rank3">868th</a></td>
        <td align="center">166,643</td>
        <td align="center"><img src="/images/rating_6.gif" /> (6.6)</td>
        <td align="center"><span class="percent2">97%</span></td>
        <td align="center">181</td>
        <td align="center">Nov. 18, 2007, 3:39PM</td>
        <td align="center"><span class="gray">N/A</span>
    </tr>
      <tr height="25">
        <td>
          <table cellspacing="0" cellpadding="0" align="center">
            <tr><td align="center"><a href="javascript:openWindow('view_scores.php?user=22553&song=1460')"><span style="font-size: 0.7em">View All</span></a></td></tr>
          </table>
        </td>
        <td align="center">2</td><td align="center">Story of My Life</td><td align="center"><a href="rankings.php?game=5&diff=4&song=1460&page=19&highlight=914" class="rank3">914th</a></td>
        <td align="center">357,933</td>
        <td align="center"><img src="/images/rating_6.gif" /> (6.5)</td>
        <td align="center"><span class="percent2">99%</span></td>
        <td align="center">211</td>
        <td align="center">Nov. 10, 2007, 4:44PM</td>
        <td align="center"><span class="gray">N/A</span>
    </tr>
      <tr height="25">
        <td>
          <table cellspacing="0" cellpadding="0" align="center">
            <tr><td align="center"><a href="javascript:openWindow('view_scores.php?user=22553&song=1476')"><span style="font-size: 0.7em">View All</span></a></td></tr>
          </table>
        </td>
        <td align="center">2</td><td align="center">Rock and Roll All Nite</td><td align="center"><a href="rankings.php?game=5&diff=4&song=1476&page=17&highlight=815" class="rank3">815th</a></td>
        <td align="center">162,133</td>
        <td align="center"><img src="/images/rating_6.gif" /> (6.0)</td>
        <td align="center"><span class="percent2">94%</span></td>
        <td align="center">161</td>
        <td align="center">Nov. 8, 2007, 12:11PM</td>
        <td align="center"><span class="gray">N/A</span>
    </tr>
[/syntax]

Explained:
There are 5 similar chunks of code. I have already solved matching those with this regex:

C#

Code: Select all

new Regex(@"\('view_scores\.php\?user=.*?&song=.*?'\)"">[\S\s]*?<td align=""center"">.*?</td>[\S\s]*?<td align=""center"">(.*?)</td>[\S\s]*?<td align=""center"">(.*?)</td>[\S\s]*?<td align=""center"">(.*?)</td>[\S\s]*?<td align=""center"">(.*?)</td>[\S\s]*?<td align=""center"">(.*?)</td>[\S\s]*?<td align=""center"">(.*?)</td>", RegexOptions.Multiline);
This will give me, for example, these values:

Code: Select all

Rock and Roll All Nite
815th
162,133
<img src="/images/rating_6.gif" /> (6.0)
<span class="percent2">94%</span>
161
-------------------------------------------------------------------------------------------------------
Now to my problem.
I also want to grab "Starting Out Small" from this (the first piece of the HTML above):

Code: Select all

      <tr height="30">
        <td colspan="3" class="tier1" align="center" style="border-right: none 1px #000;">Starting Out Small (<a class="tierlink" href="rankings.php?game=5&diff=4&song=-36&page=16&highlight=778">Rank: 778th</a>) </td>
        <td colspan="7" class="tier2" style="border-left: none 1px #000;">&nbsp;</td>
      </tr>
and I want to join it with my current regex pattern. (This is the hard part) :)

I don't think there is another way, BECAUSE there are NOT always 5 chunks of code between each of these "headers".
So I can't really grab these headers separately.
-------------------------------------------------------------------------------------------------------

If you don't understand, I'll try to explain better :)

PS: this is of course not my site, so I can't change the code or anything.


Posted: Fri Nov 23, 2007 9:31 am
by superdezign
Why does it all need to be done by one pattern?

Also, since you are dealing with an HTML document, I'd find a way to simplify the data you are retrieving, or use the DOM.

By simplifying the data, I mean something like this:

Code: Select all

// [^>]* stops at the first ">", so each tag is stripped separately
$htmlData = preg_replace('/<(table|tr|td)\b[^>]*>/', '<\\1>', $htmlData);
That would make the markup more manageable to read. From there you might be able to keep simplifying it with regex until what's left is basically the data you want, with no separate extraction step needed.
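
Since the original question uses C#, here is a minimal sketch of the same simplification step with Regex.Replace. The class and method names (HtmlSimplifier, StripAttributes) are made up for illustration, and the pattern uses [^>]* rather than .* so that a match ends at the closing ">" of each tag.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlSimplifier
{
    // Hypothetical helper: drops every attribute from <table>, <tr> and <td>
    // opening tags. Closing tags such as </td> are left untouched.
    public static string StripAttributes(string html)
    {
        return Regex.Replace(html, @"<(table|tr|td)\b[^>]*>", "<$1>", RegexOptions.IgnoreCase);
    }
}

public static class StripDemo
{
    public static void Main()
    {
        string row = "<td align=\"center\">3</td><td align=\"center\">Slow Ride</td>";
        Console.WriteLine(HtmlSimplifier.StripAttributes(row));
        // prints: <td>3</td><td>Slow Ride</td>
    }
}
```

Note that stripping attributes also removes class="tier1", so the header rows would have to be identified before this step or by their position.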

Posted: Fri Nov 23, 2007 9:53 am
by Frexuz
The reason I want them as one pattern is that I want my MatchCollection to look something like this:

Group[0] (the header, only 1 value)
Starting Out Small

Group[1] (song-item, 6 values)
{Slow Ride , 892nd , 192,601 , 6.7 , 99% , 267}

Group[2] (song-item, 6 values)
{Talk Dirty to Me , 709th , 298,732 , 6.7 , 99% , 358}

Group[3] (song-item, 6 values)
{Hit Me With Your Best Shot , 868th , 166,643 , 6.6 , 97% , 181}

Group[4] (song-item, 6 values)
{Story of My Life , 914th , 357,933 , 6.5 , 99% , 211}

Group[5] (song-item, 6 values)
{Rock and Roll All Nite , 815th , 162,133 , 6.0 , 94% , 161}

........... then it continues with a new header, followed by a few more song items.
The problem is that if I grab the headers separately, I don't know what position they have, or in other words, how many songs each one holds.

Basically I want to reproduce this page:
http://www.scorehero.com/scores.php?use ... e=5&diff=4

And I cannot use static values for the headers, because the lists change from user to user and from game to game.

Maybe this clears it up a little?
I'll try the DOM, but right now I don't understand it :)

Posted: Fri Nov 23, 2007 10:20 am
by superdezign
Just create an array of headers and an array of sections, and then print them out according to their index, which should correspond to one another.
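
A rough C# sketch of this two-array idea follows. It relies on Match.Index, which gives the position of each header in the page, so a section is simply the text between one header and the next. The names SectionSplitter and Split, the trimmed-down sample markup, and the second tier with its song are all invented for illustration; only the song title is captured to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SectionSplitter
{
    // Header cell: captures the text before " (" in a class="tier1" cell.
    static readonly Regex HeaderRx = new Regex(@"class=""tier1""[^>]*>\s*(?<name>[^<(]+?)\s*\(");

    // Song row: anchored on the view_scores.php link; skips the first
    // centered cell and captures the second one (the title).
    static readonly Regex SongRx = new Regex(
        @"view_scores\.php[^']*'\)[\s\S]*?<td align=""center"">[^<]*</td>\s*" +
        @"<td align=""center"">(?<title>[^<]*)</td>");

    // Fills two parallel lists: headers[i] owns the titles in sections[i].
    public static void Split(string html, out List<string> headers, out List<List<string>> sections)
    {
        headers = new List<string>();
        sections = new List<List<string>>();
        MatchCollection heads = HeaderRx.Matches(html);
        for (int i = 0; i < heads.Count; i++)
        {
            // A section runs from this header to the next one (or to the end).
            int start = heads[i].Index;
            int end = i + 1 < heads.Count ? heads[i + 1].Index : html.Length;
            var titles = new List<string>();
            foreach (Match s in SongRx.Matches(html.Substring(start, end - start)))
                titles.Add(s.Groups["title"].Value);
            headers.Add(heads[i].Groups["name"].Value);
            sections.Add(titles);
        }
    }
}

public static class SplitDemo
{
    // Simplified sample; "Example Tier" and "Example Song" are made up.
    public const string Sample = @"
<td class=""tier1"">Starting Out Small (<a href=""#"">Rank: 778th</a>) </td>
<a href=""javascript:openWindow('view_scores.php?user=22553&song=1412')"">View All</a>
<td align=""center"">3</td><td align=""center"">Slow Ride</td>
<a href=""javascript:openWindow('view_scores.php?user=22553&song=1428')"">View All</a>
<td align=""center"">3</td><td align=""center"">Talk Dirty to Me</td>
<td class=""tier1"">Example Tier (<a href=""#"">Rank: 1st</a>) </td>
<a href=""javascript:openWindow('view_scores.php?user=22553&song=1')"">View All</a>
<td align=""center"">1</td><td align=""center"">Example Song</td>";

    public static void Main()
    {
        SectionSplitter.Split(Sample, out List<string> headers, out List<List<string>> sections);
        for (int i = 0; i < headers.Count; i++)
            Console.WriteLine(headers[i] + ": " + string.Join(", ", sections[i]));
        // prints:
        // Starting Out Small: Slow Ride, Talk Dirty to Me
        // Example Tier: Example Song
    }
}
```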

Posted: Fri Nov 23, 2007 10:27 am
by Frexuz
But how do I know where a group of songs starts or ends?
I'm just matching each song right now.

Posted: Fri Nov 23, 2007 10:32 am
by superdezign
Well, you do know that you can match more than one pattern with a pipe, right?

Code: Select all

/(pattern1|pattern2)/
I assumed you'd want some sort of organization and hierarchy to your data, but if all you want is the data itself, then just grab it.
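
To make the pipe idea concrete for the C# case, here is a hedged sketch that puts a header pattern and a song pattern into one alternation with named groups. Matches come back in document order, so each song can be filed under the most recent header, however many songs a tier holds. TierParser, Parse, the group names, and the trimmed-down sample (including the invented second tier) are assumptions for illustration; only the title and rank columns are captured, and the remaining columns could be added the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TierParser
{
    // Alternative 1: a tier header cell; captures the text before " (".
    const string Header = @"<td[^>]*class=""tier1""[^>]*>\s*(?<header>[^<(]+?)\s*\(";

    // Alternative 2: a song row, anchored on its view_scores.php link.
    // The first centered cell is skipped; title and rank are captured.
    const string Song =
        @"\('view_scores\.php\?[^']*'\)[\s\S]*?" +
        @"<td align=""center"">[^<]*</td>\s*" +
        @"<td align=""center"">(?<title>[^<]*)</td>\s*" +
        @"<td align=""center""><a[^>]*>(?<rank>[^<]*)</a></td>";

    static readonly Regex Row = new Regex("(?:" + Header + ")|(?:" + Song + ")");

    // Walks the matches in order; a header match opens a new tier and
    // every song match is added to the tier opened most recently.
    public static List<(string Header, List<string> Songs)> Parse(string html)
    {
        var tiers = new List<(string Header, List<string> Songs)>();
        foreach (Match m in Row.Matches(html))
        {
            if (m.Groups["header"].Success)
                tiers.Add((m.Groups["header"].Value, new List<string>()));
            else if (tiers.Count > 0)
                tiers[tiers.Count - 1].Songs.Add(m.Groups["title"].Value + " , " + m.Groups["rank"].Value);
        }
        return tiers;
    }
}

public static class TierDemo
{
    // Trimmed-down sample in the thread's layout; the second tier is made up.
    public const string Sample = @"
<tr><td colspan=""3"" class=""tier1"" align=""center"">Starting Out Small (<a class=""tierlink"" href=""#"">Rank: 778th</a>) </td></tr>
<tr><td><a href=""javascript:openWindow('view_scores.php?user=22553&song=1412')"">View All</a></td>
<td align=""center"">3</td><td align=""center"">Slow Ride</td><td align=""center""><a href=""#"" class=""rank3"">892nd</a></td></tr>
<tr><td><a href=""javascript:openWindow('view_scores.php?user=22553&song=1428')"">View All</a></td>
<td align=""center"">3</td><td align=""center"">Talk Dirty to Me</td><td align=""center""><a href=""#"" class=""rank3"">709th</a></td></tr>
<tr><td colspan=""3"" class=""tier1"" align=""center"">Example Tier (<a class=""tierlink"" href=""#"">Rank: 1st</a>) </td></tr>
<tr><td><a href=""javascript:openWindow('view_scores.php?user=22553&song=1')"">View All</a></td>
<td align=""center"">1</td><td align=""center"">Example Song</td><td align=""center""><a href=""#"" class=""rank3"">1st</a></td></tr>";

    public static void Main()
    {
        foreach (var tier in TierParser.Parse(Sample))
        {
            Console.WriteLine(tier.Header);
            foreach (string song in tier.Songs)
                Console.WriteLine("  " + song);
        }
        // prints:
        // Starting Out Small
        //   Slow Ride , 892nd
        //   Talk Dirty to Me , 709th
        // Example Tier
        //   Example Song , 1st
    }
}
```

Checking Groups["header"].Success is what tells the two alternatives apart, since only one side of the pipe participates in any given match.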

Posted: Fri Nov 23, 2007 10:51 am
by Frexuz

Code: Select all

/(pattern1|pattern2)/
Thank you, that works nicely!