help needed in getting item and subitem

balishake
Forum Newbie
Posts: 1
Joined: Sat Oct 04, 2008 6:03 am

Post by balishake »

Hi everyone,

I'm new here and need your help.

I have this HTML code:
<span option=\"Item"><b>Item1</b></span>\r\n
<span option=\"SubItem"><BR>\r\n&#149;&nbsp;subItem 1<BR>\r\n&#149;&nbsp;subItem 2<BR>\r\n</span><BR>\r\n<BR>\r\n

<span option=\"Item"><b>Item12</b></span>\r\n
<span option=\"SubItem"><BR>\r\n&#149;&nbsp;subItem 1<BR>\r\n&#149;&nbsp;subItem 2<BR>\r\n&#149;&nbsp;subItem 3<BR>\r\n</span><BR>\r\n<BR>\r\n

<span option=\"Item"><b>ITEM33</b></span>\r\n
<span option=\"SubItem"><BR>\r\n&#149;&nbsp;subItem 1<BR>\r\n</span><BR>\r\n<BR>\r\n

<span option=\"Item"><b>Long ITEM 14</b></span>\r\n
<span option=\"SubItem"><BR>\r\n&#149;&nbsp;subItem 1<BR>\r\n</span><BR>\r\n<BR>\r\n

<span option=\"Item"><b>Maybe Item 13</b></span>\r\n
<span option=\"SubItem"><BR>\r\n&#149;&nbsp;sub Item 1<BR>\r\n</span><BR>\r\n<BR>\r\n

I need to extract each item (the text between <b> and </b>) and each subitem (the text between &#149;&nbsp; and <BR>).
The problem is that I need the output, in the Expresso software, grouped like this example:

Item1
SubItem1
SubItem2
...
Item2
SubItem1
SubItem2

I have a regex for extracting each kind on its own, but because the matches are not grouped together I can't connect the subitems with their items...

Here is the regex for the items: <b>(?<Data>.*?)</b>
Here is the one for the subitems: 149;&nbsp;(?<Data1>.*?)<br> <--- note that I can't separate them into groups; I get each item on a separate line...

Many thanks in advance.
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: help needed in getting item and subitem

Post by prometheuzz »

$text = '<span option=\"Item"><b>Item1</b></span>\r\n
<span option=\"SubItem"><BR>\r\n&#149;&nbsp;subItem 1<BR>\r\n&#149;&nbsp;subItem 2<BR>\r\n</span><BR>\r\n<BR>\r\n
 
<span option=\"Item"><b>Item12</b></span>\r\n
<span option=\"SubItem"><BR>\r\n&#149;&nbsp;subItem 1<BR>\r\n&#149;&nbsp;subItem 2<BR>\r\n&#149;&nbsp;subItem 3<BR>\r\n</span><BR>\r\n<BR>\r\n
 
<span option=\"Item"><b>ITEM33</b></span>\r\n
<span option=\"SubItem"><BR>\r\n&#149;&nbsp;subItem 1<BR>\r\n</span><BR>\r\n<BR>\r\n
 
<span option=\"Item"><b>Long ITEM 14</b></span>\r\n
<span option=\"SubItem"><BR>\r\n&#149;&nbsp;subItem 1<BR>\r\n</span><BR>\r\n<BR>\r\n
 
<span option=\"Item"><b>Maybe Item 13</b></span>\r\n
<span option=\"SubItem"><BR>\r\n&#149;&nbsp;sub Item 1<BR>\r\n</span><BR>\r\n<BR>\r\n';
 
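// Outer pass: match each <b>item</b> together with the bulleted subitems that follow it,
// stopping before the next <b>. Flags: s lets . cross line breaks, i matches <BR> as well as <br>.
if (preg_match_all('/<b>([^<]*)<\/b>(?:(?:(?!<b>).)*?(?<=&#149;&nbsp;)([^<]*)(?=<br>))+/si', $text, $matches)) {
    foreach ($matches[0] as $match) {
        // Inner pass: pull the item name (after <b>) and every subitem (after the bullet) out of the block.
        if (preg_match_all('/(?<=&#149;&nbsp;|<b>)[^<]*/i', $match, $item)) {
            print_r($item);
        }
    }
}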
/*
Array
(
    [0] => Array
        (
            [0] => Item1
            [1] => subItem 1
            [2] => subItem 2
        )
 
)
Array
(
    [0] => Array
        (
            [0] => Item12
            [1] => subItem 1
            [2] => subItem 2
            [3] => subItem 3
        )
 
)
Array
(
    [0] => Array
        (
            [0] => ITEM33
            [1] => subItem 1
        )
 
)
Array
(
    [0] => Array
        (
            [0] => Long ITEM 14
            [1] => subItem 1
        )
 
)
Array
(
    [0] => Array
        (
            [0] => Maybe Item 13
            [1] => sub Item 1
        )
 
)
*/
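
If you want the flat listing from the question (item name first, then its subitems, one per line) rather than print_r dumps, the same two-pass idea can be wrapped in a small function. This is only a rough sketch: extractItems() and $sample are names I made up for illustration, the sample is a shortened copy of the HTML with real line breaks assumed, and it runs in PHP rather than in Expresso.

```php
<?php
// Hypothetical helper (not part of the original reply): returns one entry per item,
// each holding the item name and the list of subitems that follow it.
function extractItems(string $html): array
{
    $groups = [];
    // Outer pass: one match per <b>item</b> plus the bulleted subitems before the next <b>.
    $outer = '/<b>[^<]*<\/b>(?:(?:(?!<b>).)*?(?<=&#149;&nbsp;)[^<]*(?=<br>))+/si';
    if (preg_match_all($outer, $html, $blocks)) {
        foreach ($blocks[0] as $block) {
            // Inner pass: the first hit is the item name, the remaining hits are subitems.
            if (preg_match_all('/(?<=&#149;&nbsp;|<b>)[^<]*/i', $block, $parts)) {
                $name = array_shift($parts[0]);
                $groups[] = ['item' => $name, 'subitems' => $parts[0]];
            }
        }
    }
    return $groups;
}

// Shortened sample in the same shape as the HTML above (assumption: real line breaks).
$sample = "<span option=\"Item\"><b>Item1</b></span>\n"
        . "<span option=\"SubItem\"><BR>\n&#149;&nbsp;subItem 1<BR>\n&#149;&nbsp;subItem 2<BR>\n</span><BR>\n"
        . "<span option=\"Item\"><b>Item12</b></span>\n"
        . "<span option=\"SubItem\"><BR>\n&#149;&nbsp;subItem 1<BR>\n</span><BR>\n";

// Print each item followed by its subitems, one per line.
foreach (extractItems($sample) as $group) {
    echo $group['item'], "\n";
    foreach ($group['subitems'] as $sub) {
        echo $sub, "\n";
    }
}
```

Keeping the item name and its subitems in one array entry is what ties them together; how you print them afterwards is up to you.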