regexp problem - strip everything except y between x and z

dewaard
Forum Newbie
Posts: 7
Joined: Fri Aug 15, 2003 9:15 pm

Post by dewaard »

Guys,
I'm stuck at a nasty regexp problem:

Code: Select all

//strip anything between 2 blocks
//<TML> dewaard: '@\[/block\].*?\[block=@is'
$input = preg_replace("/\[\/block\].*?\[block=/is","[/block][block=", $input);
This code is meant to strip newlines and other nasty stuff between two [block] tags, but it should not remove the [break-blocks] tag. So I want to remove anything except [break-blocks] between [/block] and [block=...

Any suggestions appreciated.
greenhorn666
Forum Commoner
Posts: 87
Joined: Thu Aug 14, 2003 7:14 am
Location: Brussels, Belgium

Post by greenhorn666 »

A sample of a "raw" input would be great...
- Not quite sure here, but shouldn't you loop thru all lines?
- Is there something before and after the [block]?

Code: Select all

$input = ereg_replace(".*\[block\](.*)\[\/block\].*", "", $input);
But I'm not sure I got your point
greenhorn666
Forum Commoner
Posts: 87
Joined: Thu Aug 14, 2003 7:14 am
Location: Brussels, Belgium

Post by greenhorn666 »

DUH!
Noooo... that's not it... :P
Could you give me a before and a wished after transformation please?
It should be something like

Code: Select all

$input = ereg_replace("(.*\[block\]).*(\[\/block\].*)", "\\1\\2", $input);
That would transform
I love my [block] green [/block] fish
into
I love my [block][/block] fish

Is that what you need?
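
For readers on current PHP: the ereg functions were removed in PHP 7.0, so here is a rough PCRE sketch of the same idea. The "@" delimiter and the `$out` variable are my own choices for illustration, not part of the original post.

```php
<?php
// PCRE version of the pattern above: keep everything up to and including
// [block], drop what follows, and resume at [/block].
$out = preg_replace(
    '@(.*\[block\]).*(\[/block\].*)@',
    '$1$2',
    'I love my [block] green [/block] fish'
);

echo $out, "\n";
```

With this input the text between the two tags should be dropped while both tags stay in place.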
dewaard
Forum Newbie
Posts: 7
Joined: Fri Aug 15, 2003 9:15 pm

Post by dewaard »

Thanks for your quick reply...

Code: Select all

[block=10%]16-08-03[/block]
[block=15%]Willem II - AZ :-)[/block]
[block=5%]1-0[/block]
[block=5%][Details][/block][break-blocks] <- this one is getting stripped but is essential
[block=10%]23-08-03[/block]
[block=15%]PSV - Willem II[/block][break-blocks]
This is my UBB-like code to create divs that float next to each other; [break-blocks] adds a div w/ 'clear: both'. The problem is that any newlines/rubbish between [/block] and [block= messes up the layout, so I need to remove that. The first few blocks are displayed well and break perfectly, but the first [break-blocks] is stripped by the regexp I posted and thus it doesn't break the previous blocks...

Example: http://www.linuxaddict.nl/cms/index.php ... a0304&id=6

So I have to strip everything except the '[break-blocks]' tag. Stripping everything worked, but I don't know how to keep the '[break-blocks]' tag intact....
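
To make the problem reproducible, here is a minimal sketch using a shortened version of the sample. The shortened data and the `$stripped` variable are assumptions for illustration; the pattern is the one from the first post.

```php
<?php
// Shortened version of the sample data (illustrative only).
$input = "[block=5%][Details][/block][break-blocks]\n"
       . "[block=10%]23-08-03[/block]\n"
       . "[block=15%]PSV - Willem II[/block][break-blocks]";

// Pattern from the first post: everything between [/block] and the next
// [block= is replaced, which also drops a [break-blocks] sitting in that gap.
$stripped = preg_replace("/\[\/block\].*?\[block=/is", "[/block][block=", $input);

echo $stripped, "\n";
```

The trailing [break-blocks] survives only because no further [block= follows it, so the pattern has nothing to match there.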
will
Forum Contributor
Posts: 120
Joined: Fri Jun 21, 2002 9:38 am
Location: Memphis, TN

Post by will »

this will simply remove any newline characters between '[/block]' and '[block=' while leaving everything else intact.

Code: Select all

$newstr = preg_replace("/\[\/block\]([^\n]*)\n\[block=/is","[/block]\\1[block=", $str);
or use this to remove everything between those two tags except [break-blocks]

Code: Select all

$newstr = preg_replace("/\[\/block\].*(\[break-blocks\])?.*\[block=/iUs","[/block]\\1[block=", $str);

these two will do the same thing in the example you provided, but will not behave the same in all cases. if you'd like more clarification on that (not to insult your intelligence, just not sure how much you know about regexes), i can explain more.
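
A quick way to see where the two patterns part ways is to run both on a line with extra text after [/block]. This is only a sketch; the sample string and the `$a`/`$b` names are made up for illustration. One thing worth checking: as far as I can tell, the U modifier inverts every quantifier, including the "?" after the group, so the group may stay empty.

```php
<?php
$str = "[block=5%]1-0[/block][break-blocks] stray text\n[block=10%]23-08-03[/block]";

// Pattern 1: removes only the newline before [block=; the rest of the line is kept.
$a = preg_replace("/\[\/block\]([^\n]*)\n\[block=/is", "[/block]\\1[block=", $str);

// Pattern 2: under the U modifier all quantifiers prefer the shortest match,
// so the optional group matches nothing and the second .* consumes the gap.
$b = preg_replace("/\[\/block\].*(\[break-blocks\])?.*\[block=/iUs", "[/block]\\1[block=", $str);

echo $a, "\n", $b, "\n";
```

On this input the first pattern should keep both [break-blocks] and the stray text, while the second should drop both.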
Last edited by will on Wed Aug 20, 2003 5:02 am, edited 2 times in total.
greenhorn666
Forum Commoner
Posts: 87
Joined: Thu Aug 14, 2003 7:14 am
Location: Brussels, Belgium

Post by greenhorn666 »

Your regexp, changed to leave just a [break-blocks] (if any) between blocks:

Code: Select all

preg_replace("/\[\/block\].*?(\[break-blocks\])?.*?\[block=/is","[/block]\\1[block=", $input);
will
Forum Contributor
Posts: 120
Joined: Fri Jun 21, 2002 9:38 am
Location: Memphis, TN

Post by will »

you don't need the question mark after ".*" since it means zero or more... you will also need the U modifier to make preg_replace "ungreedy" (someone discusses it in the user comments in the manual page for preg_replace)
greenhorn666
Forum Commoner
Posts: 87
Joined: Thu Aug 14, 2003 7:14 am
Location: Brussels, Belgium

Post by greenhorn666 »

I thought so too,
But I copied the one dewaard pasted in his post... :P
I use posix anyhow, wasn't sure about perl's extensions ;)
dewaard
Forum Newbie
Posts: 7
Joined: Fri Aug 15, 2003 9:15 pm

Post by dewaard »

great, it's working now. Thanks guys.

Code: Select all

<?php
$input = preg_replace("/\[\/block\].*?(\[break-blocks\])?.*?\[block=/is","[/block]\\1[block=", $input);
?>
Is this the proper/most efficient way to do it? At least it works, which is a relief :)
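
One caveat that may be worth testing: the pattern seems to keep [break-blocks] only when it comes directly after [/block]. If a newline or space sits in between, the optional group matches nothing and the tag is swallowed by the second ".*?". Below is a sketch of a more tolerant variant using preg_replace_callback. The `collapseGaps` helper and the sample input are hypothetical, not something from this thread.

```php
<?php
// Hypothetical helper: collapse the gap between [/block] and the next
// [block=, keeping a single [break-blocks] if one appears anywhere in it.
function collapseGaps(string $input): string
{
    return preg_replace_callback(
        '@\[/block\](.*?)(?=\[block=)@is',
        function (array $m): string {
            $keep = stripos($m[1], '[break-blocks]') !== false ? '[break-blocks]' : '';
            return '[/block]' . $keep;
        },
        $input
    );
}

// Here [break-blocks] sits on its own line, not directly after [/block].
$input = "[block=5%]1-0[/block]\n[break-blocks]\n[block=10%]23-08-03[/block]";

$viaPattern  = preg_replace("/\[\/block\].*?(\[break-blocks\])?.*?\[block=/is", "[/block]\\1[block=", $input);
$viaCallback = collapseGaps($input);

echo $viaPattern, "\n", $viaCallback, "\n";
```

The lookahead `(?=\[block=)` leaves the opening tag untouched, so only the gap itself is rewritten. For the sample data earlier in the thread, where the tag follows [/block] immediately, the single-pattern version should be enough.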
will
Forum Contributor
Posts: 120
Joined: Fri Jun 21, 2002 9:38 am
Location: Memphis, TN

Post by will »

yep, except that you don't really need the extra question-marks (although they don't really hurt anything either)
m3rajk
DevNet Resident
Posts: 1191
Joined: Mon Jun 02, 2003 3:37 pm

Post by m3rajk »

with perl you can change delimiters to make it easier. use % in place of / and you don't need to escape /

also, you don't need to escape ]
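
A small sketch of both points in PHP, assuming a simplified pattern and sample string chosen only for illustration:

```php
<?php
$input = "[block=5%]1-0[/block]\n[block=10%]23-08-03[/block]";

// "/" as delimiter: the slash in [/block] has to be escaped.
$a = preg_replace('/\[\/block\].*?\[block=/is', '[/block][block=', $input);

// "%" as delimiter: "/" needs no escape, and "]" is literal outside a character class.
$b = preg_replace('%\[/block].*?\[block=%is', '[/block][block=', $input);

var_dump($a === $b);
```

The delimiter must not appear unescaped inside the pattern itself, so "%" only works here because the pattern contains no percent sign.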
dewaard
Forum Newbie
Posts: 7
Joined: Fri Aug 15, 2003 9:15 pm

Post by dewaard »

You can also do that w/ PHP:

Code: Select all

//strip anything between 2 blocks
$input = preg_replace("@\[\/block\].*?(\[break-blocks\])?.*?\[block=@is","[/block]\\1[block=", $input);
Notice the @? EDIT: well, not exactly 'that', but it makes it easier anyway.

I wasn't able to remove any question marks, this didn't work. Which :?: can be removed?
m3rajk
DevNet Resident
Posts: 1191
Joined: Mon Jun 02, 2003 3:37 pm

Post by m3rajk »

you would have to escape ? in the string... ie: use \?
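
For reference, a tiny sketch of the difference between "?" as a quantifier and "\?" as a literal character. The example words are arbitrary and have nothing to do with the block tags.

```php
<?php
// Unescaped "?" is a quantifier: "u?" means zero or one "u".
$quantifier = preg_match('/colou?r/', 'color');

// Escaped "\?" matches a literal question mark.
$literalHit  = preg_match('/why\?/', 'why?');
$literalMiss = preg_match('/why\?/', 'why');

var_dump($quantifier, $literalHit, $literalMiss);
```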
will
Forum Contributor
Posts: 120
Joined: Fri Jun 21, 2002 9:38 am
Location: Memphis, TN

Post by will »

the first and third ones are not needed, unless you intend for them to match a literal '?', which i doubt (if that is the case however, they need to be escaped as the previous post explains). a question mark matches zero or one of the preceding block, whether it is a single character or parenthetical set. therefore the middle qmark is needed because the [break-blocks] text may or may not be present.

an asterisk (*) matches zero or more instances of the preceding block. since it allows for zero instances, the question mark is not needed. you may be thinking of a plus sign (+) which matches one or more instances... in which case you would need the qmark (although i'm still not sure if that would actually work... never tried).
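
For what it's worth, a quick check suggests the "?" after ".*" is not redundant in PCRE. Directly after a quantifier it does not mean "optional"; it makes that quantifier lazy (shortest match first). That would explain why removing those question marks did not work. A sketch with a made-up three-block input and a simplified pattern:

```php
<?php
$input = "[block=1]a[/block]\n[block=2]b[/block]\n[block=3]c[/block]";

// ".*?" is lazy: each match stops at the nearest following [block=.
$lazy = preg_replace('@\[/block\].*?\[block=@is', '[/block][block=', $input);

// ".*" is greedy: with the s modifier it runs on to the last [block=,
// swallowing the whole second block along the way.
$greedy = preg_replace('@\[/block\].*\[block=@is', '[/block][block=', $input);

echo $lazy, "\n", $greedy, "\n";
```

With the lazy version all three blocks should remain; with the greedy one the middle block should disappear.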