special string reversal
Posted: Sun Nov 27, 2005 9:33 am
by jasongr
Hello people
I am facing a problem relating to the usage of regular expressions and would appreciate any help
I am given a string
The string may contain html tags inside
For example:
"The big dog<br>jump<i>over</i>the fence"
I need to write code that reverses that string, but it should be done in an intelligent manner,
preserving the tags
The reversed string would then be:
"ecnef eht<i>revo</i>pmuj<br>god gib ehT"
That is, tags should NOT be reversed, but regular text should
For starters, I would be content with a simplified solution that only takes care of <br> tags
Can anyone point me in a general direction here, possibly using regular expressions
to search for the HTML tags, extracting them (remembering where in the string they were),
then reversing the string, and then planting the tags back?
thanks
Posted: Sun Nov 27, 2005 4:26 pm
by Ambush Commander
I don't think this is possible with regexps. You'll have to write a stack-based parser. Hmm... here's what I would do: make a bunch of test strings and work out their expected outputs by hand. Then, using some testing suite, hook them up and try developing a function that produces those outputs.
If I had time, I'd take a whack at it, but perhaps this link will inspire you:
viewtopic.php?t=36790
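To make the "hand-worked strings plus tests" idea concrete, here is a minimal sketch for the simplified <br>-only case from the first post. The function name reverse_keep_br and the test strings are made up for illustration, and it assumes the tag is always written exactly as <br> and that the text is single-byte (strrev() works on bytes).

```php
<?php
// Hypothetical helper: reverses the text around literal <br> tags only.
function reverse_keep_br(string $str): string
{
    // Split on the exact tag, reverse each text piece, then reverse the piece order.
    $pieces = array_map('strrev', explode('<br>', $str));
    return implode('<br>', array_reverse($pieces));
}

// Inputs paired with outputs worked out by hand.
$cases = array(
    'The big dog<br>jump' => 'pmuj<br>god gib ehT',
    'a<br>b<br>c'         => 'c<br>b<br>a',
    'no tags'             => 'sgat on',
    '<br>lead'            => 'dael<br>',
);

foreach ($cases as $input => $expected) {
    $actual = reverse_keep_br($input);
    echo ($actual === $expected ? 'ok   ' : 'FAIL ') . $input . ' => ' . $actual . "\n";
}
```

Once a function passes the simple cases, harder strings (paired tags, nesting, attributes) can be added to the same list.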
Posted: Mon Nov 28, 2005 1:58 am
by redmonkey
I pondered this for 5-10mins before I got bored (or maybe it was scared as I know what headaches lie within when you start shifting round HTML).
Anyway before getting bored I came up with this...
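Code:
<?php
$str = "The big dog<br>jump<i>over</i>the fence";

// Split on "<" and ">" and keep the delimiters, so the pieces cycle as:
// text, "<", tag, ">", text, "<", tag, ">", ...
$parts = preg_split('/([<>])/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

// Tags that have no closing counterpart
$single_tags = array('area', 'br', 'hr', 'img', 'embed');

for ($i = 0, $limit = count($parts); $i < $limit; $i++)
{
    if (!($i % 4))
    {
        // Text piece: reverse it and swap the neighbouring brackets, so they
        // read correctly once the whole array is reversed below
        $parts[$i] = strrev($parts[$i]);
        if (isset($parts[$i - 1])) $parts[$i - 1] = '<';
        if (isset($parts[$i + 1])) $parts[$i + 1] = '>';
    }
    elseif (!($i % 2))
    {
        // Tag piece: turn a closing tag into an opening one and vice versa
        if (substr($parts[$i], 0, 1) == '/')
        {
            $parts[$i] = substr($parts[$i], 1);
        }
        else
        {
            // The first word of the tag is its name; single tags stay as they are
            $words = explode(' ', $parts[$i]);
            if (!in_array(strtolower($words[0]), $single_tags))
            {
                $parts[$i] = '/' . $parts[$i];
            }
        }
    }
}

$str = implode('', array_reverse($parts));
echo $str;
?>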
Which, if it works as intended (I haven't tested it), should deal with basic HTML strings. Where it all falls down is when the string includes a tag which has both an opening and a closing form and the opening tag contains additional attributes (the attributes would end up on the closing tag). It's not quite as simple as pulling the HTML out and putting it back where it came from because, as in your example, the entire string is reversed, which means that all the HTML is also shifted around.
Depends on how much complexity is required; if you don't have to handle nesting etc. it may not be as bad as I imagine.
Thinking about it, you may be able to cater for quite a few of those cases in a similar way to the single tag array.
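One way the attribute case might be handled is to treat each tag as a whole token and, after reversing, swap every closing tag with its partner using a small stack. The sketch below is not tested beyond the examples shown; the function name reverse_html_text is made up, and it assumes well-nested tags, no "<" or ">" inside text or attribute values, and single-byte text.

```php
<?php
// Hypothetical helper; see the assumptions stated above.
function reverse_html_text(string $str): string
{
    $single_tags = array('area', 'br', 'hr', 'img', 'embed');

    // Even indices hold text, odd indices hold whole tags (attributes included).
    $parts = preg_split('/(<[^>]*>)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
    for ($i = 0, $limit = count($parts); $i < $limit; $i += 2) {
        $parts[$i] = strrev($parts[$i]);
    }
    // The count is always odd, so tags stay on odd indices after the reversal.
    $parts = array_reverse($parts);

    // Each closing tag now comes before its opening tag; swap the pairs back.
    $pending = array(); // stack of indices of closing tags waiting for a partner
    for ($i = 1, $limit = count($parts); $i < $limit; $i += 2) {
        if (!preg_match('/^<(\/?)([a-zA-Z][a-zA-Z0-9]*)/', $parts[$i], $m)) {
            continue; // comments, doctypes and the like are left alone
        }
        if ($m[1] === '/') {
            $pending[] = $i;
        } elseif (!in_array(strtolower($m[2]), $single_tags) && $pending) {
            $j = array_pop($pending);
            $tmp = $parts[$i];
            $parts[$i] = $parts[$j];
            $parts[$j] = $tmp;
        }
    }
    return implode('', $parts);
}

echo reverse_html_text('The big dog<br>jump<i>over</i>the fence'), "\n";
// ecnef eht<i>revo</i>pmuj<br>god gib ehT
echo reverse_html_text('<a href="x.html">link</a> end'), "\n";
// dne <a href="x.html">knil</a>
```

Because the opening tag is moved as one piece, its attributes travel with it instead of ending up on the closing tag.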
Posted: Mon Nov 28, 2005 2:17 am
by nincha
If it's simple HTML, just replace the tags with special words that read the same when reversed. Once the string is reversed, replace the special words with the tags. This can be done easily with arrays and PHP's str_replace function, *I think*
For instance:
my dog <b> jump </b> out of the <u> window</u> of an airplane.
replace key (no angle brackets in the special words, otherwise they would not read the same backwards):
<b> -> !civic!
</b> -> !%civic%!
<u> -> !dad!
</u> -> !%dad%!
string with replaced tags:
my dog !civic! jump !%civic%! out of the !dad! window!%dad%! of an airplane.
reversed string with replaced tags:
.enalpria na fo !%dad%!wodniw !dad! eht fo tuo !%civic%! pmuj !civic! god ym
reversed string with normal tags (opening and closing swapped so they still pair up):
.enalpria na fo <u>wodniw </u> eht fo tuo <b> pmuj </b> god ym
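A rough sketch of the placeholder idea in PHP. The placeholders here are written without angle brackets so that they really are palindromes under strrev(); the function name and the placeholder strings are only illustrative, and the approach assumes the placeholders never occur in the real text. It uses strtr() rather than str_replace(), which is a swapped-in choice so the replacements happen in a single pass.

```php
<?php
// Palindromic stand-ins (hypothetical); they must not occur in the real text.
$open  = array('<b>'  => '!civic!',   '<u>'  => '!dad!');
$close = array('</b>' => '!%civic%!', '</u>' => '!%dad%!');

function reverse_with_placeholders(string $str, array $open, array $close): string
{
    // 1. Hide the tags behind their placeholders.
    $hidden = strtr($str, $open + $close);

    // 2. Reverse everything; the palindromes come out unchanged.
    $reversed = strrev($hidden);

    // 3. Restore the tags, swapping opening and closing so they still pair up.
    $restore = array();
    foreach ($open as $tag => $mark) {
        $restore[$mark] = '</' . substr($tag, 1); // "<b>" placeholder becomes "</b>"
    }
    foreach ($close as $tag => $mark) {
        $restore[$mark] = '<' . substr($tag, 2);  // "</b>" placeholder becomes "<b>"
    }
    return strtr($reversed, $restore);
}

echo reverse_with_placeholders(
    'my dog <b> jump </b> out of the <u> window</u> of an airplane.',
    $open,
    $close
), "\n";
// .enalpria na fo <u>wodniw </u> eht fo tuo <b> pmuj </b> god ym
```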
Posted: Mon Nov 28, 2005 2:28 am
by redmonkey
Or for simple HTML you can do that ^^^, which is probably more efficient. Although there is probably no need for the initial special tags, as you could simply str_replace the reversed tags with their normal forms once the string is reversed, i.e. >rb< -> <br>, >b/< -> <b> and >b< -> </b> etc. The only small caveat I can think of off the top of my head would be the order of the str_replace routine, so that the output of one replacement is not picked up again by a later one.
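A minimal sketch of that shortcut, assuming only <br>, <b> and <i> need handling and that none of the reversed tag forms occur in the plain text. Note that strrev() flips the angle brackets too, so a reversed <br> reads >rb<. The function name is made up, and strtr() is used in place of str_replace() because it scans the string once and does not re-replace its own output, which sidesteps the ordering caveat.

```php
<?php
// Hypothetical helper covering only <br>, <b> and <i>; extend the map as needed.
function reverse_then_fix_tags(string $str): string
{
    $map = array(
        '>rb<' => '<br>',
        '>b/<' => '<b>',  // reversed closing tag becomes the opening tag
        '>b<'  => '</b>', // reversed opening tag becomes the closing tag
        '>i/<' => '<i>',
        '>i<'  => '</i>',
    );
    // Reverse the whole string, then repair the tags in a single pass.
    return strtr(strrev($str), $map);
}

echo reverse_then_fix_tags('The big dog<br>jump<i>over</i>the fence'), "\n";
// ecnef eht<i>revo</i>pmuj<br>god gib ehT
```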