String in one tag?

Posted: Sun Jul 23, 2006 9:43 am
by MarK (CZ)
Let's say I have an unknown string. I want to find out whether the whole string is wrapped in a single tag, like this:
"<b>This is bold text</b>"

but not like this:
"<b>This is bold text</b> and this is not"
or
"<b>This is bold text</b><i>and this is not</i>"

How should I find out? Regex?

Posted: Sun Jul 23, 2006 9:44 am
by feyd
regex, yes.

Posted: Sun Jul 23, 2006 10:06 am
by MarK (CZ)
I'm still not sure how.

I have this:

Code:

if (mb_ereg("^<([a-zA-Z]+).+</(.+)>$", $item, $parts) &&
    $parts[1] == $parts[2])
but that doesn't reject a string like this:
"<b>This is bold text</b> and <b>this is bold too</b>"

Posted: Sun Jul 23, 2006 10:11 am
by feyd
The really basic, fairly dumb, PCRE version:

Code:

#^<\s*([a-zA-Z]+)[^>]*>.*?<\s*\\1[^>]*>$#

Posted: Sun Jul 23, 2006 10:18 am
by MarK (CZ)
That one doesn't work. Why is the '\\1' there?

Posted: Sun Jul 23, 2006 10:40 am
by feyd
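The tags must match. The '\\1' is a backreference: it matches whatever the first parenthesised group captured, i.e. the name of the opening tag.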

I forgot a tiny bit (the '/' of the closing tag, plus the 's' modifier so '.' also matches newlines):

Code:

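<?php

// Opening tag, lazily matched content, then a closing tag whose name must repeat group 1 (\1).
$pattern = '#^<\s*([a-zA-Z]+)[^>]*>.*?<\s*/\s*\\1[^>]*>$#s';

// Test string => whether the pattern is expected to match it.
$tests = array(
	'<b>This is bold text</b>' => true,
	'<b>This is bold text</b> and this is not' => false,
	'<b>This is bold text</b><i>and this is not</i>' => false,
);

$results = array();

foreach ($tests as $test => $expected)
{
	// preg_match() returns 1 or 0, so cast it before comparing with the expected boolean.
	$results[] = ((bool) preg_match($pattern, $test) === $expected);
}

// array_filter() drops the false entries, so the counts agree only if every test passed.
if (count(array_filter($results)) == count($results))
{
	echo 'it works.';
}
else
{
	echo 'it doesn\'t work.';
	var_dump($results);
}

?>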
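
To see the backreference on its own, here is a minimal sketch with a stripped-down pattern (no attributes or whitespace handling, which is a simplification and not the pattern from the post). The variable names `$same` and `$different` are only for illustration.

```php
<?php

// Group 1 captures the opening tag name; \1 must then repeat exactly the same text.
// Inside a single-quoted PHP string, '\\1' reaches PCRE as \1.
$pattern = '#^<([a-z]+)>.*</\\1>$#s';

// Opening and closing names agree, so the pattern matches.
$same = preg_match($pattern, '<b>text</b>');

// Opening name is "b" but the closing name is "i", so \1 cannot match.
$different = preg_match($pattern, '<b>text</i>');

var_dump($same, $different);
```

Without the backreference, any closing tag would be accepted regardless of which tag was opened.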

Posted: Sun Jul 23, 2006 12:40 pm
by MarK (CZ)
That is similar to what I've done...

However, yours still fails for these:

Code:

'<h1>This is bold text</h2>' => false,
'<b>This is bold text</b><b>and this is not</b>' => false,

The first one is just forgetting about numbers in tag names (h1-h6),
but for the second one, mine fails too. I would probably have to parse it as XML and then test it that way; I don't see any way to fix that via regex.

Posted: Sun Jul 23, 2006 12:43 pm
by feyd
You can figure out the first one. The second is a little harder to handle, but not impossible, in a single regex. I won't be writing it however. Keep tinkering around.

Posted: Sun Jul 23, 2006 2:07 pm
by MarK (CZ)
Yeah, the first one was no problem of course, but I can't see any solution to the second one via regex.
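
For reference, one possible fix for the first case (a sketch only; the thread doesn't show which change was actually made) is to let the captured name include digits and to stop `[^>]*` from swallowing part of the name. The `$cases` and `$matches` arrays are just illustration.

```php
<?php

// Assumed tweak: digits belong to the captured name, \b keeps the name from being
// split (so "h" alone can't be captured out of "h1"), and the closing tag allows
// only optional whitespace after the name.
$pattern = '#^<\s*([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>.*?<\s*/\s*\\1\s*>$#s';

$cases = array(
    '<h1>Heading</h1>',
    '<h1>This is bold text</h2>',
    '<b>This is bold text</b><b>and this is not</b>',
);

$matches = array();
foreach ($cases as $case) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $matches[$case] = preg_match($pattern, $case);
}

var_dump($matches);
```

The mismatched h1/h2 pair is now rejected, but the two adjacent `<b>` elements are still accepted, which is exactly the second case discussed above.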

Posted: Sun Jul 23, 2006 2:10 pm
by feyd
The quick and dirty one I see involves checking whether the text the pattern actually matched is identical to the original string. You'd need to remove both anchors for this to work.
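
A rough sketch of that idea, under a few assumptions: the pattern is the one from earlier in the thread with the anchors removed, digits are allowed in tag names, and the helper name `isWrappedInOneTag` is invented here.

```php
<?php

// Unanchored variant of the pattern; allowing digits in the name is an added assumption.
$pattern = '#<\s*([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>.*?<\s*/\s*\\1\s*>#s';

// Hypothetical helper: the string counts as "one tag" only when the first
// (leftmost, lazily shortest) match consumes the entire string.
function isWrappedInOneTag(string $item, string $pattern): bool
{
    // $match[0] holds the full text consumed by the match.
    return preg_match($pattern, $item, $match) === 1 && $match[0] === $item;
}

$tests = array(
    '<b>This is bold text</b>',
    '<b>This is bold text</b> and this is not',
    '<b>This is bold text</b><b>and this is not</b>',
);

foreach ($tests as $test) {
    var_dump(isWrappedInOneTag($test, $pattern));
}
```

Since the lazy `.*?` stops at the first closing tag with the same name, a tag nested inside itself (`<b>a <b>b</b> c</b>`) is reported as not being a single tag. That is the "dirty" part of the approach.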