
[SOLVED] Help with removing bbCode

Posted: Wed Jun 16, 2004 7:01 pm
by johnperkins21
I'm pulling out the first 100 characters of my topic texts from my forum to put into a link title. What I'm trying to do now is to make sure that I strip out all of the bbCode.

This is what I have so far:

Code: Select all

<?php
$data[$i]['topic_text'] = str_replace(chr(13),' ',addslashes(htmlentities($data[$i]['topic_text'])));
$data[$i]['topic_text'] = str_replace(chr(10),' ',$data[$i]['topic_text']);
$data[$i]['topic_text'] = substr($data[$i]['topic_text'], 0, 100);
?>
Now this doesn't even come close to what I'm trying to do, so some of my titles look like this:

[b:af1a041a54]SECTION 1 GENERAL OVERVIEW[/b:af1a041a54]

I want to get rid of [b:af1a041a54] and [/b:af1a041a54].

Can anyone point me in the right direction? I've looked on php.net but I just don't get the regular expressions at all.

Thanks,
John

Posted: Wed Jun 16, 2004 7:13 pm
by dull1554
well, is the bb tag always going to be the same?
if so, it's not too hard, but if the tag changes then it might be a little harder. I'm not too comfy with regexp

Posted: Wed Jun 16, 2004 7:53 pm
by johnperkins21
No, it's probably going to change. I just want to delete any [] and whatever is between the []. There has to be a way, I'm just no good with regex at all.

I'm thinking I should find a [ then count how many characters until ], and then delete that many chars starting at the [. But I'm lost as to how to actually do that.
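The "find a [, find the next ], delete everything in between" idea can be done without any regex. Here is a minimal sketch, assuming the tags are never nested inside each other's brackets. The function name strip_brackets_manually is made up for this example and is not a PHP built-in.

```php
<?php
// Hypothetical helper (not from the thread): removes every "[...]" span
// by locating the brackets with strpos() instead of using a regex.
function strip_brackets_manually($text)
{
    $offset = 0;
    while (($open = strpos($text, '[', $offset)) !== false) {
        $close = strpos($text, ']', $open);
        if ($close === false) {
            break; // an unmatched "[" is left alone
        }
        // delete from "[" through "]" inclusive
        $text = substr_replace($text, '', $open, $close - $open + 1);
        // keep scanning from the position where the tag used to be
        $offset = $open;
    }
    return $text;
}

echo strip_brackets_manually('[b:af1a041a54]SECTION 1 GENERAL OVERVIEW[/b:af1a041a54]');
// prints: SECTION 1 GENERAL OVERVIEW
```

Like any approach that only looks at the brackets, this also deletes bracketed text that is not bbCode.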

Posted: Wed Jun 16, 2004 7:58 pm
by markl999
I'm no regex expert either, but try:

Code: Select all

$data[$i]['topic_text'] = preg_replace('/(\[.[^\]]*\])/', '', $data[$i]['topic_text']);
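For anyone trying to dissect that pattern, here it is again spread over several lines with the x (extended) modifier, which lets whitespace and # comments sit inside the regex. The $pattern and $title variables are sample names for illustration only.

```php
<?php
// Same pattern as above, written with the x modifier so that each piece
// can carry its own comment. $title is a made-up sample string.
$pattern = '/
    (           # start of a capture group (the replacement does not use it)
      \[        # a literal opening bracket
      .         # any one character, so at least one must follow the bracket
      [^\]]*    # zero or more characters that are not a closing bracket
      \]        # the literal closing bracket
    )           # end of the capture group
/x';

$title = '[b:af1a041a54]SECTION 1 GENERAL OVERVIEW[/b:af1a041a54]';
echo preg_replace($pattern, '', $title);
// prints: SECTION 1 GENERAL OVERVIEW
```

Because the replacement is an empty string, every matched bracket pair simply disappears, along with whatever was inside it.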

Posted: Wed Jun 16, 2004 8:05 pm
by johnperkins21
Mark, you are da man. I'm gonna have to try and dissect that thing and figure out how it works, but it works.

Thank you very much.

Posted: Wed Jun 16, 2004 8:35 pm
by dull1554
it went just as far over your head as it did mine.....

Posted: Wed Jun 16, 2004 8:44 pm
by tim
untested

Code: Select all

<?php
$data[$i]['topic_text'] = preg_replace('/(\[.+?\])/', '', $data[$i]['topic_text']);
?>

Posted: Wed Jun 16, 2004 8:45 pm
by feyd
posts_text stores the bbcode_uid field... this will give you the key to getting all of the bbcode removed..
untested...

Code: Select all

<?php

$stripStartBB = array('b','i','url=[^]]*?','email'); // add the other ones you want to strip here...
$stripEndBB = array('b','i','url','email');

for($x = sizeof($stripStartBB)-1; $x >= 0; $x--)
{
  $startTag =& $stripStartBB[$x];
  $endTag =& $stripEndBB[$x];
  $search = '#\['.$startTag.'(:\d+)?:'.$bbcode_uid.'](.*?)\[/'.$endTag.'(:\d+)?:'.$bbcode_uid.']#is';
  $text = preg_replace($search, '\\2', $text);
}
?>
$bbcode_uid comes from the bbcode_uid field in the table, and $text is the post text :o

;) questions?

Posted: Wed Jun 16, 2004 8:47 pm
by tim
feyd wrote:;) questions?
yeah, how does it work? lol

:P 8) :P

Posted: Wed Jun 16, 2004 8:58 pm
by feyd
looks for [one of the tags to strip _phpBB nested tags :<some number>_ :bbcode user id]all text in between[/end tag that corresponds to the start one currently searched for _phpBB nested tags :<some number>_ :bbcode user id]

kinda complicated sounding.. but looks for the specific tags phpBB uses..

the underscored bits may or may not exist, depending on the nesting status...
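To make that description concrete, here is the same kind of search pattern built for a single tag (b) and run against the title from the first post. This is only a sketch: the uid and the text are sample values, and the preg_quote() call is an extra precaution that is not in the code above.

```php
<?php
// Sample values for illustration only; in phpBB the uid would come from
// the bbcode_uid column of the post row.
$bbcode_uid = 'af1a041a54';
$text = '[b:af1a041a54]SECTION 1 GENERAL OVERVIEW[/b:af1a041a54]';

$uid = preg_quote($bbcode_uid, '#'); // precaution, not in the original loop

// \[b          opening bracket and the tag name
// (:\d+)?      optional nesting counter such as ":1" (group 1)
// :uid\]       the bbcode uid of the post and the closing bracket
// (.*?)        the text between the tags (group 2)
// \[/b ... \]  the matching end tag, built the same way
$search = '#\[b(:\d+)?:' . $uid . '\](.*?)\[/b(:\d+)?:' . $uid . '\]#is';

// Replace the whole match with group 2, i.e. keep only the inner text.
$result = preg_replace($search, '$2', $text);
echo $result;
// prints: SECTION 1 GENERAL OVERVIEW
```

A tag that carries a different uid does not match, so it is left exactly as it was.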

Posted: Wed Jun 16, 2004 9:01 pm
by tim
well all tags start with [ and end with ]

therefore:
preg_replace('/(\[.+?\])/', '', $data[$i]['topic_text']); or mark's solution would work.
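The ? after .+ is what makes this pattern safe to use on a line with more than one tag. A quick comparison, using a made-up sample title, shows what happens without it:

```php
<?php
// Made-up sample title for illustration.
$title = '[b:af1a041a54]SECTION 1[/b:af1a041a54] overview';

// Lazy: .+? stops at the first closing bracket, so each tag is removed
// on its own and the text between the tags survives.
$lazy = preg_replace('/(\[.+?\])/', '', $title);

// Greedy: .+ runs on to the LAST closing bracket, so the text between
// the two tags is removed together with them.
$greedy = preg_replace('/(\[.+\])/', '', $title);

echo $lazy, "\n";   // SECTION 1 overview
echo $greedy, "\n"; // " overview" (only the tail is left)
```

The negated class in mark's pattern gets the same result a different way: [^\]]* cannot step over a closing bracket, so it never needs the lazy modifier.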

Posted: Wed Jun 16, 2004 9:02 pm
by feyd
it may accidentally strip out actual text though.. so be careful.
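A small example of the difference, as a sketch with a made-up post text and sample uid. The generic pattern is the one posted earlier in the thread. The uid-aware one is a simplified pattern for the b tag only.

```php
<?php
$uid  = 'af1a041a54'; // sample uid for illustration
$text = '[b:af1a041a54]Note[/b:af1a041a54]: the array index [0] and the word [sic] are plain text';

// Generic pattern: removes anything that sits between brackets.
$generic = preg_replace('/(\[.[^\]]*\])/', '', $text);

// Uid-aware pattern: only removes [b:uid] and [/b:uid].
$specific = preg_replace('#\[/?b:' . preg_quote($uid, '#') . '\]#', '', $text);

echo $generic, "\n";  // Note: the array index  and the word  are plain text
echo $specific, "\n"; // Note: the array index [0] and the word [sic] are plain text
```

Whether that matters depends on how likely your users are to type square brackets in the first 100 characters of a post.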

Posted: Wed Jun 16, 2004 9:04 pm
by tim
very true, but who would enclose text in those brackets?

hmmm, maybe I would.

indeed feyd has made another excellent point

Posted: Wed Jun 16, 2004 10:37 pm
by feyd
okay.. I spent a little time reversing how phpBB decodes the "standard" tags... add your custom tags as needed..

Code: Select all

<?php

//ini_set('display_errors','1');
//error_reporting(E_ALL);

$db = mysqli_connect('xxx','xxx','xxx','xxx') or die(mysqli_connect_error());

$query = mysqli_query($db, 'SELECT * FROM phpbb_posts_text p') or die(mysqli_error($db));

header('Content-type: text/plain');

while($row = mysqli_fetch_assoc($query))
{
	$uid =& $row['bbcode_uid'];
	$text =& $row['post_text'];
	$text = html_entity_decode($text,ENT_QUOTES);
	$tags = array(
		'#\[/?s:'.$uid.'\]#',
		'#\[/?u:'.$uid.'\]#',
		'#\[/?i:'.$uid.'\]#',
		'#\[/?b:'.$uid.'\]#',
		'#\[/?quote:'.$uid.'\]#',
		'#\[quote:'.$uid.'=".*?"\]#s',
		'#\[list:'.$uid.'\]#',
		'#\[\*:'.$uid.'\]#',
		'#\[/list:[uo]:'.$uid.'\]#',
		'#\[list=[aAiI1]:'.$uid.'\]#',
		'#\[color=(\#[0-9A-F]{6}|[a-z]+):'.$uid.'\]#',
		'#\[/color:'.$uid.'\]#',
		'#\[size=[1-2]?[0-9]:'.$uid.'\]#',
		'#\[/size:'.$uid.'\]#',
		'#\[/?img:'.$uid.'\]#',
		'#\[/?url[^]]*?\]#',
		'#\[/?email]#',
		'#\[/?code:1:'.$uid.']#',
	);
	
	$text = preg_replace($tags, '', $text);
	
	$entities_search = array('&lt;', '&gt;', '&quot;', '&#58;', '&#91;', '&#93;', '&#40;', '&#41;', '&#123;', '&#125;');
	$entities_replace = array('<', '>', '"', ':', '[', ']', '(', ')', '{', '}');
	$text = str_replace($entities_search, $entities_replace, $text);

	echo $text."\n\n\n";
}
?>
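To tie this back to the original question, here is a rough sketch of how the uid-aware stripping could feed a 100-character link title. The function make_link_title is hypothetical and not part of phpBB. Only a few of the tag patterns are repeated to keep it short, and it assumes the stored text is entity-encoded, as the html_entity_decode() call above suggests.

```php
<?php
// Hypothetical helper, not part of phpBB: builds a short link title from
// a stored post text and its bbcode uid.
function make_link_title($text, $uid, $length = 100)
{
    // Assumption: the stored text is entity-encoded, so decode it first.
    $text = html_entity_decode($text, ENT_QUOTES);

    $uid = preg_quote($uid, '#');
    $tags = array(
        '#\[/?[bisu]:' . $uid . '\]#', // [b], [i], [s], [u] and their end tags
        '#\[/?quote:' . $uid . '\]#',
        '#\[/?url[^]]*?\]#',
    );
    $text = preg_replace($tags, '', $text);

    // turn line breaks into spaces
    $text = str_replace(array("\r\n", "\r", "\n"), ' ', $text);

    // cut first, escape afterwards, so no entity is split by the cut
    return htmlspecialchars(substr($text, 0, $length), ENT_QUOTES);
}

echo make_link_title("[b:af1a041a54]SECTION 1\nGENERAL OVERVIEW[/b:af1a041a54]", 'af1a041a54');
// prints: SECTION 1 GENERAL OVERVIEW
```

Truncating before escaping is a deliberate choice here: cutting an already-escaped string at 100 characters can slice an entity such as &quot; in half.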

Posted: Wed Jun 16, 2004 11:27 pm
by johnperkins21
Wow, I am in awe as to how little I actually know about this stuff. I was just looking for a way to put a short line of the post text into a title so people could read a little before clicking the link, and you guys come up with this.

I have learned so much from this site... thanks guys.

Does anyone know of a good place to find more information on regular expressions? The tutorial on php.net doesn't seem that great. All those /\('\\/[]//')/ things are just way over my head.