[SOLVED] Help with removing bbCode

johnperkins21
Forum Contributor
Posts: 140
Joined: Mon Oct 27, 2003 4:57 pm

Post by johnperkins21 »

I'm pulling out the first 100 characters of my topic texts from my forum to put into a link title. What I'm trying to do now is to make sure that I strip out all of the bbCode.

This is what I have so far:

Code:

<?php
$data[$i]['topic_text'] = str_replace(chr(13),' ',addslashes(htmlentities($data[$i]['topic_text'])));
$data[$i]['topic_text'] = str_replace(chr(10),' ',$data[$i]['topic_text']);
$data[$i]['topic_text'] = substr($data[$i]['topic_text'], 0, 100);
?>

Now this isn't even close to what I want, so some of my titles look like this:

[b:af1a041a54]SECTION 1 GENERAL OVERVIEW[/b:af1a041a54]

I want to get rid of [b:af1a041a54] and [/b:af1a041a54].

Can anyone point me in the right direction? I've looked on php.net but I just don't get the regular expressions at all.

Thanks,
John
Last edited by johnperkins21 on Wed Jun 16, 2004 8:06 pm, edited 1 time in total.
dull1554
Forum Regular
Posts: 680
Joined: Sat Nov 22, 2003 11:26 am
Location: 42:21:35.359N, 76:02:20.688W

Post by dull1554 »

Well, is the bb tag always gonna be the same?
If so, yes, it's not too hard, but if the tag changes then it might be a little harder. I'm not too comfy with regexp.
johnperkins21
Forum Contributor
Posts: 140
Joined: Mon Oct 27, 2003 4:57 pm

Post by johnperkins21 »

No, it's probably going to change. I just want to delete any [] and whatever is between the []. There has to be a way, I'm just no good with regex at all.

I'm thinking I should find a [ then count how many characters until ], and then delete that many chars starting at the [. But I'm lost as to how to actually do that.
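
The find-the-bracket idea described above can be sketched without any regular expression. This is only an illustration: the function name strip_brackets_manually is made up here, and it assumes that everything between a [ and the next ] should be dropped.

```php
<?php
// Hypothetical helper (not from the thread): remove every [ ... ] span
// by locating the brackets with strpos() instead of a regular expression.
function strip_brackets_manually(string $text): string
{
    $offset = 0;
    while (($open = strpos($text, '[', $offset)) !== false) {
        $close = strpos($text, ']', $open + 1);
        if ($close === false) {
            break; // unmatched "[": leave the rest of the text alone
        }
        // Delete from "[" through "]" inclusive.
        $text = substr_replace($text, '', $open, $close - $open + 1);
        // Continue scanning from where the removed tag used to start.
        $offset = $open;
    }
    return $text;
}

echo strip_brackets_manually('[b:af1a041a54]SECTION 1 GENERAL OVERVIEW[/b:af1a041a54]');
// prints: SECTION 1 GENERAL OVERVIEW
```

It works, but it is a fair amount of bookkeeping for something a single preg_replace() call can do.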
markl999
DevNet Resident
Posts: 1972
Joined: Thu Oct 16, 2003 5:49 pm
Location: Manchester (UK)

Post by markl999 »

I'm no regex expert either, but try:

Code:

$data[$i]['topic_text'] = preg_replace('/(\[.[^\]]*\])/', '', $data[$i]['topic_text']);
johnperkins21
Forum Contributor
Posts: 140
Joined: Mon Oct 27, 2003 4:57 pm

Post by johnperkins21 »

Mark, you are da man. I'm gonna have to try and dissect that thing and figure out how it works, but it works.

Thank you very much.
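
For anyone dissecting that pattern, here is the same expression spread over several lines with the "x" (extended) modifier, so each piece can carry a comment. The sample title is the one from the first post; the variable names are only for illustration.

```php
<?php
// Same pattern as in the post above, written with the "x" modifier so that
// whitespace is ignored and "#" starts a comment inside the pattern.
$pattern = '/
    (          # start of a capture group (not needed for the replacement)
      \[       # a literal opening bracket
      .        # any single character except a newline
      [^\]]*   # zero or more characters that are not a closing bracket
      \]       # a literal closing bracket
    )
/x';

$title = '[b:af1a041a54]SECTION 1 GENERAL OVERVIEW[/b:af1a041a54]';
$clean = preg_replace($pattern, '', $title);

echo $clean; // prints: SECTION 1 GENERAL OVERVIEW
```

The negated class [^\]] is what stops the match at the first closing bracket instead of running on to the last one in the text.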
dull1554
Forum Regular
Posts: 680
Joined: Sat Nov 22, 2003 11:26 am
Location: 42:21:35.359N, 76:02:20.688W

Post by dull1554 »

it went just as far over your head as it did mine.....
tim
DevNet Resident
Posts: 1165
Joined: Thu Feb 12, 2004 7:19 pm
Location: ohio

Post by tim »

untested

Code:

<?php
// preg_replace() returns the new string, so assign the result back
$data[$i]['topic_text'] = preg_replace('/(\[.+?\])/', '', $data[$i]['topic_text']);
?>
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

phpBB's posts_text table stores a bbcode_uid field for each post... that uid is the key to getting all of the bbcode removed..
untested...

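Code:

<?php

// Opening tags may carry parameters (e.g. url=...), so they are listed
// separately from the closing tags. Keep both arrays in the same order.
$stripStartBB = array('b','i','url=[^]]*?','email'); // add the other ones you want to strip here...
$stripEndBB = array('b','i','url','email');

// Walk every index, including 0, so the first tag ('b') is handled too.
for($x = sizeof($stripStartBB)-1; $x >= 0; $x--)
{
  $startTag = $stripStartBB[$x];
  $endTag = $stripEndBB[$x];
  // [tag(:N)?:uid] text [/tag(:N)?:uid], where (:N)? is phpBB's optional nesting number
  $search = '#\['.$startTag.'(:\d+)?:'.$bbcode_uid.'](.*?)\[/'.$endTag.'(:\d+)?:'.$bbcode_uid.']#is';
  // Keep only the text between the tags (capture group 2).
  $text = preg_replace($search, '\\2', $text);
}
?>

$bbcode_uid comes from the bbcode_uid field in the table, and $text is the post text :o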

;) questions?
Last edited by feyd on Wed Jun 16, 2004 8:59 pm, edited 1 time in total.
tim
DevNet Resident
Posts: 1165
Joined: Thu Feb 12, 2004 7:19 pm
Location: ohio

Post by tim »

feyd wrote: ;) questions?
yeah, how does it work? lol

:P 8) :P
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

it looks for [one of the tags to strip_phpBB nested tags :<some number>_:bbcode user id]all text in between[/end tag that corresponds to the start tag currently being searched for_phpBB nested tags :<some number>_:bbcode user id]

kinda complicated sounding.. but it looks for the specific tags phpBB uses..

the bits between underscores may or may not exist, depending on the nesting status...
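
A small worked example of that description, as a sketch only: the uid and the post text are made-up sample values, and the [i:1:...] form stands in for a tag that carries phpBB's optional nesting number.

```php
<?php
// Assumed sample values: the uid and the post text are invented for illustration.
$bbcode_uid = 'af1a041a54';
$text = 'Intro [b:af1a041a54]bold[/b:af1a041a54] and [i:1:af1a041a54]nested[/i:1:af1a041a54] text';

foreach (array('b', 'i') as $tag) {
    $search = '#\[' . $tag . '(:\d+)?:' . $bbcode_uid . '\]'     // opening tag, optional :N
            . '(.*?)'                                             // text in between (group 2)
            . '\[/' . $tag . '(:\d+)?:' . $bbcode_uid . '\]#is';  // matching closing tag
    // Replace the whole tagged span with just the text it wrapped.
    $text = preg_replace($search, '\\2', $text);
}

echo $text; // prints: Intro bold and nested text
```

Because the uid is part of the pattern, only tags that phpBB itself wrote into the post are touched.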
Last edited by feyd on Wed Jun 16, 2004 9:01 pm, edited 1 time in total.
tim
DevNet Resident
Posts: 1165
Joined: Thu Feb 12, 2004 7:19 pm
Location: ohio

Post by tim »

well all tags start with [ and end with ]

therefore:
preg_replace('/(\[.+?\])/', '', $data[$i]['topic_text']); or mark's solution would work.
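
A quick check that the two generic patterns agree on a single-line title. The sample string is invented for illustration; it reuses the uid from the first post.

```php
<?php
// Made-up sample title with two sets of tags.
$title = '[b:af1a041a54]SECTION 1 [i:af1a041a54]GENERAL[/i:af1a041a54] OVERVIEW[/b:af1a041a54]';

$lazy    = preg_replace('/(\[.+?\])/', '', $title);      // lazy quantifier
$negated = preg_replace('/(\[.[^\]]*\])/', '', $title);  // negated character class

var_dump($lazy === $negated); // bool(true)
echo $lazy;                   // SECTION 1 GENERAL OVERVIEW
```

One difference to keep in mind: "." does not match a newline by default, so the lazy version will not remove a bracketed span that runs across a line break, while the negated class will.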
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

it may accidentally strip out actual text though.. so be careful.
tim
DevNet Resident
Posts: 1165
Joined: Thu Feb 12, 2004 7:19 pm
Location: ohio

Post by tim »

very true, but who would enclose text in those brackets?

hmmm, maybe I would.

indeed feyd has made another excellent point
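
To make the risk concrete, here is a sketch with a made-up post that talks about PHP arrays. It compares the generic pattern with one that only accepts tags carrying the post's uid; the sample text and variable names are assumptions for illustration.

```php
<?php
// Made-up sample: a post that legitimately contains square brackets.
$uid  = 'af1a041a54';
$text = '[b:af1a041a54]Tip:[/b:af1a041a54] use $row[0] or $data[$i] here';

// Generic pattern from earlier in the thread: removes anything in brackets.
$generic = preg_replace('/(\[.+?\])/', '', $text);

// uid-aware pattern: only removes [b:<uid>] and [/b:<uid>].
$specific = preg_replace('#\[/?b:' . preg_quote($uid, '#') . '\]#', '', $text);

echo $generic, "\n";  // Tip: use $row or $data here
echo $specific, "\n"; // Tip: use $row[0] or $data[$i] here
```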
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

okay.. I spent a little time reversing how phpBB decodes the "standard" tags... add your custom tags as needed..

Code:

<?php

//ini_set('display_errors','1');
//error_reporting(E_ALL);

// host, user, password, database (the mysql_* functions are not available in PHP 7 and later)
$link = mysqli_connect('xxx','xxx','xxx','xxx') or die(mysqli_connect_error());

$query = mysqli_query($link, 'SELECT * FROM phpbb_posts_text p') or die(mysqli_error($link));

header('Content-type: text/plain');

while($row = mysqli_fetch_assoc($query))
{
	$uid =& $row['bbcode_uid'];
	$text =& $row['post_text'];
	$text = html_entity_decode($text,ENT_QUOTES);
	$tags = array(
		'#\[/?s:'.$uid.'\]#',
		'#\[/?u:'.$uid.'\]#',
		'#\[/?i:'.$uid.'\]#',
		'#\[/?b:'.$uid.'\]#',
		'#\[/?quote:'.$uid.'\]#',
		'#\[quote:'.$uid.'=".*?"\]#s',
		'#\[list:'.$uid.'\]#',
		'#\[\*:'.$uid.'\]#',
		'#\[/list:[uo]:'.$uid.'\]#',
		'#\[list=[aAiI1]:'.$uid.'\]#',
		'#\[color=(\#[0-9A-F]{6}|[a-z]+):'.$uid.'\]#',
		'#\[/color:'.$uid.'\]#',
		'#\[size=[1-2]?[0-9]:'.$uid.'\]#',
		'#\[/size:'.$uid.'\]#',
		'#\[/?img:'.$uid.'\]#',
		'#\[/?url[^]]*?\]#',
		'#\[/?email]#',
		'#\[/?code:1:'.$uid.']#',
	);
	
	$text = preg_replace($tags, '', $text);
	
	$entities_search = array('&lt;', '&gt;', '&quot;', '&#58;', '&#91;', '&#93;', '&#40;', '&#41;', '&#123;', '&#125;');
	$entities_replace = array('<', '>', '"', ':', '[', ']', '(', ')', '{', '}');
	$text = str_replace($entities_search, $entities_replace, $text);

	echo $text."\n\n\n";
}
?>
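
Tying this back to the original question, here is a minimal sketch of a helper that turns a post into a short link title. The name make_link_title is hypothetical. The function assumes $text has already been entity-decoded as in the script above, and it uses one simplified pattern for any tag ending in the post's uid rather than the full tag list (so uid-less tags such as [url] are left alone).

```php
<?php
// Hypothetical helper: build a short, attribute-safe link title from a post.
// Assumes $text is already entity-decoded and $uid is the post's bbcode_uid.
function make_link_title(string $text, string $uid, int $maxLength = 100): string
{
    // Remove any tag that ends in this post's uid, e.g. [b:uid] or [/size:uid].
    $text = preg_replace('#\[[^\]]*:' . preg_quote($uid, '#') . '\]#', '', $text);
    // Collapse line breaks and runs of whitespace into single spaces.
    $text = trim(preg_replace('/\s+/', ' ', $text));
    // Truncate first, then escape, so an entity is never cut in half.
    // Note: substr() counts bytes; multibyte text would need mb_substr().
    return htmlspecialchars(substr($text, 0, $maxLength), ENT_QUOTES);
}

echo make_link_title("[b:af1a041a54]SECTION 1\r\nGENERAL OVERVIEW[/b:af1a041a54]", 'af1a041a54');
// prints: SECTION 1 GENERAL OVERVIEW
```

Stripping the tags before truncating means the 100 characters are spent on readable text rather than on markup.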
johnperkins21
Forum Contributor
Posts: 140
Joined: Mon Oct 27, 2003 4:57 pm

Post by johnperkins21 »

Wow, I am in awe as to how little I actually know about this stuff. I was just looking for a way to put a short line of the post text into a title so people could read a little before clicking the link, and you guys come up with this.

I have learned so much from this site... thanks guys.

Does anyone know of a good place to find more information on regular expressions? The tutorial on php.net doesn't seem that great. All those /\('\\/[]//')/ things are just way over my head.