
Hello and help please! (url parsing)

Posted: Fri Feb 04, 2005 4:35 pm
by TeMPeST2000
Hi,

Looks like I found a decent forum. I've been an ad-hoc PHP developer for several years now, but some simple things still elude me, so I hope someone can help fill in the blanks. :)

This problem has been bugging me for a while now and I can't find a simple solution anywhere:

Just like this forum, I am using phpBB. Now, there is a specific URL which people enter on my site, let's call it just.some.com. I want this URL parsed so that it goes through a forwarder of my choice.

So for example the url the user enters is:

http://just.some.com/file.php
I want it to end up on the forum as:
http://www.myforward.net/forward.php?ht ... m/file.php

I obviously need to use a regular expression, but I've tried various things and I just can't get it to work properly.

I have located the url parsing area within bbcode.php in phpbb:

Code: Select all

/**
 * Rewritten by Nathan Codding - Feb 6, 2001.
 * - Goes through the given string, and replaces xxxx://yyyy with an HTML <a> tag linking
 * 	to that URL
 * - Goes through the given string, and replaces www.xxxx.yyyy[zzzz] with an HTML <a> tag linking
 * 	to http://www.xxxx.yyyy[/zzzz]
 * - Goes through the given string, and replaces xxxx@yyyy with an HTML mailto: tag linking
 *		to that email address
 * - Only matches these 2 patterns either after a space, or at the beginning of a line
 *
 * Notes: the email one might get annoying - it's easy to make it more restrictive, though.. maybe
 * have it require something like xxxx@yyyy.zzzz or such. We'll see.
 */
function make_clickable($text)
{

	// pad it with a space so we can match things at the start of the 1st line.
	$ret = ' ' . $text;

	// matches an "xxxx://yyyy" URL at the start of a line, or after a space.
	// xxxx can only be alpha characters.
	// yyyy is anything up to the first space, newline, comma, double quote or <
	$ret = preg_replace("#(^|[\n ])([\w]+?://[^ \"\n\r\t<]*)#is", "\\1<a href=\"\\2\" target=\"_blank\">\\2</a>", $ret);


	// matches a "www|ftp.xxxx.yyyy[/zzzz]" kinda lazy URL thing
	// Must contain at least 2 dots. xxxx contains either alphanum, or "-"
	// zzzz is optional.. will contain everything up to the first space, newline, 
	// comma, double quote or <.
	$ret = preg_replace("#(^|[\n ])((www|ftp)\.[^ \"\t\n\r<]*)#is", "\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a>", $ret);

	// matches an email@domain type address at the start of a line, or after a space.
	// Note: Only the followed chars are valid; alphanums, "-", "_" and or ".".
	$ret = preg_replace("#(^|[\n ])([a-z0-9&\-_.]+?)@([\w\-]+\.([\w\-\.]+\.)*[\w]+)#i", "\\1<a href=\"mailto:\\2@\\3\">\\2@\\3</a>", $ret);

	// Remove our padding..
	$ret = substr($ret, 1);

	return($ret);
}
I have tried all sorts of preg_replace functions, similar to the url parsing line in the code above, but nothing works.

Could anyone please give me a definitive solution for this? It will save me a lot of headache and I'm sure this is fairly simple.

Thanks,
T.

Posted: Fri Feb 04, 2005 4:46 pm
by Burrito
if you want to pass it as a url var, couldn't you just grab the address they're on using $_SERVER['vars'] and then append it to the redirect string?

ex:

they hit: http://www.mysite.com

you want them here:

http://www.yoursite.com

you append the url string with something like:

http://www.yoursite.com?ol=http://www.mysite.com

Posted: Fri Feb 04, 2005 4:52 pm
by TeMPeST2000
thanks, but that's not quite the situation...

This is a url in a forum post which the user has entered...
The url is in $ret, so I have been trying even something as simple as :

Code: Select all

$ret = preg_replace("just.some.com","unga.bunga.com/forward.php?http://just.some.com", $ret);
But this seems to be causing some strange problems and parsing every url. :(

I need a correct preg_replace line, similar to the one in the bbcode.php in my first post...
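
One likely culprit in the line above: a PCRE pattern in PHP needs delimiters (such as #...#), and an unescaped "." matches any character. Without delimiters, current PHP versions emit a warning and preg_replace() returns NULL instead of the text. Below is a minimal sketch of a delimited, escaped version. The function name forward_plain() is made up for illustration, the host and forwarder are the placeholders from this post, and it assumes the forwarder accepts the raw target URL after the "?".

```php
<?php
// Hypothetical helper, not part of phpBB: prefixes URLs on one host with a forwarder.
function forward_plain($text)
{
    $host      = 'just.some.com';                       // placeholder host from the post
    $forwarder = 'http://unga.bunga.com/forward.php?';  // placeholder forwarder from the post

    // preg_quote() escapes the dots; '#' is passed so the delimiter is escaped too.
    // The lookahead requires the host to end here, so just.some.com.other.net is left alone.
    $pattern = '#(^|[\n ])(http://' . preg_quote($host, '#') . '(?=[/ \n]|$))#i';

    // $1 is the leading space/newline (or empty), $2 is the matched scheme and host.
    return preg_replace($pattern, '$1' . $forwarder . '$2', $text);
}

echo forward_plain("see http://just.some.com/file.php and http://www.any.com/x.php"), "\n";
```

If this runs before make_clickable(), the visible link text will also show the forwarder address, because the URL itself has been rewritten.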

Posted: Fri Feb 04, 2005 5:05 pm
by Burrito
ahh I think I understand now.

you'll definitely need some regular expressions to do that. If you're catching it after bbcode has done its thing, you could search for the "href="" and replace everything between it and the "> with whatever you want.

I'm definitely not a reg expression pro, but I know you'll need something *like* this:

$rep = "#<\s(a href="|A href=")\s+.*?>#si";
$yourstring = preg_replace($rep,"<\\1yourdomain.com>",$yourstring);

I'd be VERY surprised if that works...but at least it should give you an idea.

Posted: Fri Feb 04, 2005 5:15 pm
by Burrito
I tried what I posted...and no go so I did a lil' work and came up with this:

$yourstring = "<a href=\"http://www.mydomain.com\">here</a>";
$rep = "#<\s*(a)\s+.*?>#si";
$yourstring = preg_replace($rep,"<\\1 href=\"http://www.blab.com\">",$yourstring);
echo $yourstring;

it's not exactly what you want, but it's better than my last post :P

Posted: Fri Feb 04, 2005 5:25 pm
by TeMPeST2000
Thanks for trying.. it's almost what I need but I only want one specific domain to be changed... all the rest I'd like to be left alone..

so http://www.any.com stays http://www.any.com....
but I want http://www.specificdomain.com to change to http://www.mydomain.com/forward.php?www ... domain.com

I am at my reg exp knowledge limit I'm afraid.. I can almost see the line I want but just can't make it work! :(

Posted: Fri Feb 04, 2005 5:41 pm
by feyd
untested

Code: Select all

$ret = preg_replace("#(^|[\n ])([\w]+?://[^ \"\n\r\t<]*)#ise", '\'\\1<a href="http://www.yourdomain.com/redirect.php?\' . urlencode(\'\\2\') . \'" target="_blank">\\2</a>\'', $ret);
$ret = preg_replace("#(^|[\n ])((www|ftp)\.[^ \"\t\n\r<]*)#ise", '\'\\1<a href="http://www.yourdomain.com/redirect.php?\' . urlencode(\'http://\\2\') . \'" target="_blank">\\2</a>\'', $ret);

Posted: Fri Feb 04, 2005 5:52 pm
by TeMPeST2000
Thanks feyd, that looks like it's getting there but... I can't see how it is recognising the specificdomain.com. Looks like that code wraps the forwarder url around every url, but I only want to do it for one domain.

Posted: Fri Feb 04, 2005 6:01 pm
by feyd
tell it to call a function or use preg_replace_callback then.

create this function so that you can check whether the url is for that domain or not...
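
A rough sketch of that idea, assuming a reasonably recent PHP (the "e" modifier used earlier was removed in PHP 7, so preg_replace_callback() is the route that still works). The names forward_link() and make_clickable_forwarded() are invented for illustration, and the host and forwarder URL are the placeholders used in this thread. It also assumes the forwarder takes the URL-encoded target after the "?", as in the urlencode() version above.

```php
<?php
// Hypothetical callback: builds the <a> tag, routing only one host through the forwarder.
function forward_link($matches)
{
    $url  = $matches[2];                      // the full xxxx://yyyy URL
    $host = parse_url($url, PHP_URL_HOST);    // null or false if no host can be found

    $href = $url;
    if (is_string($host) && strtolower($host) === 'www.specificdomain.com') {
        // Placeholder forwarder address; urlencode() keeps the target intact as a query string.
        $href = 'http://www.mydomain.com/forward.php?' . urlencode($url);
    }

    // $matches[1] is the leading space/newline that the pattern consumed.
    return $matches[1] . '<a href="' . $href . '" target="_blank">' . $url . '</a>';
}

// Same first pattern as make_clickable(), but with a callback instead of a fixed replacement.
function make_clickable_forwarded($text)
{
    $ret = ' ' . $text;  // pad so a URL at the very start still matches
    $ret = preg_replace_callback("#(^|[\n ])([\w]+?://[^ \"\n\r\t<]*)#is", 'forward_link', $ret);
    return substr($ret, 1);
}

echo make_clickable_forwarded("go http://www.specificdomain.com/file.php or http://www.any.com"), "\n";
```

Only the href changes for the chosen host; the link text still shows the address the user typed, and every other URL is linked exactly as before. The www/ftp pattern could be given the same treatment with a second callback.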