make relative urls not relative
Posted: Fri Jan 23, 2009 3:17 am
by shiznatix
As part of this RSS feed I am writing, I want to turn all relative URLs into non-relative URLs, but I am of course having trouble (otherwise, why would I post?).
This is my matching regex (I'm just starting with matching before I move on to the preg_replace):
Code:
preg_match('#<a href="([^http://www\.domain\.com].*)">#', $text, $matches);
What I want is for a link that does not start with http://www.domain.com to be returned in $matches. What it is doing now is refusing to return a link if it contains any of those letters. I want it to behave like "starts with" instead of "contains anywhere". How do I do this?
Re: make relative urls not relative
Posted: Fri Jan 23, 2009 4:01 am
by prometheuzz
Everything between '[' and ']' (also called a character class, or character set) will match just a single character. So, this part of your expression:
[^http://www\.domain\.com]
will match any (single!) character except 'h', 't', 'p', ':', '/', 'w', '.', 'd', 'o', 'm', 'a', 'i', 'n' and 'c'.
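To see the single-character behaviour concretely, here is a small sketch that runs the pattern from the question against two relative links. The file names help.html and blog.html are made up for illustration; only the first character of each href decides the outcome.

```php
<?php
// The pattern from the question: the negated class tests ONE character only.
$class = '#<a href="([^http://www\.domain\.com].*)">#';

// "help.html" is relative, but it starts with 'h', which is in the class,
// so the link is rejected.
var_dump(preg_match($class, '<a href="help.html">'));

// "blog.html" starts with 'b', which is not in the class, so it matches.
var_dump(preg_match($class, '<a href="blog.html">'));
```

Both links are relative, yet only the second one is found, which is why a lookahead is a better fit than a character class here.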
What you're looking for is probably something like this:
Code:
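<?php
// Sample input: two absolute links and two relative ones.
$text = 'text <a href="http://www.domain.com/foo">foo</a>
text <a href="/foo2">foo2</a> more text to ignore
text <a href="http://www.domain.com/bar">bar</a>
text <a href="bar2">bar2</a> and this is the end.';
echo '<pre>';
echo $text;
echo '</pre>';
// (?<=<a\shref=") requires the match to come right after <a href=",
// (?!http://) rejects URLs that start with http://,
// [^"]+ then takes everything up to the closing quote.
if (preg_match_all('@(?<=<a\shref=")(?!http://)[^"]+@', $text, $matches)) {
    echo '<pre>';
    print_r($matches);
    echo '</pre>';
}
?>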
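As a rough check of what that pattern picks up, the sketch below spells it out piece by piece with the x (extended) modifier and runs it on the same sample text. The variable name $pattern is my own; the regex itself is unchanged apart from the layout and comments.

```php
<?php
$text = 'text <a href="http://www.domain.com/foo">foo</a>
text <a href="/foo2">foo2</a> more text to ignore
text <a href="http://www.domain.com/bar">bar</a>
text <a href="bar2">bar2</a> and this is the end.';

// Same regex as above, laid out with the x modifier so each part can be
// commented. Whitespace and # comments inside the pattern are ignored.
$pattern = '@
    (?<=<a\shref=")   # lookbehind: the match must directly follow <a href="
    (?!http://)       # negative lookahead: the URL must not start with http://
    [^"]+             # the URL itself, up to the closing quote
@x';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches: the two relative links.
print_r($matches[0]);
```

Only /foo2 and bar2 should come back. Note that (?!http://) tests for that exact prefix, so a link starting with https:// would also be reported as relative.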
Re: make relative urls not relative
Posted: Fri Jan 23, 2009 4:07 am
by prometheuzz
... but if you just want to change the paths, why match them first? You could simply change them in one step using preg_replace(...):
Code:
$text = preg_replace('@(?<=<a\shref=")(?!http://)/?([^/][^"]+)@', 'http://www.domain.com/$1', $text);
(untested!)
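A few inputs trip up the one-liner above: a one-character path such as href="a" is left untouched (the pattern needs at least two characters), and an https:// link passes the (?!http://) test and gets the domain glued in front of it. Below is a slightly more defensive sketch. The function name absolutize_links and the $base parameter are hypothetical, not something from this thread, and like the original it assumes every relative path should hang off the site root and that links are written exactly as <a href="...">.

```php
<?php
// Hypothetical helper (not from the thread): prefixes relative hrefs with
// $base and leaves scheme URLs, protocol-relative URLs and #anchors alone.
function absolutize_links(string $html, string $base): string
{
    $pattern = '@
        (?<=<a\shref=")              # must directly follow <a href="
        (?=[^"])                     # skip empty href values
        (?![a-z][a-z0-9+.-]*:|//|\#) # skip http:, https:, mailto:, //host, #anchor
        /?([^"]*)                    # optional leading slash, then the path
    @xi';

    // A callback avoids $ or \ in $base being read as replacement syntax.
    return preg_replace_callback(
        $pattern,
        function (array $m) use ($base): string {
            return rtrim($base, '/') . '/' . $m[1];
        },
        $html
    );
}

$text = '<a href="/foo2">foo2</a> <a href="bar2">bar2</a> '
      . '<a href="a">a</a> <a href="https://example.org/">ext</a>';

echo absolutize_links($text, 'http://www.domain.com');
```

For anything beyond simple feed content, an HTML parser such as DOMDocument is probably a safer choice than a regex, since attribute order and quoting vary in real markup.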
Re: make relative urls not relative
Posted: Fri Jan 23, 2009 4:16 am
by shiznatix
Yes sir, that's what I needed. Super thanks!
Re: make relative urls not relative
Posted: Fri Jan 23, 2009 4:18 am
by prometheuzz
Good to hear it, and you're welcome.