How do I extract the URL after the domain name?

Posted: Fri Jan 15, 2016 11:07 am
by simonmlewis
I have a URL such as this: http://www.thisurl.co.uk/fred/bloggs/123
Or perhaps: http://www.thisurl.co.uk/fred-bloggs

What I want is everything from the first / after "uk" onwards.
So $url would be "/fred-bloggs", or for the first example "/fred/bloggs/123".
$url = 'http://www.vimeo.com/1234567';
$str = substr(strrchr($url, '/'), 1);
echo $str; // Output: 1234567
I found this, but it's not what I want, as I think it works from the right and counts inward for the / characters. For me, that would have to change at least three times for my purposes. So I need it to find the "third" slash from the left, and include that slash in the result.

Possible?

Re: How do I extract the URL after the domain name?

Posted: Fri Jan 15, 2016 11:34 am
by Celauran
You could do this a few different ways. Is the domain yours (i.e. the same one running the code) or external? If it's yours, you could lean on either $_SERVER or a config setting. Alternatively, you could use a regex like so: http://rubular.com/r/1L5VU4VMQ6
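
A minimal sketch of the regex route in PHP, for anyone who wants it inline. The pattern below is an assumption and is not necessarily the one behind the rubular link; pathAfterDomain() is a made-up helper name, and the type hints assume PHP 7 or later. It skips the scheme and host, then captures from the third slash onward.

```php
<?php
// Hypothetical helper: returns everything from the first "/" after the host,
// or "/" when the URL has no path at all.
function pathAfterDomain(string $url): string
{
    // "scheme://host" is matched but not captured; group 1 starts at the third slash.
    if (preg_match('~^[a-z][a-z0-9+.-]*://[^/]+(/.*)?$~i', $url, $m)) {
        return $m[1] ?? '/';
    }
    return $url; // not an absolute URL, so hand it back unchanged
}

echo pathAfterDomain('http://www.thisurl.co.uk/fred/bloggs/123'), "\n"; // /fred/bloggs/123
echo pathAfterDomain('http://www.thisurl.co.uk/fred-bloggs'), "\n";     // /fred-bloggs
```

The captured part includes any query string or fragment, since the pattern simply takes the rest of the string.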

Re: How do I extract the URL after the domain name?

Posted: Fri Jan 15, 2016 12:27 pm
by simonmlewis
It's our URL. I was trying with $_SERVER, but that picks up the whole thing.
I assume I could just say "pick up everything after character number x", character x being "k".
Or better yet, say to pick up the third / and onwards.

Re: How do I extract the URL after the domain name?

Posted: Fri Jan 15, 2016 12:35 pm
by Celauran
simonmlewis wrote: It's our URL. I was trying with $_SERVER, but that picks up the whole thing.
What were you trying? $_SERVER['REQUEST_URI'] should be exactly what you're looking for.
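
A small sketch of what that looks like in practice, assuming PHP 7 or later. requestPath() is a hypothetical helper, and it takes the server array as a parameter only so the example can run outside a web request; in a real page you would pass $_SERVER.

```php
<?php
// Hypothetical helper: pass $_SERVER in a real request.
function requestPath(array $server): string
{
    // REQUEST_URI normally holds the path plus any query string,
    // without the scheme or host.
    $uri = $server['REQUEST_URI'] ?? '/';
    // Keep only the part before the first "?" when just the path is wanted.
    $path = explode('?', $uri, 2)[0];
    return $path === '' ? '/' : $path;
}

// Simulated request for http://www.thisurl.co.uk/fred/bloggs/123?ref=abc
$fake = ['REQUEST_URI' => '/fred/bloggs/123?ref=abc'];
echo requestPath($fake), "\n"; // /fred/bloggs/123
```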

Re: How do I extract the URL after the domain name?

Posted: Fri Jan 15, 2016 3:06 pm
by simonmlewis
Oh yes, sorry. Thanks a mill.

Re: How do I extract the URL after the domain name?

Posted: Sun Jan 17, 2016 1:11 am
by Weirdan
And for an arbitrary URL you could've used parse_url():

Code: Select all

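$parsed = parse_url('http://www.thisurl.co.uk/fred/bloggs/123');
// 'query' and 'fragment' are only set when the URL contains them, so guard each one
$tail = (isset($parsed['query']) ? '?' . $parsed['query'] : '') . (isset($parsed['fragment']) ? '#' . $parsed['fragment'] : '');
var_dump($parsed['path'] . $tail); // $parsed['path'] may be enough, depending on your requirements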

Re: How do I extract the URL after the domain name?

Posted: Mon Jan 18, 2016 3:06 am
by simonmlewis
The purpose of this is to replace old URLs with new ones, as we are moving a site to a new system.
We want to do manual 301s.

The problem now is that if the old URL does not fit the .htaccess structure, it throws an "Object Not Found" in Firefox.
If the old URL is in a correct style, i.e. categ/number/word, then it works.

I want to be able to catch them all. Do I need to put something in .htaccess that basically allows anything, and then let the 301s take over?

ie:
/long-shirt/115
becomes
/long-shirt

This style isn't in htaccess, as we ask for /categ/word.

Re: How do I extract the URL after the domain name?

Posted: Mon Jan 18, 2016 6:25 am
by Celauran
How are you handling routes in the new system? Is it still a bunch of .htaccess rules or are you handling it in the code itself, say via a front controller?

Re: How do I extract the URL after the domain name?

Posted: Mon Jan 18, 2016 6:32 am
by simonmlewis
.htaccess has the structure of all the URLs.
So if a structure is categ/number/name but the requested URL is /name/number, then it falls down.

I have however figured out how to accept *.aspx, and then reroute that to /word.
That then allows me to do manual 301s, to point privacy.aspx to /privacy.

It's the longer URLs that I have an issue with.

Another example - we have lots of Not Founds for URLs like this:
/categ/115/long-sleeved-shirt//xl

It's likely that at some stage this URL with the invalid "//" has been picked up, but now I cannot seem to do 301s to point it to the correct URL, because // is not valid in my structure.

Re: How do I extract the URL after the domain name?

Posted: Mon Jan 18, 2016 6:48 am
by Celauran
That feels like a good argument for moving toward a front controller pattern and some proper routing. That said, are you able to determine from these partial URIs what the correct route should be? If so, you could let unknown routes fall through to index.php, attempt to work out the route there, and either fire a redirect or return a 404.

Re: How do I extract the URL after the domain name?

Posted: Mon Jan 18, 2016 7:36 am
by simonmlewis
That's kind of what I am trying to do.
I want to stick with our .htaccess because that's how it's all written. But I want the real oddities to pass through to a *.php file that can check whether that URL is in the database as a recognised "issued URL", and then route it to a better place that is relevant to the original URL. If that makes sense.

Re: How do I extract the URL after the domain name?

Posted: Mon Jan 18, 2016 7:48 am
by Celauran
The trouble there is what criteria to base it on. It's difficult for me to say without really knowing the structure of your project or where things are being redirected. Sending all requests through the front controller and then parsing your routes would make that relatively easy to implement. What does your .htaccess look like? I'm sure we can figure something out.

Re: How do I extract the URL after the domain name?

Posted: Mon Jan 18, 2016 7:51 am
by simonmlewis
At the moment, it is this:

Code: Select all

DirectoryIndex  index.php index.html index.htm
order allow,deny
allow from all 

RewriteEngine On
Options +FollowSymLinks
Options +Indexes
RewriteRule ^(blog)($|/) - [QSA]

RewriteRule ^categ/([0-9]+)/([^/]+) /index.php?page=categ&c=$1&cname=$2 [QSA]

RewriteRule ^categ/([^/]+) /index.php?page=categ&cname=$1 [QSA]

RewriteRule ^categ/page/([0-9]+)/([^/]+)/([0-9]+) /index.php?page=categ&c=$1&cname=$2&pagenum=$3 [QSA]

RewriteRule ^product/([^/]+)/([0-9]+)/([^/]+) /index.php?page=product&cname=$1&product=$2&h=$3 [QSA]
RewriteRule ^brands/([^/]+) /index.php?page=brands&brand=$1 [QSA]
RewriteRule ^type-of-product/([^/]+)/page/([0-9]+) /index.php?page=type-of-product&producttype=$1&pagenum=$2 [L]
RewriteRule ^type-of-product/([^/]+) /index.php?page=type-of-product&producttype=$1 [QSA]

RewriteRule ^([^/\.]+)/?$ index.php?page=$1 [QSA]
RewriteRule ^([^/\.]+)\.aspx$ index.php?page=$1 [QSA]

RewriteRule ^$ index.php?page=home [QSA]

The aspx rule, second from the bottom, I figured out myself from a bit of research, as at the moment their web site uses *.aspx for many pages. Without it, I cannot say:

/fred.aspx 301 to /categ/t-shirts

Re: How do I extract the URL after the domain name?

Posted: Mon Jan 18, 2016 8:16 am
by Celauran
Looks like most of those should probably have [L] at the end. Given that everything points to index.php anyhow, you could possibly add a catch-all route like

RewriteRule . index.php?checkroute=true [QSA,L]

If $_GET['checkroute'] is set, run $_SERVER['REQUEST_URI'] through some detection script to see if you can determine what the correct route is, then redirect or 404 depending on the outcome.
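
A rough sketch of what such a detection script might look like, assuming PHP 7.1 or later. Everything named here is hypothetical: LEGACY_ROUTES is a plain array standing in for the database table of issued URLs, the function names are made up, and the target for the "//xl" example is only a placeholder. The idea is to normalise the requested path first (drop the query string, collapse doubled slashes), then look it up and decide between a 301 and a 404.

```php
<?php
// Hypothetical lookup table standing in for the database of "issued URLs".
// Keys are normalised old paths, values are the new destinations.
const LEGACY_ROUTES = [
    '/long-shirt/115' => '/long-shirt',
    '/fred.aspx' => '/categ/t-shirts',
    // Placeholder target, for illustration only.
    '/categ/115/long-sleeved-shirt/xl' => '/categ/115/long-sleeved-shirt',
];

function normalisePath(string $requestUri): string
{
    $path = explode('?', $requestUri, 2)[0];      // ignore the query string
    $path = preg_replace('~/{2,}~', '/', $path);  // collapse "//" into "/"
    $path = rtrim($path, '/');                    // drop any trailing slash
    return $path === '' ? '/' : $path;
}

// Returns the status to send and, for a redirect, the new location.
function handleUnknownRoute(string $requestUri, array $routes): array
{
    $target = $routes[normalisePath($requestUri)] ?? null;
    return $target === null
        ? ['status' => 404, 'location' => null]
        : ['status' => 301, 'location' => $target];
}

// In index.php this would run only when $_GET['checkroute'] is set,
// and would be given $_SERVER['REQUEST_URI'].
$result = handleUnknownRoute('/categ/115/long-sleeved-shirt//xl', LEGACY_ROUTES);
echo $result['status'], ' ', $result['location'] ?? '-', "\n"; // 301 /categ/115/long-sleeved-shirt
```

In a live request the 301 branch would typically call header('Location: ' . $result['location'], true, 301) and exit, and the 404 branch would call http_response_code(404) before rendering the not-found page.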

Re: How do I extract the URL after the domain name?

Posted: Mon Jan 18, 2016 8:21 am
by Celauran
I'm also going to leave this here: https://youtu.be/z9Bg9lSTUQM. Well worth the watch, especially since it seems like your project and the one he's discussing are quite similar.