Re: How do I extract the URL after the domain name?
Posted: Mon Jan 18, 2016 8:34 am
by simonmlewis
I'm not sure what you mean by checkroute.
I think either I am looking at this too simply, or what I want to do isn't possible.
/me/2/you//and-again
Let's say this is a URL that someone has used to link to our web site.
That URL doesn't exist. Its structure is bad: it has an empty // where there should be /somethinghere/. So it throws a fit.
I want in the PHP file to say:
/me/2/you//and-again
Is now going to:
/me/2/you/and-again
Simple as that. Copy the top line into OLD and the bottom line into NEW, so when someone goes to the old URL, it's 301'd to the new one.
I've written the script and the db table to run it. It works for many cases, but not for URL structures that aren't defined on the system. So is it possible, at the end of .htaccess (or the start), to add a line that allows ANY structure through?
Then in my module, I can handle that bad URL and take them somewhere nicer.
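For illustration, the old-to-new lookup described above could be sketched roughly like this. The `$redirects` array is only a stand-in for the db table, and `lookup_redirect()` and `collapse_slashes()` are hypothetical names, not anything from the actual script. Collapsing repeated slashes is an optional fallback and assumes the extra / is the only thing wrong with the URL.

```php
<?php
// Hypothetical stand-in for the OLD => NEW table held in the database.
$redirects = [
    '/me/2/you//and-again' => '/me/2/you/and-again',
];

/**
 * Look up a requested path in the old => new map.
 * Returns the new path, or null when no redirect is recorded.
 */
function lookup_redirect(string $path, array $redirects): ?string
{
    return $redirects[$path] ?? null;
}

/**
 * Optional fallback: collapse any run of slashes into a single slash.
 * Only useful when the bad URL differs from the good one by extra slashes.
 */
function collapse_slashes(string $path): string
{
    // preg_replace() returns null only on a regex error; keep the input then.
    return preg_replace('#/{2,}#', '/', $path) ?? $path;
}
```

With the sample row above, both `lookup_redirect('/me/2/you//and-again', $redirects)` and `collapse_slashes('/me/2/you//and-again')` give `/me/2/you/and-again`. The table lookup is the safer of the two, since it only redirects URLs that have been mapped on purpose.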
Re: How do I extract the URL after the domain name?
Posted: Mon Jan 18, 2016 10:41 am
by Celauran
I understand what you're trying to do, though I thought the // indicated an expected parameter was missing rather than just an extra / having been inserted. Not super important in any case. All I was proposing was a catch-all route to which you could append a flag letting you know the request failed to match any of your defined patterns, so that you could transform it however you needed to and send back a redirect. In short:
Code:
if (isset($_GET['checkroute'])) {
    $route = some_function_here();
    if ($route) {
        // redirect to new route
    } else {
        // trigger 404
    }
}
That's obviously pseudocode, but it should give a good idea of what I was suggesting.
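One way the pseudocode might be filled in, as a sketch only: `resolve_route()` and the `$lookup` callable are hypothetical names, with `$lookup` standing in for whatever function maps an old path to a new one (returning null when there is no match). The decision is kept separate from the actual header calls so it can be checked on its own.

```php
<?php
/**
 * Decide what to do with a request.
 *
 * $query  - the query-string parameters (normally $_GET)
 * $path   - the path that was originally requested
 * $lookup - hypothetical callable: old path in, new path or null out
 *
 * Returns null when the checkroute flag is absent (a normal request),
 * otherwise [status code, redirect target or null].
 */
function resolve_route(array $query, string $path, callable $lookup): ?array
{
    if (!isset($query['checkroute'])) {
        return null; // a defined rule matched; nothing to do here
    }

    $target = $lookup($path);
    if ($target !== null) {
        return [301, $target]; // known bad URL: redirect to its replacement
    }

    return [404, null]; // unknown URL: give up
}

// Rough usage inside index.php:
//
// $result = resolve_route($_GET, $path, $lookup);
// if ($result !== null) {
//     [$status, $target] = $result;
//     if ($status === 301) {
//         header('Location: ' . $target, true, 301);
//     } else {
//         http_response_code(404);
//     }
//     exit;
// }
```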
Re: How do I extract the URL after the domain name?
Posted: Mon Jan 18, 2016 10:49 am
by simonmlewis
I wish I could decipher what you are trying to do here. Sorry.
The // is actually a bad URL I have seen on some of our sites, where Google has cached a URL with missing data, i.e. a missing ID number.
We would like to be able to catch those and do a 301 if the URL it was trying to reach still exists.
It needs to be handled from the database of old and new URLs. But as I say, .htaccess stops it working when the URL doesn't follow a structure we define. So I'm not sure how your code would help there.
Re: How do I extract the URL after the domain name?
Posted: Mon Jan 18, 2016 10:54 am
by Celauran
The problem you're having is that no .htaccess rule matches these bad URLs, yes?
Re: How do I extract the URL after the domain name?
Posted: Mon Jan 18, 2016 10:59 am
by simonmlewis
Correct. Because .htaccess is expecting variables inside certain // entries, or the URLs being requested are nothing at all like the structures in .htaccess.
So my original question was: can I just pop in a line in htaccess that basically allows all, points stuff that isn't listed, to index.php for example, and my 301.php file manages it from there.
Re: How do I extract the URL after the domain name?
Posted: Mon Jan 18, 2016 11:00 am
by Celauran
simonmlewis wrote:So my original question was: can I just pop in a line in htaccess that basically allows all, points stuff that isn't listed, to index.php for example, and my 301.php file manages it from there.
That is precisely what I had outlined above. Since everything ultimately gets sent to index.php, the query string was simply to let you know whether or not to trigger your lookup.
Re: How do I extract the URL after the domain name?
Posted: Mon Jan 18, 2016 11:09 am
by simonmlewis
I clearly don't know PHP as well as you. But your code is meant to go into index.php, isn't it, as you cannot put code like that in .htaccess?
So are you saying your code bypasses .htaccess, and then forwards the request to .htaccess if needed?
Re: How do I extract the URL after the domain name?
Posted: Mon Jan 18, 2016 11:16 am
by Celauran
Not really. All incoming requests will go through .htaccess. These bad URLs won't match a rewrite rule, so we catch them with a catch-all route, add a flag so we know they're bad, and forward them along to index.php. Inside index.php, you can check for the existence of that flag. If it's set, you know we're dealing with a bad URL, so you call whatever functions you have set up to generate the correct URL. You then call a redirect to this correct URL. This new request will again go through .htaccess but will match a rule this time and everything proceeds normally from there. Make sense?
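To make that flow concrete, here is a rough sketch of the two ends. The rewrite rule in the comment is an assumption about what a catch-all could look like, not necessarily the rule from the earlier post, and `requested_path()` is a hypothetical helper for pulling out the part of the URL after the domain name.

```php
<?php
// Hypothetical catch-all, placed after every other rule in .htaccess:
//
//   RewriteEngine On
//   RewriteCond %{REQUEST_FILENAME} !-f
//   RewriteCond %{REQUEST_FILENAME} !-d
//   RewriteRule ^ index.php?checkroute=1 [QSA,L]
//
// Anything that no earlier rule matched, and that is not a real file or
// directory, ends up at index.php with the checkroute flag set.

/**
 * Return the path portion of a request URI, without the query string.
 * Splits on the first "?" rather than using a URL parser, because the
 * paths arriving here are malformed by definition.
 */
function requested_path(string $requestUri): string
{
    return explode('?', $requestUri, 2)[0];
}

// In index.php (the original request is still in REQUEST_URI after the rewrite):
// $path = requested_path($_SERVER['REQUEST_URI']);
```

`requested_path('/me/2/you//and-again?ref=google')` gives `/me/2/you//and-again`, which is the value to look up in the old/new table.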
Re: How do I extract the URL after the domain name?
Posted: Mon Jan 18, 2016 11:22 am
by simonmlewis
Yes that does make sense.
You have to spot if it's a bad URL, and forward that warning to index.php.
Then you handle that bad URL. I can do that in the code I have written.
That then does a 301 to a new URL (which it will check first in htaccess), and we are on our way.
So - what do I pop into htaccess to do the catch-all... which then goes into the PHP script?
Re: How do I extract the URL after the domain name?
Posted: Mon Jan 18, 2016 11:27 am
by Celauran
viewtopic.php?f=1&t=142182#p702956
Drop that at the end and check for the query string.