URL control with PHP/Apache


pureconcepts
8-9-06, 10:39 AM
I am looking to update the navigation for a website and want cleaner, friendlier URLs. I'm not sure of the technical name for what I am after, but here is the deal.

I want to change my existing URLs as follows:
http://mydomain.com/contact.php => http://mydomain.com/contact/

Of course, I want the latter address to resolve to contact.php.

In my attempts to solve this myself, I came across several approaches using .htaccess files, including mod_rewrite. However, most of them dumped all requests to a single file (such as index.php) that did the further resolving.

I want something more intuitive as I may have:
http://mydomain.com/about/policy.php => http://mydomain.com/about/policy/

Where "about" is an actual directory, not just an about.php file.

I am sure this could be done, but I am new to Apache settings. Concept wise, I want something that forces PHP parsing, for those URLs without extensions. Yet, is still intuitive enough to handle both files and directories correctly. Of course, handling GET_VAR, adding a trailing slash, etc need to be considered.

I know that is a big request, but anyone that can provide some links or sample code would be appreciated.

Oh and of course, can this be done on the NEW platform?

snowmaker
8-9-06, 10:47 AM
Could this be done by making a directory named contact, putting contact.php in that directory, then renaming it index.php? Any other links to the file contact.php would have to be changed accordingly, of course, so this could turn into a lot of work.

joshuamc
8-9-06, 12:35 PM
I know that you can rewrite the request to include the .php, but logically, I do not see how you would check to see if the original request is a folder or a php.

In other words, if a rule is to be set, how would Apache determine whether /about was actually /about the directory or /about.php?

cc1030
8-9-06, 02:26 PM
Let me ask a simple question: would you always be using at most one level of parsing? In other words, will the last "word" in the URL always be the actual php file? For example
http://mydomain.com/about/policy/
will change to
http://mydomain.com/about/policy.php

Is that it? Just change the last one? You don't want to do something like change "mydomain.com/about/policy/abc/123" to "mydomain.com/about/policy.php?abc=123", do you? I don't know how to do that. But changing "/about/policy/" to "/about/policy.php" should be simple enough. I don't know what you mean by the following, though:

pureconcepts wrote: "Concept wise, I want something that forces PHP parsing, for those URLs without extensions. Yet, is still intuitive enough to handle both files and directories correctly. Of course, handling GET_VAR, adding a trailing slash, etc need to be considered."

What do you mean "forces PHP parsing"? What do you mean "intuitive enough to handle both files and directories"? And what do you mean by "handling GET_VAR, adding a trailing slash, etc need to be considered"?

pureconcepts
8-9-06, 02:35 PM
Thanks for the input so far. My request may be larger than what can actually be done, but I doubt it.

Put less wordily, I want a request for http://mydomain.com/contact to resolve to contact.php. However, I don't want a solution so restrictive that a request for http://mydomain.com/about/policy fails to look in the about/ directory for policy.php. I have not thought about the GET_VARS that much, but I believe I can handle that at the page level.

Creating various directories and placing an index.php file in them is not really a solution and really defeats the point of this whole idea.

cc1030
8-9-06, 02:44 PM
So basically you want the following:

Request: http://mydomain.com/contact
Redirect to: http://mydomain.com/contact.php

Request: http://mydomain.com/contact/
Redirect to: http://mydomain.com/contact.php

Request: http://mydomain.com/about/policy
Redirect to: http://mydomain.com/about/policy.php

Request: http://mydomain.com/dir1/dir2/script
Redirect to: http://mydomain.com/dir1/dir2/script.php

Is that correct? Also, how do you want it handled if the request matches an actual directory? For example, if a request comes in for http://mydomain.com/contact, and there is both a contact/ directory and a contact.php, which one should be shown?

Lastly, why are you even mentioning GET_VARS, and what are they? Do you mean the $_GET (or the deprecated HTTP_GET_VARS) pre-defined array? If so, what would change in it that you would be concerned about?

Once you answer these questions, I should have a handle on what you want and can probably get you some sample .htaccess rules to use.

extras
8-9-06, 02:58 PM
On normal Apache, something like this can be done easily.
I don't know if this would work on the new platform because of its setup, though.

When the request doesn't have any extension (in other words, no dot in the last part),
if a file exists for given URL + .php, serve that file.
Otherwise, do nothing and let the server do as usual.

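RewriteEngine On
# Only rewrite when <request>.php exists as a regular file.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.php -f
# Anchored so that the last path segment really contains no dot.
RewriteRule ^(.*/)?[^./]+$ %{REQUEST_URI}.php [L]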

Not tested.
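
The decision described above (append .php only when the last part has no dot and the file exists) can be modelled outside Apache. This is a minimal Python sketch of that logic only, not of mod_rewrite itself; the function name resolve_extensionless and the sample file layout are assumptions made for illustration.

```python
import os
import tempfile


def resolve_extensionless(doc_root: str, request_path: str) -> str:
    """Return the path that would be served for request_path.

    Hypothetical model of the rule: if the last segment has no dot and
    <request>.php is a regular file under doc_root, serve that file;
    otherwise leave the request unchanged.
    """
    last_segment = request_path.rsplit("/", 1)[-1]
    # Only consider requests whose last segment is non-empty and has no dot.
    if last_segment and "." not in last_segment:
        candidate = request_path + ".php"
        if os.path.isfile(os.path.join(doc_root, candidate.lstrip("/"))):
            return candidate
    return request_path


if __name__ == "__main__":
    # Build a throwaway document root with two sample scripts.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "about"))
        for name in ("contact.php", os.path.join("about", "policy.php")):
            open(os.path.join(root, name), "w").close()
        for request in ("/contact", "/about/policy", "/about", "/style.css"):
            print(request, "->", resolve_extensionless(root, request))
```

Requests that already carry an extension, or that have no matching .php file, fall through unchanged, which corresponds to "do nothing and let the server do as usual".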

pureconcepts
8-9-06, 09:32 PM
Thanks again, guys. Very helpful.

I knew Apache could handle this with a few rules in .htaccess; I have just never messed with mod_rewrite. Anyway, to answer your questions, cc1030:

- Your examples are correct.
- Priority on directories, although I plan not to have such a conflict
- GET VARS and $_GET are the same; sorry, old habit. My concern is what will happen when these are added to the request URL. For example, if I wanted http://mydomain.com/about/policy/2005 to resolve as policy.php?2005.

That may be above and beyond, although I am sure it can be done. For now, though, my goal is the above. Thanks for any insight on these rules; I want to learn them so I can evolve from there. Again, any links or samples are much appreciated.
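
For the /about/policy/2005 case, one possible approach is to walk the request path from longest to shortest until a matching .php file is found, then hand the leftover segments to the script. The Python sketch below models only that lookup, under the assumption that the extra segments are passed along as a single string; the function name find_script is hypothetical, and nothing in this thread confirms how the rewrite rules would do the same thing.

```python
import os
from typing import Optional, Tuple


def find_script(doc_root: str, request_path: str) -> Optional[Tuple[str, str]]:
    """Return (script_path, extra) for the longest prefix that has a .php file.

    Hypothetical helper: "/about/policy/2005" yields
    ("/about/policy.php", "2005") when about/policy.php exists.
    Returns None when no prefix of the request matches a script.
    """
    segments = [s for s in request_path.split("/") if s]
    # Try the full path first, then drop one trailing segment at a time.
    for cut in range(len(segments), 0, -1):
        candidate = "/".join(segments[:cut]) + ".php"
        if os.path.isfile(os.path.join(doc_root, candidate)):
            extra = "/".join(segments[cut:])
            return "/" + candidate, extra
    return None
```

The script would then receive "2005" however the site chooses to pass it (for example as a query string), which is a design decision left open here.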

pureconcepts
8-27-06, 02:47 PM
I have finally gotten around to messing with this. I am going to try to stay simple, then build from there. However, I have run into problems. I organized most of my files into directories. In addition, I wanted to keep these "directory style" links even for the files, as I explained in my posts above. The goal is to have

URL:
/about/policies => /about/policies.php

Where "about" is a directory and "policies" is the file "policies.php". I wanted this to happen in the background, i.e. without changing the address or redirecting. I tried the following in .htaccess but I get a 500 error.

<Location /about>
ForceType application/x-httpd-php
</Location>

Any help would be great.

cc1030
8-27-06, 03:39 PM
The <Location> directive is not allowed in an .htaccess file; it's only allowed in the server configuration file (httpd.conf), so you can't use it. It wouldn't help for what you want to do anyway. Try the following:

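RewriteEngine On

# If the request maps to a real directory, leave it untouched.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^(.*)$ - [L]

# Otherwise strip trailing slashes, append .php, and rewrite if that file exists.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^((.*/)*[^./]+)/*$ $1.php [L]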

That will check to see if the request matches an actual directory; if it does, then the directory's index page will load. If the request does not match a directory, Apache will remove any trailing slashes from the request and add .php and see if the file exists; if it does, it will process that file. For example, if you had an /about/contact/ directory and an /about/contact.php page, and a visitor requested "http://example.com/about/contact" or "http://example.com/about/contact/", the above code would load the /about/contact/ directory. Otherwise it would load the /about/contact.php page.
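
The order of the two checks (directory first, then the .php file) can be modelled in a few lines of Python. This is only a sketch of the decision logic, reusing the same regular expression as the second RewriteRule; the function name resolve_request and the return convention are assumptions, and it does not reproduce everything mod_rewrite does.

```python
import os
import re

# Same pattern as the second RewriteRule: a path whose last segment has
# no dot, optionally followed by trailing slashes.
RULE_PATTERN = re.compile(r"^((.*/)*[^./]+)/*$")


def resolve_request(doc_root: str, request_path: str) -> str:
    """Return the path that would be served (hypothetical model)."""
    relative = request_path.lstrip("/")
    # Rule 1: a real directory wins and the request is left untouched.
    if os.path.isdir(os.path.join(doc_root, relative)):
        return request_path
    # Rule 2: strip trailing slashes, append .php, rewrite if the file exists.
    match = RULE_PATTERN.match(relative)
    if match:
        candidate = match.group(1) + ".php"
        if os.path.isfile(os.path.join(doc_root, candidate)):
            return "/" + candidate
    # Neither rule applied: the request falls through unchanged.
    return request_path
```

With both an about/contact/ directory and an about/contact.php file present, this model returns the directory request unchanged, matching the priority described above.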

pureconcepts
8-27-06, 04:48 PM
Thanks, that looks good. I plan to avoid the conflict where I have a file and a directory of the same name.

I'm curious about a few things, so that I understand. First, will a request for /about/policies resolve to the file policies.php, provided there is an about directory with a policies.php file in it? Second, will the URL in the address bar stay as /about/policies even though I am using these rewrite rules? Furthermore, do you know if $_SERVER['REQUEST_URI'] will hold /about/policies?

I would want it set up so that all the answers to these questions are yes. Thanks cc1030

cc1030
8-27-06, 04:58 PM
Put simply, yes, policies.php would be resolved, and /about/policies would remain in the address bar.

Using the above code, the process works as follows:

The visitor requests either of these pages:
http://example.com/about/policies
http://example.com/about/policies/

(If you have code in your .htaccess file for the missing-trailing-slash fix, Apache would first redirect the first request (without the trailing slash) to the second request (with the trailing slash). That fix uses a 301 redirect, so the address bar gains the slash.)

The first rule tells Apache to check whether the request is a valid directory. So if you actually had a directory /about/policies/ in your /htdocs directory, then the index file from /htdocs/about/policies/ would be displayed. If the request does not match a directory, then the first rule is not matched, and Apache proceeds to the second rule.

The second rule tells Apache to strip the trailing slashes off and add .php, and see if the file exists (in this case, it will check for "/htdocs/about/policies.php"). If that file exists, then it is displayed. If it does not exist, then the rule is not matched, and Apache continues reading through your .htaccess file (and will probably result in a 404 Not Found error because the request isn't for a valid file).

In neither of the above cases does the address bar change. These are internal rewrites only, so the visitor doesn't see any of this.

cc1030
8-27-06, 05:00 PM
Oh sure, edit your post after I start my reply :)

To answer your third question, yes, $_SERVER['REQUEST_URI'] would show "/about/policies", the request as it came in from the visitor (before the internal rewrites). At least that's what my tests showed.

pureconcepts
8-27-06, 09:11 PM
Haha, sorry had some more last thoughts. Just want to make sure I understand the process as it is new to me. A few more for you.

Will everything be okay with the query string ($_GET) for these rules? If I go to about/policies?id=1209, it's not going to look for policies?id=1209.php, will it?

Also, do I need the .htaccess trailing slash fix? If so, where can I get that code?

cc1030
8-27-06, 09:23 PM
pureconcepts wrote: "Will everything be okay with the query string ($_GET) for these rules? If I go to about/policies?id=1209, it's not going to look for policies?id=1209.php, will it?"

Yes, that's exactly what it will do. I'm sure something can be done with the QSA flag or by modifying the rewriterule, but I'll have to do testing to find the exact rule.

pureconcepts wrote: "Also, do I need the .htaccess trailing slash fix? If so, where can I get that code?"

# Fix for the missing-trailing-slash for directories
# on htaccess-redirected domains/subdomains.
RewriteCond s%{HTTPS} ^((s)on|s.*)$ [NC]
RewriteRule ^/*(.+/)?([^.]*[^/])$ http%2://%{HTTP_HOST}/$1$2/ [L,R=301]
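
The two patterns in that snippet are dense, so here is a small Python sketch that applies the same regular expressions to see what they capture. It assumes the condition is tested against "s" followed by the HTTPS value ("on" or "off") and ignores the query string, which Apache would normally carry over to the redirect; the function name trailing_slash_redirect is made up for illustration.

```python
import re
from typing import Optional

# RewriteCond s%{HTTPS} ^((s)on|s.*)$ [NC]
HTTPS_COND = re.compile(r"^((s)on|s.*)$", re.IGNORECASE)
# RewriteRule ^/*(.+/)?([^.]*[^/])$ ...
SLASH_RULE = re.compile(r"^/*(.+/)?([^.]*[^/])$")


def trailing_slash_redirect(path: str, host: str, https: str) -> Optional[str]:
    """Return the redirect target, or None if the rule would not fire."""
    cond = HTTPS_COND.match("s" + https)
    if cond is None:
        return None
    # Group 2 (%2 in the rule) is "s" only when HTTPS is "on".
    scheme_suffix = cond.group(2) or ""
    rule = SLASH_RULE.match(path)
    if rule is None:
        return None
    directory_part = rule.group(1) or ""
    return f"http{scheme_suffix}://{host}/{directory_part}{rule.group(2)}/"
```

In this model a path only matches when its last segment has no dot and it does not already end in a slash, so about/policies is redirected while about/policies/ and about/policies.php are left alone.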

cc1030
8-28-06, 12:31 AM
pureconcepts wrote: "Will everything be okay with the query string ($_GET) for these rules? If I go to about/policies?id=1209, it's not going to look for policies?id=1209.php, will it?"

Looks like I was wrong about the query string. I just tested it on my own site, and it works properly WHEN USED WITHOUT THE MISSING-TRAILING-SLASH CODE.

Browser request: http://cmctestsite.com/formtest/test?abc=xyz&a1=blah&a2=two&a3=a%20b%20c
PHP_SELF: /formtest/test.php
QUERY_STRING: abc=xyz&a1=blah&a2=two&a3=a%20b%20c

When the missing-trailing-slash code is used (which, by the way, needs to be the first RewriteRule), it messes up the %-encoded parameters. For example:

Browser request: http://cmctestsite.com/formtest/test?abc=xyz&a1=blah&a2=two&a3=abc
is rewritten as: http://cmctestsite.com/formtest/test/?abc=xyz&a1=blah&a2=two&a3=abc

Looks good, right? But...

Browser request: http://cmctestsite.com/formtest/test?abc=xyz&a1=blah&a2=two&a3=a%20b%20c
is rewritten as: http://cmctestsite.com/formtest/test/?abc=xyz&a1=blah&a2=two&a3=a%2520b%2520c

For some reason, the missing-trailing-slash code changed the %-encoded spaces (%20) into %2520, meaning the % sign itself was re-encoded as %25. That first one might not be right, either: I just noticed that in the second example, the & signs were changed to "&amp;" in the Location header. I'll have to look into this when I have more time.
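
The %2520 symptom is what you get whenever an already percent-encoded value is percent-encoded a second time. The short Python sketch below only illustrates that effect with the standard urllib.parse functions; it says nothing about where inside Apache the second encoding happens, and the function name encode_again is an assumption.

```python
from urllib.parse import quote, unquote


def encode_again(already_encoded: str) -> str:
    """Percent-encode a value that is already percent-encoded."""
    # With safe="" the "%" character itself is escaped as "%25".
    return quote(already_encoded, safe="")


if __name__ == "__main__":
    value = "a%20b%20c"           # "a b c" encoded once
    doubled = encode_again(value)
    print(doubled)                # the % signs are now %25
    print(unquote(doubled))       # one decode gives back the once-encoded form
```

A script receiving the doubled value would see literal "%20" text in $_GET after the normal single decode, which is why the redirect breaks parameters containing spaces.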

produke
10-1-06, 09:50 PM
There are a couple of simple ways to do this that I use.


Redirecting http://www.askapache.com/contact/ to http://www.askapache.com/contact.php


Create a directory /contact/ and in the htaccess for that directory have this.

DirectoryIndex /contact.php


Better yet, move your contact.php script into the /contact/ directory, and rename it index.html, then use this htaccess code in the /contact/ directory.


ForceType application/x-httpd-php



Or if you have your heart set on rewrites, you could put the contact.php in the cgi-bin directory and reference any calls for /contact/ to it. Remove /cgi-bin if you just want /contact.php.

I put all my php scripts in the cgi-bin directory.



RewriteRule ^contact/$ /cgi-bin/contact.php [T=application/x-httpd-php]

TonyG
10-2-06, 06:34 AM
PMJI: Purely as an observer, I look at the contortions that one needs to go through to accomplish what seems to be a simple task. Wouldn't it be easier to have a PHP or Perl script parse out the URL, take whatever action is required on the query string, path, etc., and redirect to some calculated result? My god, people, we're working with computers here that are "supposed" to make our lives simpler, and here everyone is perfectly accepting of cryptic regular expressions and the esoteric syntax of .htaccess. Take a step back and look at this tripe. It pains me to see so many intelligent people so blinded that this sort of thing is accepted as perfectly normal, acceptable, even necessary to accomplish these tasks. This is why we don't write programs in machine code: we use higher-level languages which compile down to something the machine will understand, and then we build application programs so that we can do data entry without writing a program for everything. Hasn't anyone written a program to accept a set of human-readable rules and then generate .htaccess files?

While writing my rant and hunting for such tools I stumbled across this very informative web page on .htaccess (http://brainstormsandraves.com/archives/2005/10/09/htaccess/) and rewrites. HTH