by Miran Lipovaca aka foodpk | 7 April 2007
In this tutorial, I'll show you how to switch from long URLs to
short, clean URLs.
A basic understanding of how server-side languages work is
recommended in order to follow along. Even though many websites
are switching to clean URLs, most websites that feature
dynamically generated content still use long URLs with query
strings in them, something like:
http://yoursite.com/index.php?category=cooking&page=pasta&section=4
On the surface, you may be wondering what the real
problem with that is. Sure, it doesn't look pretty, but does
that really matter? It turns out, there are a few problems
associated with such long, complicated URLs:
- They're not search engine friendly. Some search engine
crawlers stop crawling when they come across query strings
in the URL.
- As mentioned earlier, they are hard to remember
and don't look very nice.
- They usually reveal what kind of technology you use
to display your site (like PHP or ASP). This can make your
site easier to compromise, because it gives hackers insight
into how it works.
For those three reasons (and others, I'm sure!)
it's better to have concise, clean and descriptive URLs. By
the end of this tutorial, you will know how to take those
huge URLs and make them cleaner. The clean URLs will look
like static URLs, but via mod rewriting (Apache's mod_rewrite
module), the server still gets the data it needs for
generating dynamic content.
So with mod rewriting you can switch from URLs like:
http://yoursite.com/index.php?category=cooking&page=pasta&section=4
To something like:
http://yoursite.com/cooking/pasta/4
But the best thing is, you'll still receive the GET
information from your query strings like you did before.
It's a win-win situation!
So how do we go about doing that? First, make sure
that you are allowed to set per-directory .htaccess files.
If you're not sure about that, ask your host. Next, create a
file called .htaccess in the directory where your site is.
We'll assume that's the root (/) directory. That file should
contain the following text:
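Here is a minimal sketch of what such a file might look like, based on the description that follows. The three URL patterns and the page and section parameter names are assumptions taken from the example URL above, so adjust them to match your own site:

```apache
# Let the server follow symbolic links
Options +FollowSymLinks

# Turn on the rewrite engine and set the rewrite base to the root
RewriteEngine On
RewriteBase /

# /cooking -> /index.php?category=cooking
# Skip the rule if the request points at a real file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/?$ /index.php?category=$1 [L]

# /cooking/pasta -> /index.php?category=cooking&page=pasta
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)/?$ /index.php?category=$1&page=$2 [L]

# /cooking/pasta/4 -> /index.php?category=cooking&page=pasta&section=4
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/?$ /index.php?category=$1&page=$2&section=$3 [L]
```

A RewriteCond only applies to the RewriteRule that directly follows it, which is why the pair of conditions is repeated before each rule in this sketch.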
Let's stop here for a bit and look at what the above
does. First there are a few commands that set things up for
us. They tell the server to follow symlinks, turn on the
rewrite engine and set the rewrite base to /.
If you have your site in a subdirectory, say /mysite/, but
want it to be accessed without that, as if it were in the
root, set the base to /mysite/. Otherwise, just leave it at /.
Also, if your index.php is not in the root folder (/), modify
the above rewrite rules accordingly. So if your index.php is
in /mysite/, the first rewrite rule should have
/mysite/index.php?category=$1 instead of
/index.php?category=$1
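As a rough sketch of that variation, assuming both the .htaccess file and index.php live in /mysite/ (a made-up directory name used only for illustration), the setup and the first rule might look like this:

```apache
Options +FollowSymLinks
RewriteEngine On

# The site lives in /mysite/ instead of the root
RewriteBase /mysite/

# Only rewrite when the request is not a real file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# The target now names /mysite/ explicitly
RewriteRule ^([^/]+)/?$ /mysite/index.php?category=$1 [L]
```

Any further rules would get the same /mysite/ prefix on their targets.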
Next come the rewrite conditions. That is where we tell the
server that, if the URL someone has tried to access is an
actual file or a directory, it should not interpret it
via the rules but just return it to the client. That's
good because otherwise we'd have problems with links,
images and so on.
Phew! It's time to take a breather. On the next page,
let's pick up from where we left off and discuss the rewrite
rule itself.