Let’s talk about the URL! Short for Uniform Resource Locator, the URL is a bunch of letters and weird characters that represent an address to something on the internet. All of us who do anything on the web run into URLs all the time. We can't avoid them.
Now, for those of us who are developers, URLs take on an extra layer of importance. We will periodically find ourselves needing to either read a URL or create a URL to make our web sites and apps work. This means we need to go into greater detail on what URLs are and how we can work with them using JavaScript. That's where this article comes in!
Onwards!
The majority of URLs we will encounter will look as follows:
This diagram shows an example of what a typical URL looks like, and it also calls out the interesting parts of this URL that we'll need to familiarize ourselves with. Let's quickly walk through what those parts are:
Now, there is an important reason we looked at the various parts of a URL, and it goes beyond just having an edge during trivia night. We are going to look into how to identify these parts of a URL using JavaScript, and we'll be doing that by relying on the conveniently named URL object:
let myURL = new URL("https://www.kirupa.com/learn/index.htm");
In its simplest form, we can create a URL object by calling its constructor and passing in the URL that we want to parse. The URL can be one we manually define, or it can be one we get from the currently loaded document in the browser by using window.location.href:
let myURL = new URL(window.location.href);
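Two details of the constructor are worth knowing about. It throws a TypeError when the string it receives can't be parsed, and it accepts an optional second argument that acts as a base for relative URLs. The following is a minimal sketch of both behaviors. The parseURL helper name is our own invention for illustration and isn't part of any built-in API:

```javascript
// Hypothetical helper: returns a URL object, or null if the input can't be parsed.
function parseURL(input, base) {
  try {
    // The optional second argument resolves relative URLs against a base.
    return new URL(input, base);
  } catch (error) {
    // The constructor throws a TypeError for input it can't parse.
    return null;
  }
}

let absolute = parseURL("https://www.kirupa.com/learn/index.htm");
let relative = parseURL("/learn/index.htm", "https://www.kirupa.com");
let invalid = parseURL("not a url");

console.log(absolute.href); //--> https://www.kirupa.com/learn/index.htm
console.log(relative.href); //--> https://www.kirupa.com/learn/index.htm
console.log(invalid); //--> null
```

Wrapping the constructor like this is just one way to handle bad input. Letting the error propagate is equally reasonable when an invalid URL should stop everything.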
Once we have our URL object defined, there are a handful of properties that make accessing the various parts of a URL possible:
let myURL = new URL("https://www.kirupa.com/learn/index.htm");
let href = myURL.href; //--> https://www.kirupa.com/learn/index.htm
let origin = myURL.origin; //--> https://www.kirupa.com
let protocol = myURL.protocol; //--> https:
let path = myURL.pathname; //--> /learn/index.htm
Thanks to the URL object, getting the full URL, origin, protocol, and path values is a breeze! This doesn't mean that all is well here, though.
What we don't have easy access to are ways of figuring out the TLD and the subdomains. This is a tricky problem since TLDs can appear in many formats, and the list of TLDs also grows regularly with new and custom additions. This TLD variation influences how we detect subdomains as well, for any string matching technique we come up with will require us to account for a lot of edge cases.
Some examples of common edge cases that a solution should accommodate are the following completely valid URLs:
For a starting point on solving this, we can rely on a third-party library such as node-tld or build our own mapping using the reasonably up-to-date Public Suffix List. I wish there were an easier way!
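To make the "build our own mapping" idea more concrete, here is a rough sketch of suffix matching against a list. Everything in it is an assumption made for illustration: the sampleSuffixes array is a tiny hand-picked stand-in for the real Public Suffix List, and splitHostname is a hypothetical helper name. It does not handle the list's wildcard or exception rules, so treat it as a starting point only:

```javascript
// ASSUMPTION: a tiny, hand-picked sample of suffixes for illustration only.
// A real solution would load the full Public Suffix List instead.
const sampleSuffixes = ["co.uk", "com.au", "com", "org", "uk"];

// Hypothetical helper: splits a hostname into subdomain, domain, and suffix.
function splitHostname(hostname) {
  // Try the longest suffix first so "co.uk" wins over "uk".
  const sorted = [...sampleSuffixes].sort((a, b) => b.length - a.length);
  const suffix = sorted.find((s) => hostname.endsWith("." + s));

  if (!suffix) {
    return null; // the suffix isn't in our sample list
  }

  // Everything before the suffix, e.g. "www.google" for "www.google.co.uk".
  const rest = hostname.slice(0, -(suffix.length + 1)).split(".");
  const domain = rest.pop();

  return { subdomain: rest.join("."), domain: domain, suffix: suffix };
}

let parts = splitHostname(new URL("https://www.google.co.uk/").hostname);
console.log(parts.subdomain); //--> www
console.log(parts.domain); //--> google
console.log(parts.suffix); //--> co.uk
```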
As our web sites and apps get more complex, the URLs that they either map to or generate have also gotten more complex. Here is the thing. URLs aren't just for helping our browsers find a destination on the internet. They also play a role in helping store application state. This is a feature that many modern web sites and apps take full advantage of. Take a look at the following example where we have a more complicated (yet totally valid) URL:
Just like before, let's walk through what these parts refer to:
The URL object provides some handy (or almost handy) ways for accessing these URL parts as well:
let complexURL = new URL("https://www.kirupa.com:80/index.htm?foo=hello&bar=welcome#h1");
let host = complexURL.host; //--> www.kirupa.com:80
let hostname = complexURL.hostname; //--> www.kirupa.com
let port = complexURL.port; //--> 80
let search = complexURL.search; //--> ?foo=hello&bar=welcome
let searchParams = complexURL.searchParams;
searchParams.get("foo") //--> hello
searchParams.get("bar") //--> welcome
let hash = complexURL.hash; //--> #h1
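When we don't know the parameter names ahead of time, we can also walk through searchParams directly. The following sketch reuses the same example URL and shows a few of the other things the searchParams object supports. The paramsObject variable name is just ours for illustration:

```javascript
let complexURL = new URL("https://www.kirupa.com:80/index.htm?foo=hello&bar=welcome#h1");

// searchParams is iterable, so each key/value pair can be visited in order.
for (let [key, value] of complexURL.searchParams) {
  console.log(key + " = " + value);
}
//--> foo = hello
//--> bar = welcome

// has() checks for a key, and get() returns null when the key is missing.
console.log(complexURL.searchParams.has("foo")); //--> true
console.log(complexURL.searchParams.get("baz")); //--> null

// Collect every pair into a plain object (a repeated key keeps only its last value).
let paramsObject = Object.fromEntries(complexURL.searchParams);
console.log(paramsObject.bar); //--> welcome
```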
There are a few more things we can add to our URL, but they aren't very common. For example, URLs support having a username and password defined just before the hostname, but such a scheme is rare and highly discouraged these days.
We spent a lot of time looking at how to read the various parts from our URL. It's time to turn the tables around. We are now going to look at creating our own URL. This is far less scary than it sounds. The main detail to note is that a URL is just a string as we have seen in our examples so far:
let myURL = new URL("https://www.kirupa.com/learn/index.htm");
let complexURL = new URL("https://www.kirupa.com:80/index.htm?foo=hello&bar=welcome#h1");
Common string manipulation operations we are already familiar with work really well here. The following is an example of us using the template literal syntax to make substituting URL values simple:
function returnDomain(countryTLD) {
  return `https://www.google${countryTLD}/`;
}
let uk_site = returnDomain(".co.uk");
console.log(uk_site); //--> https://www.google.co.uk/
This doesn't mean we are fully on our own, though. Where we do get some extra assistance from the URL object is when we need to specify our search parameters. Because the parameters are key/value pairs and can get a bit unwieldy when we are working with a lot of them, we have the set method on searchParams that we can use to simplify how they get defined as a part of our URL:
let tempURL = new URL("https://www.kirupa.com/");
tempURL.searchParams.set("name", "Mario");
tempURL.searchParams.set("location", "Yoshi's Island");
tempURL.href; //--> https://www.kirupa.com/?name=Mario&location=Yoshi%27s+Island
Notice that our generated URL has all of the search parameters correctly specified, with the special characters escaped for us. While we can certainly specify these parameters manually by escaping the various special characters ourselves, the URL object does give us this handy shortcut that can save us some time here.
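For comparison, here is a small sketch of what the manual route might look like using the built-in encodeURIComponent function. The place and manualURL variable names are ours for illustration. Notice that the two approaches escape the value slightly differently, yet both decode back to the same original string:

```javascript
let place = "Yoshi's Island";

// Manual approach: escape the value ourselves before adding it to the string.
let manualURL = "https://www.kirupa.com/?location=" + encodeURIComponent(place);
console.log(manualURL); //--> https://www.kirupa.com/?location=Yoshi's%20Island

// searchParams approach: the escaping happens for us.
let tempURL = new URL("https://www.kirupa.com/");
tempURL.searchParams.set("location", place);
console.log(tempURL.href); //--> https://www.kirupa.com/?location=Yoshi%27s+Island

// Both forms decode back to the same value when read through searchParams.
console.log(new URL(manualURL).searchParams.get("location")); //--> Yoshi's Island
console.log(tempURL.searchParams.get("location")); //--> Yoshi's Island
```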
After all of this, once we have our fully constructed URL, our options are wide open on what to do next. We can tell our browser to navigate to this URL. If our created URL defines an API endpoint, we can use fetch to make a web request. We can save this URL to local storage or a cookie. There is a lot we can do.
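As a rough sketch of what some of these next steps could look like, the following builds a URL and hands it off. The api.example.com endpoint, its q and page parameters, and the buildSearchURL and runSearch helper names are all hypothetical placeholders rather than a real service:

```javascript
// ASSUMPTION: api.example.com and its "q"/"page" parameters are hypothetical
// placeholders, not a real endpoint.
function buildSearchURL(query, page) {
  let endpoint = new URL("https://api.example.com/search");
  endpoint.searchParams.set("q", query);
  endpoint.searchParams.set("page", String(page));
  return endpoint;
}

// fetch accepts a URL object directly. This function is defined but never
// called here, since the endpoint doesn't exist.
async function runSearch(query) {
  let response = await fetch(buildSearchURL(query, 1));
  return response.json();
}

let searchURL = buildSearchURL("super mario", 2);
console.log(searchURL.href); //--> https://api.example.com/search?q=super+mario&page=2

// In a browser, we could also navigate to the URL or store it for later:
// window.location.href = searchURL.href;
// localStorage.setItem("lastSearch", searchURL.href);
```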
URLs represent the link (literally) that makes all of the connections on the internet work. Without URLs, we wouldn't have a way to navigate between pages, know where a particular piece of content lives, or perform other common tasks. In this article, we went one level deeper and really looked at the various parts that make up a URL and the JavaScript properties that we have for working with them.
Just a final word before we wrap up. If you have a question and/or want to be part of a friendly, collaborative community of over 220k other developers like yourself, post on the forums for a quick response!
Copyright KIRUPA 2024