A Content Delivery Network (CDN) can greatly speed up the communication between your site and your visitors. Learn how it does that...and more!
It is dinner time, and you have a strong craving for pizza. Like many people, you mentally debate for a bit on whether you ought to eat something healthier. Can’t go wrong with a salad, right? Well...
If you are like me, you dismiss the desire for a healthy salad very quickly and start imagining eating the biggest pizza with the greatest number of toppings that could possibly fit without defying the laws of physics. In the end, you place your order for a pizza to be delivered as quickly as possible.
Let’s imagine that you live in an area that looks a bit as follows:
In ordering your pizza, there are two scenarios we want to discuss. One scenario involves ordering from a mom-and-pop shop that has a single location. The other scenario involves ordering from a chain that has multiple locations.
There is one particular place you like to order from. They make the pizza just the way you like it. You know the owners. They pretend to know you. Being a small shop, they just have a single location:
When you have just a single location, that one location will be responsible for delivering the pizza to you. As these things go, the closer you are to this location, the faster your delivery will be:
The further away you (or other customers) are, the longer the delivery will take from this single location:
If you live really far away but really REALLY want pizza from this location, there is no other option. You are going to have to wait while they deliver all the way to you:
With greater distance, there is also the risk of other factors that will affect how quickly your pizza gets delivered. There is traffic, construction slowdowns, ducks (or other animals with no sense of urgency) crossing a street, inclement weather, accidents, everyone slowing down to watch the aftermath of an accident, and a variety of other things that could happen along the way that will make your pizza less warm and less fresh when it ultimately gets delivered.
Instead of ordering from the single mom-and-pop shop, you realize you have a sweet coupon/deal from a mega pizza chain that you noticed while going through a stack of junk mail. Obviously, this deal is set to expire...tomorrow! Ordering from this chain makes the most financial sense. We can’t let the deal expire and go to waste! That would be unacceptable.
This chain has multiple locations in your area:
The advantage of multiple locations is that you can reasonably expect that the store closest to you will be the one who will make your pizza and deliver it:
Because there are multiple locations spread across your area, there won’t be any really long trips. This means locations like the office complex or stadium will get almost as speedy a service as you did, without worrying about traffic and the other related boondoggles we called out earlier:
The other advantage of multiple locations is you are less likely to be impacted by any unforeseen circumstances. A flood of orders suddenly came in to the closest store? No worries. You have other locations that can help balance the extra demand. If a power outage takes some locations offline, our friendly mega pizza chain can optimize delivery from a location that (while possibly further away) is still able to make the delivery:
Yes, they may even creatively decide to deliver your pizza using some unconventional ways like via the water. Anything is possible.
With our pizza scenarios in the previous section, there are a few things we can generalize on:
Now, the focus of this article isn’t about pizzas. It is just that looking at the various things that go into pizza delivery has a lot of overlap with how our web content makes its way around the world.
Let’s say that we have a web site that people from around the world tend to visit. Our content lives in a special computer known as a web server whose sole job it is to send our web site’s contents across the internet to any device that requests it. A web server must have a physical location, and our server is located in the western side of the United States. Placed against the vast expanse of our world, it would look a bit like this:
Now, some visitor on the eastern side of the United States wants to visit our web site:
The way data from our web server makes its way to this visitor’s device is a bit similar to how our earlier pizza delivery vehicle drives through town to make a delivery. There is a series of stops and turns involved. In the case of our pizza delivery vehicle, this is just your usual roads and traffic lights and intersections and so on. For the communication between our web server and the visitor, these stops and turns (referred to as network hops) are the various network exchanges (think of many really big routers) placed in various geographic locations that are responsible for sending data around. If we had to visualize these hops, it could sorta look a bit like this:
Now, here is an all too familiar problem. Each network hop adds a slight amount of delay as part of getting the communication between our web server and the visitor’s device going. The distance between our web server and our visitor matters. The further away our web server is geographically from someone who requested our site’s content, the more network hops are needed to make the communication work. The more network hops, the greater the delay. The fewer the network hops, the...um...smaller the delay.
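To make the "delay adds up hop by hop" idea concrete, here is a minimal sketch in Python. The function name and the per-hop numbers are made up for illustration; they are not measurements from any real network.

```python
def total_delay_ms(hop_delays_ms):
    """Add up the delay that each network hop contributes along a path."""
    return sum(hop_delays_ms)


# Hypothetical per-hop delays in milliseconds (illustrative numbers only).
nearby_path = [2, 3, 4]                       # visitor close to the web server
faraway_path = [2, 3, 4, 15, 40, 35, 12, 6]   # visitor an ocean away

print(total_delay_ms(nearby_path))   # 9
print(total_delay_ms(faraway_path))  # 117
```

The longer path loses on two counts: it has more hops, and some of those hops (like an ocean crossing) are individually slower.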
Because our web site is interesting to a worldwide audience, think of how many hops would be needed for something that spans oceans and continents (hint: it’s probably a boatload!):
The opposite holds true for shorter distances where our visitor happens to be close to where our web server is located and will therefore need fewer hops to make the communication work:
Now, while we have been focusing on distance our data needs to travel as a major contributor to how long it will take for something to load, there are other factors at play as well. The quality of the networks our data will travel through will vary. There may be network congestion in some part of the world that slows things down. An underwater cable that is responsible for carrying a lot of internet data might temporarily go offline from...sabotage by these:
But wait, there is more! The final point where our data is one hop away from reaching its destination (aka the last mile) might involve a slow cell phone connection or (gulp) a dial-up modem. Our web server may get overwhelmed by a lot of traffic. Some bad actors decide to perform a DDoS attack on our server. And so on. There are a billion more reasons we can come up with, but one conclusion is almost always the same. The greater the distance our data has to travel to reach its destination, outside of the general number of network hops needed, the more other things will come up that increase how long it will take for a visitor to access our web site’s content.
Why does all of this matter? Is this really a problem? Well...It isn’t as big a problem as what the world faces with Godzilla (🦖) destroying some coastal city every few years, but there are a bunch of studies on top of studies that show that web sites that load content quickly tend to attract more visitors who stay more engaged. More visitors and more engaged visitors are all good things. We want to ensure that these good things don’t just apply to visitors who happen to be living close to where our web servers are located. We want all of our visitors worldwide to enjoy fast and snappy web site load times:
How can we actually do that? The answer to this question is where this mysterious creature known as a CDN comes in.
The problem right now is that our web server, located in one part of the world, is responsible for serving visitors from all parts of the world. What if we had a way to replicate our web server to more parts of the world so more visitors are physically closer to it? What if we could do something like this?
In this approach, depending on where a visitor is located, the closest web server will be the one responsible for sending data to them. This avoids the problem of having to communicate across long distances and deal with all the delays associated with that. There is one wrinkle here. Replicating our web server itself multiple times has some technical issues. What if we could instead have a similar solution where we still have our single web server but have a way to give users the content they need from a more local location, thus avoiding the long communication times?
This is where a CDN comes in. A CDN, short for Content Delivery Network, acts as an intermediary between our web server (often referred to as an origin server) and our visitors spread out across the world. This arrangement can loosely be visualized as follows:
Notice the (lack of) interaction between our visitors all over the world and our web server located in the western United States. The only common interaction for all parties is the server that is part of the CDN, often referred to as an edge server and represented by a gear icon in the above image. This is an important detail that is core to how CDNs work to speed things up, so let’s spend a few more moments on this.
If we didn’t have a CDN setup to work with our web server, when a visitor tries to load our home page (index.html), this is what the path would look like:
Our domain (foo.com) would point directly to our web server, and our web server will be responsible for sending any requested resources to the visitor’s device. If the distance between our visitor and our web server is really large, the whole trip around these steps may take a long time.
With a CDN thrown into the mix, this is what the path will look like instead:
The main change is that any request to our web server is handled directly by the edge server that is part of our CDN. This edge server contains really fast storage (SSDs, memory, etc.) that caches all of the static content from our web server. In the case of our index.html example, if index.html is already cached by our edge server, it will return index.html to the visitor’s device. This will be super fast.
If index.html is not already cached, our edge server will first retrieve it from our web server and then pass it off to the visitor’s device (as seen by the new Step 3):
Retrieving this file from our web server will take additional time, but it will be a one-time cost. Subsequent requests will be super fast because index.html will now be returned from the edge server’s cache as opposed to requiring a revisit to our web server.
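The hit-or-miss behavior described above can be sketched in a few lines of Python. This is a toy model under simple assumptions, not how any particular CDN is implemented: the OriginServer and EdgeServer classes and their method names are hypothetical, and the cache is just an in-memory dictionary.

```python
class OriginServer:
    """Stands in for our web server and counts how often it is contacted."""

    def __init__(self, files):
        self.files = files
        self.requests_served = 0

    def fetch(self, path):
        self.requests_served += 1
        return self.files[path]


class EdgeServer:
    """Serves from its cache when it can, and asks the origin when it cannot."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path in self.cache:
            return self.cache[path], "HIT"
        # Not cached yet: make the one-time trip to the origin, then remember it.
        content = self.origin.fetch(path)
        self.cache[path] = content
        return content, "MISS"


origin = OriginServer({"/index.html": "<h1>Hello</h1>"})
edge = EdgeServer(origin)

print(edge.get("/index.html"))  # ('<h1>Hello</h1>', 'MISS')
print(edge.get("/index.html"))  # ('<h1>Hello</h1>', 'HIT')
print(origin.requests_served)   # 1
```

Only the first request reaches the origin. Every later request for the same file is answered from the edge server's cache.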
The biggest benefits a CDN provides are with static files like those that make up our HTML, CSS, JS, and images. The CDN will cache these static files and periodically check with the web server to ensure the latest versions of these static files are in the cache. For dynamic requests, such as those that involve you passing in some data, generating a file on the fly, or doing any other sort of thing where the output will vary, a CDN can’t help much. It's hard to cache something that will be unique for each visitor. If your site has a lot of dynamically generated content, then a CDN may not help you as much as it would if your site has a lot of static content.
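Here is a rough sketch of the two decisions this paragraph describes: whether a request looks static enough to cache, and whether a cached copy is still recent enough to use. The extension list, the query-string rule, and the one-hour lifetime are all assumptions picked for illustration. Real CDNs typically rely on HTTP headers such as Cache-Control along with provider-specific settings.

```python
# Hypothetical list of file types we treat as static.
STATIC_EXTENSIONS = (".html", ".css", ".js", ".png", ".jpg", ".webp")


def is_cacheable(path, method="GET"):
    """Treat plain GET requests for static file types as cacheable."""
    return method == "GET" and "?" not in path and path.endswith(STATIC_EXTENSIONS)


def is_fresh(cached_at, now, ttl_seconds=3600):
    """A cached copy counts as fresh until ttl_seconds have passed."""
    return (now - cached_at) < ttl_seconds


print(is_cacheable("/styles/main.css"))          # True
print(is_cacheable("/search?q=pizza"))           # False
print(is_cacheable("/checkout", method="POST"))  # False
print(is_fresh(cached_at=0, now=1800))           # True
print(is_fresh(cached_at=0, now=7200))           # False
```

When a copy is no longer fresh, the edge server would check back with the web server for the latest version before serving the file again.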
The edge servers that are now the primary entry point for our web site are capable of doing more than just caching data. Depending on your CDN provider, some of the additional functionality you get includes:
There are some other niceties different CDN providers may have, but these four capture the biggest bucket of things you usually get for “free” when you make a CDN a part of your overall web site experience.
Before wrapping this all up, I want to share a little bit about this site’s experience with using a CDN. Overall, it has been fantastic. This site gets a fair amount of traffic from around the world, and the average response time prior to using a CDN had always hovered around 500 milliseconds. That’s...not great, not terrible. By going with a CDN, I was able to reduce that number quite significantly:
We can visualize these numbers slightly differently. The web/origin server for this site is in Los Angeles, so we can see in the following image that the regions furthest away from it saw the greatest improvement (as highlighted by the intensity of the green colors):
My CDN provider is Cloudflare, one of the larger and well-known players in this space. I found getting started with them to be a breeze, and they provide a bunch of tools and services that I slowly ended up relying on to make this site even faster for visitors such as yourself. Having many of my screenshots and illustrations auto-converted to WebP on supported browsers often reduced a page's size greatly:
The 18% of images shown as WebP here were delivered entirely by the CDN itself, and only to those browsers that supported WebP. For a site like this where an article can have several megabytes worth of images, this is a huge benefit.
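One common way a server can tell whether a browser supports WebP is the HTTP Accept request header, where browsers commonly list image/webp among the formats they can handle. The sketch below shows that general idea only. The pick_image_format function is hypothetical, and this is not a description of how Cloudflare implements its conversion.

```python
def pick_image_format(accept_header, original_format="png"):
    """Return "webp" if the Accept header lists image/webp, else the original format."""
    # Split "image/webp,*/*;q=0.8" into media types and drop parameters like ";q=0.8".
    accepted = [part.split(";")[0].strip().lower() for part in accept_header.split(",")]
    return "webp" if "image/webp" in accepted else original_format


print(pick_image_format("image/avif,image/webp,image/apng,*/*;q=0.8"))  # webp
print(pick_image_format("image/png,*/*;q=0.8"))                         # png
```

Browsers that never mention image/webp keep getting the original file, so nothing breaks for them.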
I guess the big question for you to answer is whether you need a CDN for your site. The answer is, it depends. Using a CDN is not a magic bullet for your web site’s performance problems. It is one tool among many you should use to incrementally keep making your site faster. There is a lot you can do on the client-side layer in terms of how you build your page, how you optimize your scripts, and so on. There is a lot you can tweak in your web server to improve performance as well. If you are curious, you can see some of the steps I took prior to investing in a CDN.
Now, if you are at the point where the next series of perf gains you will get can best come from investing in a CDN and you are willing to pay the cost for one, then you should jump in feet first with no hesitation. Out of the box, any CDN you go with will give you a major improvement in the communication time your visitors worldwide will have with your site. On top of that, CDNs give additional benefits around minification and image optimization, so you don’t have to configure a build step and deal with those yourself. If you are looking to use a CDN and don’t know where to go, this site has been happily on Cloudflare for some time. You should look into using them, and NO the fine folks at Cloudflare haven’t paid me to say this...at least not yet :P