CDN: when and how to use one?

Using a CDN (Content Delivery Network) is highly recommended when we want to improve the performance of a website, especially if it receives many visits from several locations. However, depending on how the website is programmed and the type of information it shows, we can take more or less advantage of this type of service.

What is a CDN?

Technically, CDNs are networks of very powerful reverse proxies, connected directly to the Internet's core routers (we already explained what a reverse proxy was here). In other words, they are very fast servers that cache the content requested by users. Because they are connected directly to, or very close to, the main Internet routers (what is called the first layer, or Tier 1), they can respond faster than most servers, whose requests usually have to hop through a larger number of less powerful routers. This also allows them to use Anycast, a technique in which all the proxy servers of the CDN share a single IP address, so that an incoming request is answered by the node closest to the user's IP.

Let's take a rough look at how a CDN works:

In the example, we see 7 clients that simultaneously request the same file from the website, but only 3 of the requests reach the origin server. This happens because the first request to arrive at each node causes the origin server's response to be stored in that node's cache, so the next request from another client does not have to travel to the origin: the node responds directly with the cached content.

In geographical location 2 we obtain no improvement, since only one client has made the request. In the same way, we would obtain no improvement for clients requesting different files for the first time from the same node.
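
The behaviour described in the example can be sketched in a few lines of Python. This is only an illustrative model, not how any real CDN is implemented: the class names Origin and EdgeNode, the URL /index.html and the split of the 7 clients into 3, 1 and 3 per location are assumptions made for the sketch (the text only tells us that location 2 has a single client).

```python
class Origin:
    """Origin server that counts how many requests actually reach it."""

    def __init__(self):
        self.hits = 0

    def fetch(self, url):
        self.hits += 1
        return f"content of {url}"


class EdgeNode:
    """One CDN node with a simple in-memory cache (no expiry, for brevity)."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, url):
        if url in self.cache:
            return self.cache[url], "HIT"
        body = self.origin.fetch(url)  # cache miss: go to the origin
        self.cache[url] = body
        return body, "MISS"


def simulate(clients_per_location, url="/index.html"):
    """Send one request per client to the node of its location."""
    origin = Origin()
    results = {}
    for location, clients in clients_per_location.items():
        node = EdgeNode(origin)
        results[location] = [node.get(url)[1] for _ in range(clients)]
    return origin.hits, results


# Assumed distribution of the 7 clients across the 3 locations.
origin_hits, results = simulate(
    {"location 1": 3, "location 2": 1, "location 3": 3}
)
print(origin_hits)            # 3
print(results["location 2"])  # ['MISS']
```

Only the first request at each node reaches the origin, and the single client in location 2 always gets a miss, which is why that location gains nothing from the cache.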

What is a CDN for?

The CDN is useful for:

Returning cached responses faster, which increases the speed of the site.
Bringing the service closer to the user's location. Not only are files returned sooner, because they may sit on a more powerful server or with the HTML already generated, but latency is also lower, since the response travels a shorter distance and therefore makes fewer hops between routers.
Relieving a saturated server of load, keeping the number of requests it receives stable.
Mitigating attacks: it is one of the best ways to withstand distributed denial of service (DDoS) attacks, which consist of launching many requests at our website from several locations to try to bring down the server.

When users visit us from a single location, it is not necessary to hire a CDN with many nodes: a single reverse proxy with Varnish or similar, or a single node of the CDN, gives us all the previous advantages, including bringing the service closer to the user if the origin server is not in the same country.

When is the cache of a CDN not used?

Ideally, the CDN will allow us to cache all the files of the website, which we will divide into static files (images, CSS, JavaScript and fonts) and dynamic files (HTML). The latter are the most important for performance, and the CDN should cache them whenever possible (even if we already have an HTML cache on the origin server). However, there are situations, explained below, in which we will not be able to cache all the HTML on the CDN nodes:

High refresh rate: if our website shows dynamic content that is updated every second, we will not be able to use a CDN to cache HTML files, since with such a high refresh rate cache misses will occur constantly. If, on the contrary, the content is updated every few hours or days, we can use it. The CDN may also let us define rules that set the update frequency for each type of URL.
Many locations with few users: if we have locations with few users and the HTML cache is regenerated very frequently, this cache can be counterproductive, because it is possible that no user ever asks for a page that another user has already requested from the same location, as we saw with location 2 in the example above. A cache miss will then occur always or almost always, so the request travels from the CDN to the origin server, back to the CDN and finally to the user, taking longer than if we had no CDN and the request had gone directly to the origin server.
Adaptive web: another situation in which we may not be able to cache the HTML is when we have an adaptive website, that is, one whose content changes based on the user-agent string of the user's browser. This forces the CDN to cache a different page for each user-agent string, increasing cache misses and the cost of the service.
Online store or private user area: when a website has to show different content depending on the logged-in user, we cannot cache this content in the CDN, so all requests that reach the CDN with the user's login or cart cookie must be passed to the origin server. If the CDN returned one user's cached content to another, that user could see someone else's data. We can cache the rest of the pages normally. If there is a shopping cart, it is recommended that it be loaded via AJAX, because otherwise we will not be able to cache any HTML of the website. Having the CDN perform this kind of action may require an advanced plan.
POST calls that store information in the database: when a user submits data by filling out a form, this data must reach the origin server to be stored in the website's database, so the CDN must provide the means for this to happen, so that the lead is not lost at the CDN node.
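
The last three situations can be summed up as a decision the CDN takes for each request. The following is a minimal sketch under stated assumptions, not the configuration language of any particular CDN: the cookie names sessionid and cart, the Request class and the "Mobi" check used to group user agents are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field

# Hypothetical cookie names: adjust to the ones the site really sets.
SESSION_COOKIES = {"sessionid", "cart"}


@dataclass
class Request:
    method: str
    path: str
    cookies: dict = field(default_factory=dict)
    user_agent: str = ""


def device_class(user_agent):
    """Collapse the user-agent string into two buckets so the cache keeps
    one copy per bucket instead of one per exact string (crude heuristic)."""
    return "mobile" if "Mobi" in user_agent else "desktop"


def cache_decision(req):
    """Return (cacheable, cache_key); cache_key is None when bypassing."""
    if req.method not in ("GET", "HEAD"):
        return False, None  # POST and similar must reach the origin
    if SESSION_COOKIES.intersection(req.cookies):
        return False, None  # logged-in user or cart: personalised content
    return True, f"{device_class(req.user_agent)}:{req.path}"


print(cache_decision(Request("POST", "/contact")))
print(cache_decision(Request("GET", "/account", cookies={"sessionid": "abc"})))
print(cache_decision(Request("GET", "/blog", user_agent="Mozilla/5.0 (iPhone) Mobile")))
```

Grouping user agents into a few buckets, instead of caching one page per exact user-agent string, is one way to limit the extra cache misses that an adaptive website causes.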

In which cases is the cache of a CDN used most?

If we hire a CDN, we will always be able to cache all static resources. However, depending on the case, we may not be able to cache all the HTML, as we saw in the previous section. We can only do so when we have many visits, a low refresh rate, and content that depends neither on the user agent nor on whether the user has logged in. If these conditions are met and our target audience is spread over several countries, we can be sure that a CDN is an ideal solution. If any of these factors is not ideal, a CDN may still benefit us, but we will have to consider each problem.
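
To get a feeling for how traffic and refresh rate interact, we can make a back-of-the-envelope estimate. The model below is a deliberate simplification and an assumption of ours, not a formula from any CDN provider: it looks at a single URL on a single node, supposes that requests are spread evenly, and counts the first request in each refresh interval as a miss and the rest as hits.

```python
def estimated_hit_ratio(requests_per_hour, ttl_hours):
    """Rough hit ratio for one URL on one node.

    Assumes evenly spread requests and no eviction: in every refresh
    interval (TTL) the first request is a miss and the others are hits.
    """
    requests_per_window = requests_per_hour * ttl_hours
    if requests_per_window <= 1:
        return 0.0  # at most one request per interval: always a miss
    return (requests_per_window - 1) / requests_per_window


print(estimated_hit_ratio(100, 1))  # 0.99
print(estimated_hit_ratio(0.5, 1))  # 0.0
```

With many visits and a low refresh rate almost every request is served from the cache, while a node that receives less than one request per refresh interval never serves a hit.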

In addition, to take full advantage of a CDN, the implementation should not use different subdomains for each type of resource loaded. This technique, called domain sharding, has become obsolete with HTTP/2, so ideally all resources are loaded from the same domain as the website.

How to choose a CDN?

We should choose a CDN by asking the following questions, many of which must be answered by the web development team or the systems administration team, and which can affect the price of the contracted plan:

Does it have nodes in the countries where my target audience is? We can see the nodes of the four main CDN services on this map. If the CDN has no nodes where the service is offered, we could even be moving the service away from the users when the origin server is in the same country as they are.
Do I have enough traffic to take advantage of the CDN cache? If we have little traffic, caching the HTML on the web server itself will be enough.
Will the update frequency of my content allow me to cache the HTML? If we cannot make use of this type of optimization, we can probably opt for a cheaper plan.
Do I need to discriminate traffic by user agent? We must check whether the CDN allows it or whether the code can be modified to avoid adaptive content. In this case, the cost may increase considerably.
Do I need the CDN to detect certain cookies? We have to know whether the plan we are going to contract will let us send traffic from logged-in users to the origin.
Do I need to exclude certain pages? The URLs of administration or form submission areas should be excluded from the HTML cache.
Does my website use file types that the CDN does not support? The CDN may not implement some MIME types, in which case it could not return some types of file that the website needs.
What additional features does it have? For example: HTTP/2, HTTP/2 Server Push, image compression, compression of static resources with Brotli q11, optimizations applied to the web code, an API to purge the cache from the website's administration area, the ability to configure the final cache headers, letting users browse from the cache if the origin goes down, security filters, etc.
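
Several of these questions come down to which cache headers the origin sends and whether the CDN honours them. As a hedged illustration, the helper below builds a Cache-Control value from standard HTTP directives (max-age, s-maxage, stale-if-error, private, no-store). The function name cache_control and the example values are our own assumptions, and whether a given CDN respects each directive, stale-if-error in particular, has to be checked in its documentation.

```python
def cache_control(browser_ttl, cdn_ttl, stale_if_error=0, private=False):
    """Build a Cache-Control header value (all times in seconds).

    browser_ttl    -> max-age, used by the user's browser
    cdn_ttl        -> s-maxage, used by shared caches such as a CDN
    stale_if_error -> how long a stale copy may be served if the origin fails
    private        -> personalised page: must not be stored by the CDN
    """
    if private:
        return "private, no-store"
    parts = ["public", f"max-age={browser_ttl}", f"s-maxage={cdn_ttl}"]
    if stale_if_error:
        parts.append(f"stale-if-error={stale_if_error}")
    return ", ".join(parts)


# Example values only: HTML cached 1 minute in the browser and 1 hour in the CDN,
# with a stale copy allowed for 1 day if the origin goes down.
print(cache_control(60, 3600, stale_if_error=86400))
# Private user area: never cached by the CDN.
print(cache_control(0, 0, private=True))
```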

Conclusion

A CDN is very useful for accelerating performance in established projects with a lot of traffic, especially if we offer the website's services or products in several countries. But business needs are not the only thing to take into account: we also have to consider the technical needs of the site in order to choose the most appropriate option. To do so, it is necessary to involve the web developers and to put any questions that arise to the CDN's support team.