Drupal 8: Quick Handbook on Cache API

Blog Tags: Drupal 8, Drupal Cache

 

Introduction to caching

In computing, a cache is a hardware or software component that stores data, so future requests for that data can be served faster. The data stored in a cache might be the result of an earlier computation, or the duplicate of data stored elsewhere.

A cache hit occurs when the requested data can be found in a cache, while a cache miss occurs when it cannot. Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store. Thus, the higher the number of requests that can be served from the cache, the faster the system performs.

In Drupal 8, Cache API is used to store data that takes a long time to compute. Caching can either be permanent or valid only for a certain timespan, and the cache can contain any type of data.
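The usual pattern for such data is "look in the cache first, compute only on a miss". The following is a minimal sketch of that pattern, not a definitive implementation: the function names, the `mymodule:report` cache ID, and the one-hour lifetime are all hypothetical, and `\Drupal::time()` assumes Drupal 8.3 or later.

```php
/**
 * Hypothetical stand-in for a slow computation.
 */
function mymodule_build_report() {
  return ['generated' => time(), 'rows' => []];
}

/**
 * Returns the report, reusing a cached copy when one exists.
 */
function mymodule_get_report() {
  // Hypothetical cache ID; it must be unique within the bin.
  $cid = 'mymodule:report';

  // Cache hit: get() returns an object whose ->data property holds the value.
  if ($cache = \Drupal::cache()->get($cid)) {
    return $cache->data;
  }

  // Cache miss: get() returned FALSE, so compute the value and store it.
  $report = mymodule_build_report();

  // Keep the item for one hour. Passing \Drupal\Core\Cache\Cache::PERMANENT
  // instead would keep it until it is explicitly invalidated or deleted.
  $expire = \Drupal::time()->getRequestTime() + 3600;
  \Drupal::cache()->set($cid, $report, $expire);

  return $report;
}
```

Note that the third argument to set() is an absolute Unix timestamp, not a duration, which is why the request time is added to the lifetime.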

How caching stores data in backends & bins

Caching is not so much an information type in itself as a strategy, or a layer, that improves the efficiency of access to existing information types. Any information that needs interpretation between its raw state and the form in which it is used can have its non-raw states cached: computed arrays of data, rendered HTML markup, or responses from calls to remote APIs.

During the processing of an HTTP request, the information from different sources can be stored in different bins. These cache bins can then each be stored in different, configurable backends.

A backend is a storage mechanism: a SQL database, or Memcache, or even files on disk. Cache backends can even be chained, so that a transient memory cache sits on top of a database cache: the first request from the database cache is slow, but subsequent requests retrieve data from memory.
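Which backend serves which bin is typically set in settings.php. The sketch below assumes a stock Drupal 8 install where the core services `cache.backend.database` and `cache.backend.chainedfast` are available; the pairing of bins to backends is only an illustration, not a recommendation.

```php
// In settings.php.

// Keep the (potentially large) render bin in the SQL database.
$settings['cache']['bins']['render'] = 'cache.backend.database';

// Use core's chained backend for the discovery bin: a fast per-server layer
// (APCu, when the extension is available) in front of the database backend.
$settings['cache']['bins']['discovery'] = 'cache.backend.chainedfast';

// Backend for every bin that has no explicit entry above.
$settings['cache']['default'] = 'cache.backend.database';
```

Contributed modules such as Memcache or Redis provide their own backend service names, which would be used in the same places.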

When you request a cache object, you can specify the bin name in your call to \Drupal::cache(). Alternatively, you can request a bin by getting the service "cache.nameofbin" from the container. The default bin is called "default", with service name "cache.default"; it is used to store common and frequently used caches.
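As a short illustration of the two ways to reach a bin, using the core "render" bin as the example (the variable names are arbitrary):

```php
// The default bin, via the \Drupal::cache() shortcut with no argument.
$default_bin = \Drupal::cache();

// A named bin via the same shortcut...
$render_bin = \Drupal::cache('render');

// ...or the same bin fetched from the container by its service name.
$render_bin_from_container = \Drupal::service('cache.render');
```

Inside a service or plugin class, it is generally preferable to have the "cache.nameofbin" service injected rather than calling the static \Drupal helpers.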

A module can also define its own cache bin by defining a service in its modulename.services.yml file as follows:

services:
  # Replace "nameofbin" (in the key and in the arguments) with your bin name.
  cache.nameofbin:
    class: Drupal\Core\Cache\CacheBackendInterface
    tags:
      - { name: cache.bin }
    factory: cache_factory:get
    arguments: [nameofbin]

Other common cache bins are the following:

  • bootstrap: Data needed from the beginning to the end of most requests that has a very strict limit on variations and is rarely invalidated.
  • render: Contains cached HTML strings like cached pages and blocks; can grow to large size.
  • data: Contains data that can vary by path or similar context.
  • discovery: Contains cached discovery data for things such as plugins, views_data, or YAML discovered data such as library info.

Cacheability metadata

Cacheability metadata describes what a cached item depends on, and therefore when it has to be varied or invalidated. It consists of three properties:

  • cache tags: For dependencies on data managed by Drupal, like entities and configuration.
  • cache contexts: For variations, i.e. dependencies on the request context.
  • cache max-age: For time-sensitive caching, i.e. time dependencies.
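In a render array, all three properties live under the `#cache` key. A minimal sketch, with an arbitrary markup string and illustrative values:

```php
$build = [
  '#markup' => t('Latest article teaser'),
  '#cache' => [
    // Invalidate when node 1 changes.
    'tags' => ['node:1'],
    // Store one variation per combination of user roles.
    'contexts' => ['user.roles'],
    // Treat the cached copy as stale after one hour (in seconds).
    'max-age' => 3600,
  ],
];
```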

Cache Tags

Cached data in the various bins becomes obsolete at some point and has to be removed to make room for the latest changes. Before Drupal 8, there was no way to identify individual pieces of outdated data stored in different cache bins in order to remove only those. Drupal 8 changed this with the introduction of “Cache Tags”.

The role of the tags is to identify cache items across multiple bins for proper invalidation. Their purpose is to provide the ability to accurately target multiple cache items that contain data about the same object, page, etc.

In a typical scenario, a user may have modified a node that appears in two views, three blocks, and on twelve pages. Without cache tags, we couldn't know which cache items to invalidate. So we'd have to invalidate everything and sacrifice effectiveness to achieve correctness. With cache tags we can have both.

Example:

use Drupal\Core\Cache\Cache;
use Drupal\Core\Cache\CacheBackendInterface;

// A cache item with nodes, users, and some custom module data.
// $cid (a unique cache ID string) and $data are assumed to be defined already.
$tags = [
  'my_custom_tag',
  'node:1',
  'node:3',
  'user:7',
];
\Drupal::cache()->set($cid, $data, CacheBackendInterface::CACHE_PERMANENT, $tags);

// Invalidate all cache items with certain tags, in every bin.
Cache::invalidateTags(['node:1', 'user:7']);
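Rather than hard-coding tags such as 'node:1', they can be taken from the entity itself. The sketch below assumes that node 1 exists; the cache ID and the `$summary` structure are hypothetical.

```php
use Drupal\Core\Cache\Cache;
use Drupal\node\Entity\Node;

$node = Node::load(1);
if ($node) {
  // Hypothetical derived data that depends on the node.
  $summary = ['title' => $node->label()];

  // getCacheTags() returns the entity's own tags (here 'node:1');
  // mergeTags() combines them with a custom tag and removes duplicates.
  $tags = Cache::mergeTags($node->getCacheTags(), ['my_custom_tag']);

  \Drupal::cache()->set('mymodule:node_summary:1', $summary, Cache::PERMANENT, $tags);
}
```

Because Drupal invalidates an entity's own tags whenever the entity is saved or deleted, an item tagged this way is invalidated automatically when the node is edited; only custom tags such as 'my_custom_tag' need an explicit Cache::invalidateTags() call.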

 

Cache contexts

Let’s assume that a piece of data has multiple variants, and only one variant can be used in a given situation. Which variant should be cached, and how will the other variants be cached and retrieved depending on the situation? These questions have been around since the Drupal 7 era, and they were dealt with, but not very effectively.

If you recall, in Drupal 7 and before, it was easy to program any cached item to vary by user, by user role, and/or by page. This could even be configured through the UI for blocks. However, more targeted variations (such as by language, by country, or by content access permissions) were more difficult to program and not typically exposed in a configuration UI.

To address these limitations, the Drupal 8 Cache API introduced the concept of “Cache contexts”. We can understand this concept with an example. In this example, we will look at how cache contexts can be used, as well as important considerations for developers, by modifying the default breadcrumb in Drupal 8.

By default, the breadcrumb only shows the path up to and including the parent page. For many websites, we also need to display the current page title in the breadcrumb. Here we have altered it to add the current page title to the end:

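// Replace "theme" in the function name with your theme's machine name.
function theme_preprocess_breadcrumb(&$variables) {
  $node = \Drupal::routeMatch()->getParameter('node');
  // On some routes the parameter is not a loaded node object, so check the type.
  if ($node instanceof \Drupal\node\NodeInterface && $variables['breadcrumb']) {
    $variables['breadcrumb'][] = ['text' => $node->getTitle()];
  }
}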

Now the following page, http://example.com/page-level-1/page-level-2/page-level-3 will have the following breadcrumb:

Home > Page Level 1 > Page Level 2 > Page Level 3

Great! We have the breadcrumb we are looking for. But we haven't yet considered how this might have affected caching.

At this point we have not changed anything about how the breadcrumb is cached. Drupal does not know that our new version of the breadcrumb depends on the current page, and this can create some unexpected behaviors.

If we look at the cache contexts, which can be configured to be sent along in the response headers, we see that the breadcrumb is cached by the url.path.parent context, that is, by the parent URL:

1-path.parent.png

This means that once Page Level 3 is loaded, the breadcrumb is cached based on the parent URL. When a sibling page like Page Level 3-1 is loaded, the parent page has not changed, and thus the breadcrumb will be loaded from cache. This can give us unexpected behavior due to the missing cache context.

When you load Page Level 3-1 on a cold cache (a cache that has just been cleared or rebuilt), the proper breadcrumb is displayed.

2-Page level 3-1 correct.png

When you go to the sibling page, Page Level 3, you will notice that the previous breadcrumb is still showing:

Home > Page Level 1 > Page Level 2 > Page Level 3-1

3-page level 3 incorrect.png

When we altered the breadcrumb, we also needed to be aware of the cache settings on it. Since the breadcrumb is now unique to each specific page and not just to the parent page, we need to add the following to our preprocess_breadcrumb function:

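function theme_preprocess_breadcrumb(&$variables) {
  $node = \Drupal::routeMatch()->getParameter('node');
  // On some routes the parameter is not a loaded node object, so check the type.
  if ($node instanceof \Drupal\node\NodeInterface && $variables['breadcrumb']) {
    $variables['breadcrumb'][] = [
      'text' => $node->getTitle(),
    ];
    // The breadcrumb now differs per page, so vary its cache by the full path.
    $variables['#cache']['contexts'][] = 'url.path';
  }
}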

With http.response.debug_cacheability_headers enabled, we can now see the url.path context listed in the X-Drupal-Cache-Contexts response header.

4-url.path.png

The breadcrumb is now cached per path. On a cold cache, we load Page Level 3-1 and the proper breadcrumb is shown.

5-page level 3-1 correct.png

When loading the sibling, Page Level 3, the proper breadcrumb now shows for it as well.

Home > Page Level 1 > Page Level 2 > Page Level 3

6-page level 4 correct.png

This is a quick example of how cache contexts can alter the output of a page, and how developers must take this into consideration when working with Drupal 8.
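Cache contexts are not limited to URLs. As a further sketch, a block whose output depends on the visitor's roles can declare the `user.roles` context by overriding getCacheContexts(). The module name, block ID, class name, and the 'administrator' role ID below are assumptions made for illustration.

```php
namespace Drupal\mymodule\Plugin\Block;

use Drupal\Core\Block\BlockBase;
use Drupal\Core\Cache\Cache;

/**
 * Greets the visitor differently depending on their roles.
 *
 * @Block(
 *   id = "mymodule_greeting",
 *   admin_label = @Translation("Role-based greeting")
 * )
 */
class GreetingBlock extends BlockBase {

  /**
   * {@inheritdoc}
   */
  public function build() {
    // 'administrator' is the role ID used by the standard install profile.
    $is_admin = in_array('administrator', \Drupal::currentUser()->getRoles());
    return [
      '#markup' => $is_admin ? $this->t('Hello, administrator') : $this->t('Hello, visitor'),
    ];
  }

  /**
   * {@inheritdoc}
   */
  public function getCacheContexts() {
    // Keep the contexts the parent already declares and add ours, so one
    // cached copy of the block is stored per combination of roles.
    return Cache::mergeContexts(parent::getCacheContexts(), ['user.roles']);
  }

}
```

Without the extra context, the first variant rendered would be served to every visitor, which is the same class of bug as the breadcrumb above.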

Cache max-age

Nothing is permanent, and this holds true for data too. Cache max-age sets how long, in seconds, a cached item may be used before it is treated as stale, which makes it the right tool for time-dependent data. It is analogous to the max-age directive of HTTP's Cache-Control header.

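return [
  // $pageContent is assumed to hold already-translated, safe markup. t() is
  // meant for literal strings, so the variable is not passed through it.
  '#markup' => $pageContent,
  '#cache' => [
    'max-age' => 86400, // One day, in seconds.
  ],
];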
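Two max-age values have a special meaning: 0 means "never cache", and Cache::PERMANENT means "cache until invalidated". A short sketch with two illustrative render arrays (the markup strings and variable names are arbitrary):

```php
use Drupal\Core\Cache\Cache;

// Changes on every request, so it must never be served from cache.
$clock = [
  '#markup' => t('Generated at @time', ['@time' => date('H:i:s')]),
  '#cache' => ['max-age' => 0],
];

// Has no time dependency: stays cached until one of its tags is invalidated.
$footer = [
  '#markup' => t('Site footer'),
  '#cache' => [
    'tags' => ['config:system.site'],
    'max-age' => Cache::PERMANENT,
  ],
];
```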

Cache metadata bubbling

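Bubbling is the way a parent item in a render array receives the cacheability metadata of its children: as the children are rendered, their cache tags, contexts, and max-age are merged into the parent, all the way up to the page. Because the page ends up carrying the tags of everything it contains, invalidating a single tag is enough to invalidate every cached item that includes the outdated data.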
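The renderer performs this merging automatically, but the effect can be reproduced by hand with the CacheableMetadata class. The following is only a sketch of what ends up on the parent; the render array and its keys ('teaser', 'author') are hypothetical.

```php
use Drupal\Core\Cache\CacheableMetadata;

$build = [
  'teaser' => [
    '#markup' => t('Teaser of node 1'),
    '#cache' => ['tags' => ['node:1'], 'max-age' => 3600],
  ],
  'author' => [
    '#markup' => t('Written by user 7'),
    '#cache' => ['tags' => ['user:7'], 'contexts' => ['user.roles']],
  ],
];

// Read each child's metadata and merge it, as bubbling would.
$bubbled = CacheableMetadata::createFromRenderArray($build['teaser'])
  ->merge(CacheableMetadata::createFromRenderArray($build['author']));

// The merged metadata holds the union of the tags (node:1 and user:7), the
// union of the contexts (user.roles), and the lowest max-age (3600, because
// the author element has none and therefore counts as permanent).
$bubbled->applyTo($build);
```

After applyTo(), the parent's `#cache` key carries the combined metadata, so invalidating either 'node:1' or 'user:7' invalidates the parent as well.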

The ‘under the hood’ processing of tag-based content invalidation is illustrated in the following flow diagram:

Node-Page-View.jpg

The flow above shows how a bubbled tag invalidates cached data so that the latest information is displayed. We hope you've found this helpful.

And if you have any questions, feel free to share in the comments below—we'll get back to you.