Scaling Memcached

Skyscanner Engineering on 2017-11-13

Engineering at scale

By Simon Fry

When building services on the web, the need often arises to run computationally expensive processes, and it can be necessary to repeat the same expensive work for every user who visits. In such cases it's a good idea to use a cache to store the results, in order to give users the fastest possible response.

Memcached is one of the most common out-of-process caches available, trusted by developers for its maturity and simplicity. On the surface there isn't much to it: it's a key-value store that keeps data in memory for speed, but doesn't persist across reboots. Digging deeper, there are all sorts of settings to tweak and strategies to choose, for example the size of storage blocks or what to do when available storage space runs low.

What happens when we want to store a lot in the cache? It might be difficult to provision more memory on the machine running Memcached, so we'd like to scale horizontally. However, Memcached itself doesn't know how to form clusters, so this functionality is provided by many of the popular Memcached clients available today. They work by mapping each cache key to one of the nodes in the cluster: if we know the key we wish to access, we can calculate where in the cluster it's stored.

Let’s take this one step further. In the cloud computing world we now live in, we don’t just want to scale horizontally, but to dynamically scale up and down as dictated by demand. This could apply to a cache; machines with large amounts of memory can be expensive, and if we’re not storing data then it’s a waste. Is it possible to scale a Memcached cluster based on the amount of data we want to store, and what are the implications of this?

Firstly, is it possible? Yes: if we want more capacity in our cache, we can just add a machine and let the clients know a new machine is available. However, remember that the cache machines are dumb, and the client is generating the map. If we recalculate the map, we don't want the new map to place a key on a different node than before; if it does, clients asking for that key will get a miss, as if the value had been deleted. This isn't ideal: we want as many of the previous values as possible (ideally all of them) to still be available. How do we optimise the mapping to retain the maximum number of items?

The solution to that question is in the algorithm used for mapping. There are two main options: the modulus map, and the consistent hash.

The modulus map is pretty simple: the key is hashed to an integer, and that integer modulo the number of nodes gives the index of the node the key is stored on. Assuming the hash produces a uniform spread of integers, this evenly distributes the keys across the available machines. However, this algorithm really doesn't fit with scaling. To demonstrate this, take a look at the worked example below.
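As a rough sketch in Python (the helper name node_for_key is illustrative, not part of any real client library), the modulus map might look like this:

```python
import hashlib

def node_for_key(key: str, num_nodes: int) -> int:
    """Map a cache key to a node index via the modulus of its hash."""
    # Use a stable hash so every client computes the same placement;
    # Python's built-in hash() is randomised per process, so it won't do.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_nodes

node = node_for_key("user:1234:profile", 3)
print(node)  # one of 0, 1 or 2
```

Any stable hash would do in place of MD5; the important property is that every client, on every machine, computes the same node index for the same key.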

The table below compares which node a key is placed on in a 3-node versus a 4-node cluster. We only need to work this out for keys 1 to 12, because the pattern then repeats.

Key:      1  2  3  4  5  6  7  8  9 10 11 12
3 nodes:  1  2  0  1  2  0  1  2  0  1  2  0
4 nodes:  1  2  3  0  1  2  3  0  1  2  3  0

When the cluster size changes by one node, we get one unmoved location at the end (key 12) and T-2 unmoved locations at the beginning (keys 1 and 2), where T is the larger of the two cluster sizes being compared. Every other key moves, so the ratio of moved keys works out at (T-1)/T.

The result of adding or removing one node is that the loss of cache items is (T-1)/T, where T is the old number of nodes when scaling down, or the new number when scaling up. This means that as the cluster grows, so does the loss: from 50% when going from 1 node to 2, up to 90% when going from 9 nodes to 10. This is hardly ideal for a horizontally scaling cache.
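The (T-1)/T figure is easy to verify by brute force. The sketch below (again Python, with the illustrative helper moved_fraction) counts how many keys change node when a modulus-mapped cluster is resized:

```python
def moved_fraction(old_nodes: int, new_nodes: int, num_keys: int = 100_000) -> float:
    """Fraction of keys whose modulus placement changes when the cluster resizes."""
    moved = sum(1 for k in range(num_keys)
                if k % old_nodes != k % new_nodes)
    return moved / num_keys

print(moved_fraction(1, 2))   # 0.5  -- half the cache lost
print(moved_fraction(9, 10))  # ~0.9 -- nine tenths of the cache lost
```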

The alternative, the consistent hash, is designed to solve this problem. For each cache node it generates several hundred pseudo-random integers in a fixed range, seeded by information from the node (such as its host name). These points are then placed on a ring defined by that range. To find a specific key, the key is hashed to a point on the ring; you then move along the ring until you reach the next point belonging to a cache node, and that's the node the key is placed on.

An example of a consistent hash ring. Each node only has one point on the circle for clarity. When a key is mapped to the ring, the next clockwise point is the node the key is located on.
Another node is added to the example ring above. Most of the keys are unmoved, except for a portion that previously sat on node 3. In this example the weights of the nodes are very unbalanced, but with more points per cache node, the proportion of keys on each node tends towards 1/T, where T is the number of nodes in the ring.

When adding an extra node with this algorithm, we place the points of the new node onto the ring. A key only changes placement if a point of the new node lands in the gap between where the key falls on the ring and the next existing node point. As more cache nodes join the ring, these gaps become smaller, and it becomes less likely for a new node to shift where a key is placed. The result is that the loss from scaling up or down is only approximately 1/T. The loss becomes less significant as the cluster grows, which is much better suited to horizontal scaling.
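A minimal sketch of the idea in Python (the ConsistentHashRing class and its method names are illustrative, not taken from any real client library):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """A minimal consistent hash ring using virtual node points."""

    def __init__(self, nodes, points_per_node=200):
        # Each node contributes many points, seeded from its name,
        # so keys spread roughly evenly across nodes.
        self.points = []  # sorted list of (ring position, node name)
        for node in nodes:
            for i in range(points_per_node):
                self.points.append((self._hash(f"{node}:{i}"), node))
        self.points.sort()

    @staticmethod
    def _hash(value: str) -> int:
        # A stable hash: the first 8 bytes of an MD5 digest as an integer.
        return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        # Walk clockwise from the key's position to the next node point,
        # wrapping around at the end of the ring.
        idx = bisect.bisect(self.points, (self._hash(key),))
        return self.points[idx % len(self.points)][1]

# Growing from 3 to 4 nodes moves only the keys that land in the new
# node's share of the ring -- roughly 1/T, rather than (T-1)/T.
keys = [f"key:{i}" for i in range(10_000)]
ring3 = ConsistentHashRing(["node1", "node2", "node3"])
ring4 = ConsistentHashRing(["node1", "node2", "node3", "node4"])
moved = sum(ring3.node_for(k) != ring4.node_for(k) for k in keys)
print(f"moved: {moved / len(keys):.0%}")  # typically around 25%
```

Note that the three original nodes generate exactly the same ring points in both rings, which is why only the keys intercepted by node4's new points change placement.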

If we are using Memcached in systems where the size of the cache may change, and therefore we may want to scale the cache in the future, then it's advisable to use consistent hashing to calculate key placement. Popular clients for many languages allow you to select consistent hashing, as described in more detail on the AWS ElastiCache Memcached best practices page.

SEE the world with us

Many of our employees have had the opportunity to take advantage of our Skyscanner Employee Experience (SEE): a self-funded, self-organized programme to work up to 30 days during a 24-month period in some of our 10 global offices. Employees based in an office outside the country they call home also have the opportunity to work for 15 days per year from their home country.

Like the sound of this? Look at our current Skyscanner Product Engineering job roles.

Some of the team at Skyscanner

About the author

My name is Simon Fry, a Software Engineer working in the Hyperdrive Squad in London. We work on Skyscanner’s flight product, making it easier and quicker than ever to find you the right flights and get them booked. Outside of work I love being a tourist in new places, and seeing what new experiences (and foods) the world has to offer.

Simon, exploring Barcelona, in search of tasty paella