In this chapter, we’ll try to focus on recipes related to operating Redis servers, instead of programming applications or data modeling. These tasks vary widely, but include starting a Redis slave, upgrading an existing server, performing backups, sharding, and handling a dataset larger than your available memory.
One of the advantages of Redis over other key/value stores like memcached is its support for persistence; in fact, it even comes preconfigured with this support. This functionality enables you to perform some operations that wouldn’t be possible otherwise, like upgrading your server without downtime. Nevertheless, persistence should be configured in a way that suits your dataset and usage patterns.
The default persistence model is snapshotting, which consists of saving the entire database to disk in the RDB format (basically a compressed database dump). This can be done periodically at set times, or every time a configurable number of keys changes.
The alternative is using an Append Only File (AOF). This might be a better option if you have a large dataset or your data doesn’t change very often.
As previously stated, snapshotting is the default persistence mode for Redis. It asynchronously performs a full dump of your database to disk, overwriting the previous dump only if successful. Therefore, the latest dump should always be in your dbfilename location.
You can configure snapshotting by putting save statements in your configuration file, in the following format:

save seconds keys-changed

The snapshot will occur when both conditions match. A typical example that ensures that all your data is saved every few minutes is save 600 1, which will perform a snapshot every 10 minutes if at least one key in your server has changed.
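Multiple save lines combine, so you can snapshot aggressively when many keys change and more lazily otherwise. For reference, the sample redis.conf shipped with Redis uses settings along these lines (the exact values may vary between versions):

```
save 900 1
save 300 10
save 60 10000
```

With these settings, a snapshot is taken after 15 minutes if at least 1 key changed, after 5 minutes if at least 10 keys changed, or after 1 minute if at least 10,000 keys changed.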
You can manually trigger snapshotting with the BGSAVE and SAVE commands. BGSAVE forks the main Redis process and saves the DB to disk in the background; Redis executes this operation itself if you have save statements in your configuration file. SAVE performs the same operation as BGSAVE but does so in the foreground, thereby blocking your Redis server.
If you come to the conclusion that snapshotting is putting too
much strain on your Redis servers you might want to consider using
slaves for persistence (by commenting out all the
save statements in your masters and enabling
them only on the slaves), or using AOF instead. In particular, if you
have a big dataset or a dataset that doesn’t change often, consider AOF.

The Append Only File persistence mode keeps a log of the commands that change your dataset
in a separate file. Like most writes on modern operating systems, any
data logged to AOF is left in memory buffers and
written to disk at intervals of a few seconds using the system’s
fsync call. You can configure how
often the AOF gets synched to disk by putting
appendfsync statements in your
configuration file. Valid options are always, everysec, and no. The last option is not safe, as it leaves the decision about when to actually write the data to disk to your operating system.
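Putting this together, an AOF setup in redis.conf might look like the following; everysec is a common compromise between safety and speed:

```
appendonly yes

# always   - fsync after every write (safest, slowest)
# everysec - fsync roughly once per second (good compromise)
# no       - leave flushing to the operating system (fastest, least safe)
appendfsync everysec
```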
AOF can be used together with snapshotting. But you might decide to suppress snapshots because they put too much load on your server. If you’re not snapshotting, be sure to write the AOF to a RAID array or have at least one Redis slave that you can recover data from in case of disaster.
BGREWRITEAOF rewrites the AOF to
match the current database. Depending on how often you update
existing data, this will greatly reduce the size of the AOF. If your
data changes very often, the on-disk file will grow very fast, so
you should compact it by issuing
BGREWRITEAOF regularly. The rewrite is
done in the background.
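One way to schedule regular rewrites is a cron entry that calls redis-cli; the schedule, host, and port below are only examples to adapt:

```
# Compact the AOF every night at 3 a.m.
0 3 * * * redis-cli -h localhost -p 6379 BGREWRITEAOF
```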
Database slaves are useful for a number of reasons. You might need them to load-balance your queries, keep hot standby servers, perform maintenance operations, or simply inspect your data.
Redis supports master-slave replication
natively: you can have multiple slaves per master and slaves connecting
to slaves. You can configure replication in the configuration file before starting a server, or by connecting to a running server and using the SLAVEOF command.

In order to configure a Redis slave using the configuration file, you should add the following to your redis.conf:

slaveof hostname port

Start or restart the server afterwards. Should your Redis master have password authentication enabled, you’ll need to specify the password as well:

masterauth password
If you want to turn a running Redis server into a slave (or switch it to a different master), you can do it with the SLAVEOF command:

SLAVEOF hostname port

As in the previous example, if you’re using authentication, you’ll need to specify it beforehand:

CONFIG SET masterauth password
Keep in mind that should your server restart, this configuration will be lost. Therefore, you should also commit your changes to the configuration file.
Often you might find yourself with a dataset that won’t fit in your available memory. While you could try to get around that by adding more RAM or by sharding (which in addition would allow you to scale horizontally), it might not be feasible or practical to do so.
Redis has supported a feature called Virtual Memory (VM) since version 2.0. This allows you to have a dataset bigger than your available RAM by swapping rarely used values to disk and keeping all the keys and the frequently used values in memory. However, this has one downside: before Redis reads or performs an operation on swapped values, they must be read into real memory.
If you decide to use VM, you should be aware of its ideal use cases and the tradeoffs you’re making.
The keys are always kept in memory. This means that if you have a big number of small keys, VM might not be the best option for you, or you might have to change your data structures and use large strings, hashes, lists, or sets instead.
VM is ideal for some patterns of data access, not all. If you regularly query all your data, VM is probably not a good fit because your Redis server might end up blocking clients in order to fetch the values from disk. VM is ideally suited for situations when you have a reasonable amount of frequently accessed values that fit in memory.
Doing a full dump of your Redis server will be extremely slow. In order to generate a snapshot, Redis needs to read all the values swapped to disk in order to write them to the RDB file (see Configuring Persistence). This generates a lot of I/O. Due to this, it might be better for you to use AOF as a persistence mode.
VM also affects the speed of replication, because Redis masters need to perform a BGSAVE when a new slave connects.
Still, there are scenarios where using VM makes sense. In order to enable it, you’ll need to add this to your configuration file:

vm-enabled yes
There are other settings that you should pay attention to when enabling VM:
- vm-swap-file specifies the location of the swap file in your filesystem.
- vm-max-memory allows you to specify the maximum amount of memory Redis should use before beginning to swap values. Beware that this is a soft limit, because keys are always kept in memory and because Redis won’t swap values to disk while creating a new snapshot.
- vm-pages specifies the number of pages in your swap file.
- vm-page-size defines the size of a page in bytes. The page size and the number of pages are very important, because Redis won’t allocate more than one value to the same page, so together these determine the amount of data your swap file can handle.
- vm-max-threads is the maximum number of threads available to perform I/O operations. Setting it to 0 enables blocking VM, which means that your Redis server will block all clients when it needs to read a value from disk. Once again, depending on your data access patterns, this may or may not be the best option.
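Taken together, a VM section of redis.conf might look like the following; the path and sizes are placeholders that you should adapt to your dataset and hardware:

```
vm-enabled yes
vm-swap-file /var/lib/redis/redis.swap
vm-max-memory 1gb
vm-page-size 32
vm-pages 134217728
vm-max-threads 4
```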
As with any other disk-based database, Redis VM will perform better the faster your I/O is, so the use of flash-based SSDs is encouraged. You can read more about VM use cases, configuration details, and tradeoffs in the official Redis documentation.
At some point in the life of your system you might need to upgrade Redis. Unfortunately, Redis can’t do online binary upgrades, and doing a server restart means that your application won’t be able to talk to Redis for a (possibly long) period of time. But that doesn’t mean that there aren’t other ways to achieve it without incurring downtime. You might also want to move your current Redis database to another system for maintenance purposes, a hardware upgrade, etc.
Our solution will involve starting a new Redis server in slave mode, switching over the clients to the slave and promoting the new server to the master role. To make the example easier to understand, let’s assume we have a Redis server listening on port 6379.
It might be easier to start the slave on a new server than on the existing one. This is because of memory requirements, and because you can reuse the same configuration file, directories, and port for the slave, changing only the hostname or IP address.
Install the new Redis version without restarting your existing server.
Create a new redis.conf, specifying that Redis runs on port 6380 (assuming you’re on the same system—if you’re not, you can still use 6379 or any other available port) and a different DB directory (you don’t want to have 2 Redis servers reading or writing the same files).
Start the new server.
Connect to the new server and issue the command:
SLAVEOF localhost 6379
This will trigger a BGSAVE on the master server, and upon completion the new (slave) server will start replicating. You can check the current status using the INFO command on the slave. When you see master_link_status:up, the replication is active.
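For reference, the relevant part of the new server’s redis.conf (step 2 above) might look like this; the directory is an example path only:

```
port 6380
dir /var/lib/redis-6380
dbfilename dump.rdb
```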
Since your new Redis server is now up-to-date, you can start moving your clients over to this new server. You can verify the number of clients connected to a server with the INFO command; check the connected_clients field.
When all your clients are connected to the slave server, you still have two tasks to complete: disable the replication and shut down the master server.
INFO returns information about the
server including replication status, uptime, memory usage, number of
keys per database and other statistics.
To disable replication, issue the following command on the slave:

SLAVEOF NO ONE

This will stop replication and effectively promote your slave to a master. This is important in Redis 2.2, as master servers are responsible for sending expirations to their slaves.
The old master server will perform a final snapshot when you shut it down.
Your new Redis system is up and running, but make sure that all your configuration files, init scripts, backups, etc. are pointing to the right location and starting the correct server. It’s easy to forget those routine operations, but you should at the very least certify that nothing wrong will happen in case of a server restart.
Doing an online upgrade has a couple of (possibly steep) requirements: you need to be able to point your Redis clients to another server, either by using a proxy, by having failover built into your clients (so that they connect to a different server once you bring the master down), or simply by telling them to connect to another server. You’ll also need to have at least twice as much memory available (possibly on a different system).
Beware that doing this might be dangerous, depending on how different the Redis versions you are upgrading from and to are. At the very least, it should be safe for updates between minor versions of Redis. Each major upgrade has its own caveats. For example, upgrading from 2.0 to 2.2 should be fine as long as you don’t use EXPIRE, since the way expirations are handled changed between these versions. Like every other maintenance operation, make sure to test before doing it on your production systems.
One issue that comes up frequently when talking about NoSQL databases is backing up your data. The notion that these databases are hard to back up, however, is mostly a misperception, since most of the techniques that you’d use to back up a relational database can also be used for NoSQL databases.
If, for some distributed databases, grabbing a point-in-time snapshot of your data might be tricky, this is certainly not the case with Redis. In this section, we’ll explain how to achieve it depending on which Redis persistence model you’re using. We’ll assume you’re running your servers on Linux, although filesystem-specific functionality might also be available for other platforms.
Our proposed solution is heavily dependent on your Redis persistence model:
With the default persistence model (snapshotting), you’re best off using a snapshot as a backup.
If you’re using only AOF, you’ll have to back up your log in order to be able to replay it on startup.
If you’re running your Redis in VM mode, you might want to use an AOF log for the purpose of backups, as the use of snapshotting is not advised with VM.
It’s up to you to store your backup properly. Ideally, you’ll store at least a couple of copies of it, have at least one offsite, and do it in a fully automated way. We’ll try to explain how to do backups for the different persistence models, but be sure to test your own procedures. Be sure to also test your recovery procedures regularly.
Keep in mind that backing up your data might increase the strain on your production systems. It’s probably a good idea to perform the backups on a slave Redis instance, and to actually have slaves running at all times because promoting a new server to master is probably quicker than restoring a backup.
Snapshotting is the default Redis persistence model. As mentioned earlier, depending on your settings, Redis will persist its data to disk if m keys changed in n seconds. When using this persistence mode, performing a backup is really simple: all you have to do is copy the current snapshot to another location.
Use a copy, not a move, because if Redis crashes and restarts and the snapshot is not there, you will end up losing all your data!
If you want a fresher snapshot, you can trigger one by issuing BGSAVE and then waiting for the dump file to be updated. Be sure to compress the snapshot before backing it up; that will probably reduce its size by at least a factor of 10.
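The copy-and-compress step can be sketched in a few lines of Python; the function name and the paths in the usage example are illustrative, not part of Redis:

```python
import gzip
import shutil

def backup_snapshot(dump_path, backup_path):
    """Write a gzipped copy of the RDB snapshot.

    The live dump file is only read, never moved: if the snapshot
    disappeared and Redis restarted, you would lose all your data.
    """
    with open(dump_path, "rb") as src, gzip.open(backup_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return backup_path
```

For example, backup_snapshot("/var/lib/redis/dump.rdb", "/backups/dump.rdb.gz") leaves the live dump untouched and writes a compressed copy you can ship offsite.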
Restoring a snapshot file is also quite simple. Simply shut down
the server, put the snapshot you want to restore in the
dbfilename location configured by redis.conf, and then start the server. This
order is important, because when Redis shuts down, it performs a
snapshot, thus overwriting this file.
If you’re using the AOF as the only persistence mode (you can also use it together with snapshotting), the easiest way to do a backup is still to use a snapshot as described in the previous section. However, if you’re using AOF, you’re most likely worried about losing data between snapshots. You may also be avoiding snapshots because they put too much load on your server. In either case, you can back up the log itself by copying it to another location.
In order to recover when using the AOF, just do the same procedure you would for snapshotting, but instead put your backup in the AOF location. On startup, Redis will simply replay the log.
Be sure to remember to run BGREWRITEAOF regularly if you’re using the AOF.
Should your Redis server refuse to start due to a corrupted AOF—which can happen if the server crashes or is killed while writing to the file—you can use the redis-check-aof utility to fix it:

redis-check-aof --fix filename
If you are running Redis in VM mode, be sure to understand the tradeoffs. Starting or stopping your server will take a long time if you have a big dataset. Performing a snapshot in order to back up your data might also take a long time. Nevertheless, if your Redis instances are running with VM enabled, you should still perform backups. But you’re probably best doing it in a slave that is not too busy serving requests.
If you have a big database, using BGSAVE to perform a snapshot is probably not feasible. You’re most likely better off using AOF and rewriting it at regular intervals (depending on how often your data changes, but you probably don’t want to do this too often). Beware that while a BGREWRITEAOF is in progress, Redis will not write values to the VM, so your memory usage might increase while it’s processing these background operations.
Since you’re backing up your AOF, the restoring procedure is exactly the same as in the previous section: just copy over your AOF and start Redis.
Sharding is a horizontal partitioning technique often used with databases. It allows you to scale them by distributing your data across several database instances. Not only does this allow you to have a bigger dataset, since you can use more memory, it also helps if CPU usage is the problem, because you can distribute your instances across different servers (or servers with multiple CPUs).
In Redis’s case, sharding can be easily implemented in the client library or application.
Since Redis Cluster is still under development and should only be released sometime later in 2011—with a beta most likely arriving in the summer—sharding is a useful technique for scaling your application when your data no longer fits in a single server.
Currently there are three possibilities when it comes to sharding Redis databases:
- Use a client with built-in sharding support
At this point, most Redis clients don’t support sharding. Notable exceptions are:
Predis, Redisent, and Rediska (PHP clients); Jedis (a Java client); and scala-redis (a Scala client).
- Build sharding support yourself on top of an existing client
This involves some programming that might not be too hard if you understand your dataset and applications thoroughly. At the very least, you’ll have to implement a partitioning rule and handle the connections to the different servers.
- Use a proxy that speaks the Redis protocol and does the sharding for you
Redis Sharding is a multiplexed proxy that provides sharding to any Redis client. Instead of connecting directly to your Redis servers, you start a proxy and connect to it instead. Unfortunately, at this moment Redis Sharding doesn’t support resharding on the fly, so you’ll be unable to change the configuration of the cluster while the proxy is running.
If you decide to implement sharding yourself, you should probably use consistent hashing. This will ensure a minimal amount of remapping if you add or remove shards.
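A minimal consistent-hashing ring can be sketched in Python; the class name and the replica count are arbitrary choices for illustration, not part of any Redis client:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map keys to shard names; adding or removing a shard only remaps ~1/N keys."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node; more points = smoother spread
        self._hashes = []         # sorted hash points on the ring
        self._nodes = {}          # hash point -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            point = self._hash("%s:%d" % (node, i))
            bisect.insort(self._hashes, point)
            self._nodes[point] = node

    def remove(self, node):
        self._hashes = [h for h in self._hashes if self._nodes[h] != node]
        self._nodes = {h: n for h, n in self._nodes.items() if n != node}

    def get_node(self, key):
        if not self._hashes:
            raise KeyError("empty ring")
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._nodes[self._hashes[idx]]
```

With ring = ConsistentHashRing(["redis-a:6379", "redis-b:6379", "redis-c:6379"]) (hypothetical server names), ring.get_node("user:1000") tells you which server owns the key, and removing a node only remaps the keys that lived on it.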
Sharding doesn’t remove the need for replication. Make sure your cluster is redundant so that the loss of a server doesn’t imply any loss of data. Jeremy Zawodny described on his blog the setup used at Craigslist, and Salvatore has written on the subject as well.
Something else to keep in mind is that (depending on your implementation) you will not be able to perform some operations that affect multiple keys, because those keys might be in different shards (servers). If you rely on these operations, you’ll need to adjust your hashing algorithm to ensure that all the required keys will always be in the same shard.
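One common way to make such an adjustment is a key-tag convention: hash only a tagged substring of the key, so related keys always land on the same shard. The {tag} syntax below is an assumption chosen for this sketch, not something plain Redis enforces:

```python
import hashlib
import re

# Keys may embed a tag in braces, e.g. "{user:42}:friends"; only the tag is hashed.
TAG_RE = re.compile(r"\{([^}]+)\}")

def shard_for(key, num_shards):
    """Return the shard index for a key; keys sharing a {tag} share a shard."""
    match = TAG_RE.search(key)
    hashable = match.group(1) if match else key
    return int(hashlib.md5(hashable.encode()).hexdigest(), 16) % num_shards
```

Here both {user:42}:friends and {user:42}:posts hash the substring user:42, so a multi-key operation over them stays within one shard.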