Chapter 1. An Introduction to Redis

This chapter discusses some of Redis’s basic concepts. We’ll look into when Redis is a great fit, how to install the server and command-line client on your machines, and Redis’s data types.

When to use Redis

Problem

Nearly every application has to store data, and often lots of fast-changing data. Until recently, most applications stored their data using relational database management systems (RDBMS for short) like Oracle, MySQL, or PostgreSQL. More recently, however, a new paradigm of data storage, NoSQL, has emerged from the need to store schema-less data in a more effective way. Choosing whether to use SQL or NoSQL is often an important first step in the design of a successful application.

Solution

There are two important things to consider when choosing whether to use SQL or NoSQL to store your data: its nature and your usage pattern. Some data is a great fit for a relational storage engine, while other data benefits from the schema-free nature of a NoSQL engine like Redis or its alternatives. If you don’t rely on a particular RDBMS feature and need the performance or scalability of a NoSQL database, that might in fact be the ideal choice. So in order to decide whether your data should be stored in an RDBMS or a NoSQL engine, you need to look into a few specific things that will help you make a decision. Also bear in mind that quite often the ideal solution will be to use both.

Are your application and data a good fit for NoSQL?

When working on the web, chances are your data and data model keep changing with added functionality and business updates. Evolving the schema to support these changes in a relational database is a painful process, especially if you can’t really afford downtime, which is most often the case these days because applications are expected to run 24/7. As a case in point, in a recent presentation on MongoDB, Jeremy Zawodny of Craigslist mentioned how changing the schema on their database typically takes a two-month-long toll on their post archival service.

Examples of data that are a particularly good fit for nonrelational storage are transactional details, historical data, and server logs. These are normally highly dynamic, changing quite often, and their storage tends to grow quite quickly, further compounding the problem of adjusting the schema to store them. They also don’t typically feel “relational”; that is, the data in them doesn’t tend to fan out in relationships to other types of data. That’s a good indication that they can use something other than an RDBMS.

Another way to gauge the fit for NoSQL is to look at whether you find yourself denormalizing your data for performance reasons, and no longer benefit from some of the advantages of a relational system, such as consistency and redundancy checks.

One thing to keep in mind is that NoSQL databases generally don’t provide ACID (atomicity, consistency, isolation, durability), or do it only partially. This allows them to make a few tradeoffs that wouldn’t be possible otherwise. Redis provides partial ACID compliance by design due to the fact that it is single threaded (which guarantees consistency and isolation), and full compliance if configured with appendfsync always, providing durability as well.
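As a rough illustration, the durability setting mentioned above corresponds to a redis.conf fragment like the one below. The appendfsync directive only matters when the append-only file is enabled, so appendonly yes is included here as an assumption about the intended setup. Expect slower writes with this configuration, since each write waits for the data to be synced to disk.

```
# Enable the append-only file so that writes are logged to disk.
appendonly yes
# Sync the append-only file to disk on every write.
appendfsync always
```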

Performance can also be a key factor. NoSQL databases are generally faster, particularly for write operations, making them a good fit for applications that are write-heavy.

All this being said, and even though NoSQL feels more flexible, there are also great arguments to be made for storing relational data in an RDBMS. If you have predictable data that is a great fit for normalization, you can reap the benefits of using a relational data storage engine. Always look at the data before making a decision.

Don’t believe the hype

NoSQL databases such as Redis are fast, scale easily, and are a great fit for many modern problems. But as with everything else, it is important to always choose the right tool for the job. Play to the strengths of your tools by looking at what you’re storing, how often you’ll access it, and how data (and its schema) might change over time.

Once you’ve weighed all the options, picking between SQL (for stable, predictable, relational data) and NoSQL (for temporary, highly dynamic data) should be an easy task. Doing this kind of thinking in advance will save you many headaches in future data migration efforts.

There are also big differences between NoSQL databases that you should account for. For example, MongoDB (a popular NoSQL database) is a feature-heavy document database that allows you to perform range queries, regular expression searches, indexing, and MapReduce. You should weigh all the factors when choosing your database. As we said earlier, the questions boil down to what your data looks like and what your usage pattern is.

For example, Redis is extremely fast, making it perfectly suited for applications that are write-heavy, data that changes often, and data that naturally fits one of Redis’s data structures (for instance, analytics data). A scenario where you probably shouldn’t use Redis is if you have a very large dataset of which only a small part is “hot” (accessed often) or a case where your dataset doesn’t fit in memory.

Installing Redis

Problem

You want to install Redis on your computer.

Solution

There are several ways to install Redis on your computer or server, but the best and most flexible option is to compile it yourself. Nevertheless, depending on your distribution or operating system, there are other options.

Discussion

Compiling From Source

Redis evolves very quickly and package maintainers have a hard time keeping up with the latest developments. Since Redis doesn’t have any external dependencies, compilation and installation are very straightforward, so we recommend you do it yourself. Redis should build cleanly in most Linux distributions, Mac OS X, Solaris, and Cygwin on Windows.

  1. Downloading the source

    You can download Redis from the official site or directly from the GitHub project, either using Git or your browser to fetch a snapshot of one of the branches or tags. This allows you to get development versions, release candidates, etc.

  2. Compiling

    Redis compilation is straightforward. The only required tools should be a C compiler (normally GCC) and Make. If you want to run the test suite, you also need Tcl 8.5.

    After unpacking the source and changing your terminal path to the source directory, just type:

    make

    This will compile Redis, which on a modern computer should take less than 10 seconds. If you’re using an x86_64 system but would like an x86 build (which uses less memory but also has a much lower memory limit), you can do so by passing along 32bit to Make:

    make 32bit

    After compiling Redis, particularly if you’re building a development version, you should run the test suite to ensure that the server is behaving as expected.

    make test

  3. Installing

    After compiling Redis, you can go ahead and run it:

    cd src && ./redis-server

    However, you might find it more convenient to install it to another location in your system. The Makefile will also help you do that:

    make install

    This will install Redis binaries to /usr/local/bin. If you wish to install to another location, you can pass it to make through the PREFIX variable. For instance:

    make PREFIX=/opt/local install

    This will install the binaries in /opt/local/bin.

    After installing the Redis server, you should also copy the configuration file (redis.conf) to a path of your choice, the default being /etc/redis.conf. If your configuration file is in a different path from the default, you can pass it along as a parameter to redis-server:

    /usr/local/bin/redis-server /alternate-location-for-redis-config.conf

Installing on Linux

Most modern Linux distributions have Redis packages available for installation, but keep in mind that these are normally not up-to-date. However, if you prefer to use these, the installation procedure is much simpler:

Debian/Ubuntu
sudo apt-get install redis-server
Fedora/Red Hat/CentOS
sudo yum install redis
Gentoo
sudo emerge redis

This approach has a few advantages: by using your package management system, you can more easily keep software up-to-date, and you’ll most likely get at least security and stability updates. Besides that, you’ll also get startup scripts and an environment more suited to your distribution (user accounts, log files, database location, etc.).

Installing on Windows

Although Redis is not officially supported on Windows for several reasons—notably the lack of a copy-on-write fork()—there is now a native port by Microsoft Open Technologies that implements CoW in userspace and therefore should have acceptable performance.

Beware that for performance and stability reasons, the Windows versions of Redis are not recommended for production use. Consider using a native or virtualized Linux/UNIX environment instead. Despite that, you might find these versions useful for development or testing.

The main disadvantage of using the Microsoft Open Technologies version is that, because it’s a fork of Redis, there is a lag incorporating version updates from the original.

Being a native solution, compilation is simpler as it requires only Visual Studio. You can get the source and follow the project on GitHub.

ServiceStack also maintains a page with Vagrant boxes that automatically install and start Redis, making them a convenient solution for development purposes.

Installing on Mac OS X

There are several ways to install Redis on Mac OS X. They all require you to have the Xcode developer tools installed, which include libraries and compilers. If you are a developer on a Mac, chances are you already have this package installed. If you don’t, you can either download it from Apple’s developer website or run “Install Developer Tools” on your Mac’s installation DVDs.

You can manually compile Redis from source by following the steps earlier in this chapter. Most people, however, prefer the convenience of a package manager such as Fink, MacPorts, or Homebrew. A Redis package isn’t available on Fink, so we’ll cover the other two.

Installing through MacPorts

MacPorts defines itself as “an easy to use system for compiling, installing, and managing open source software.” It is based on the FreeBSD Ports system, and to a large extent can be used in the exact same way.

In order to install Redis through MacPorts, you need to first install the package management system. There’s an extensive guide on how to do that at guide.macports.org. Once you’ve installed MacPorts, installing the Redis package is as simple as:

port install redis

Since Redis has no direct dependencies, the actual compilation and installation process is quite speedy. You will then be ready to start using Redis.

Installing through Homebrew

Homebrew is the latest entrant in the Mac package management scene. Being relatively new means that not every package you might be looking for is available on it—even though they make contributions very easy—but if you’re looking for a tool that developers use often, chances are that it’s going to be available through a Homebrew recipe.

You can install Homebrew by following the detailed instructions available over at GitHub, but it is usually as simple as running the following command:

ruby -e "$(curl -fsSLk https://gist.github.com/raw/323731/install_homebrew.rb)"

Once that’s done, you’ll be ready to install packages using the Homebrew recipes system. Installing Redis is just a matter of typing:

brew install redis

You can then run redis-server manually or register it with launchd, the Mac’s own service manager, so that it starts when you reboot your computer. You can edit the configuration file /usr/local/etc/redis.conf to tweak it to your liking, and then start the server:

redis-server /usr/local/etc/redis.conf

Using Redis Data Types

Problem

You need to understand Redis data types in order to make better use of them for specific applications.

Solution

Unlike most other NoSQL solutions and key-value storage engines, Redis includes several built-in data types, allowing developers to structure their data in meaningful semantic ways. Predefined data types add the benefit of being able to perform data-type specific operations inside Redis, which is typically faster than processing the data externally. In this section, we will look at the data types Redis supports, and some of the thinking behind them.

Discussion

Before we dive into the specific data types, it is important to look at a few things you should keep in mind when designing the key structure that holds your data:

  • Be consistent when defining your key space. Because a key can contain any characters, you can use separators to define a namespace with a semantic value for your business. An example might be using cache:project:319:tasks, where the colon acts as a namespace separator.

  • When defining your keys, try to limit them to a reasonable size. Retrieving a key from storage requires comparison operations, so keeping keys as small as possible is a good idea. Additionally, smaller keys are more effective in terms of memory usage.

  • Even though keys shouldn’t be exceptionally large, there are no big performance improvements for extremely small keys. This means you should design your keys in such a way that combines readability (to help you) and regular key sizes (to help Redis).

With this in mind, keys like c:p:319:t or user 123 would be bad—the first because it is semantically crude, and the latter because it includes whitespace. On the other hand, keys like cache:project:319:tasks, lastchatmessage, or 464A1E96B2D217EBE87449FA8B70E6C7D112560C are good, because they’re semantically meaningful. Note that the last example of an SHA1 hash is, while hard to guess and predict, semantically meaningful and quite useful if you are storing data related to an object for which you can consistently calculate a hash.
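The key-design guidelines above can be sketched in a few lines of Python. This is an illustrative helper, not part of Redis or of any client library: the names make_key and object_key are hypothetical, and rejecting whitespace reflects the naming convention used in this chapter rather than a restriction imposed by Redis.

```python
import hashlib


def make_key(*parts):
    """Join namespace parts with colons, e.g. cache:project:319:tasks."""
    segments = [str(part) for part in parts]
    for segment in segments:
        # Whitespace is rejected here only to follow the naming convention
        # described in the text; this check is an assumption of the sketch.
        if not segment or any(ch.isspace() for ch in segment):
            raise ValueError(f"bad key segment: {segment!r}")
    return ":".join(segments)


def object_key(payload: bytes) -> str:
    """Derive a repeatable key from an object's bytes using SHA1."""
    return hashlib.sha1(payload).hexdigest().upper()


print(make_key("cache", "project", 319, "tasks"))  # cache:project:319:tasks
print(len(object_key(b"example object")))          # 40 hexadecimal characters
```

Because object_key always returns the same value for the same bytes, any part of an application that holds the object can recompute the key without a lookup table.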

Strings

The simplest data type in Redis is a string. Strings are also the typical (and frequently the sole) data type in other key-value storage engines. You can store strings of any kind, including binary data. You might, for example, want to cache image data for avatars in a social network. The only thing you need to keep in mind is that a specific value inside Redis shouldn’t go beyond 512MB of data.
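As a small illustration of that limit, a client could check the size of a payload before storing it. This is a sketch in Python: the helper name fits_in_string_value is hypothetical, and the limit is taken here to mean 512 × 1024 × 1024 bytes, which is an assumption about how the 512MB figure is counted.

```python
# Assumed byte count for the 512MB limit mentioned in the text.
MAX_STRING_BYTES = 512 * 1024 * 1024


def fits_in_string_value(data) -> bool:
    """Return True if the payload is within the assumed size limit."""
    return len(data) <= MAX_STRING_BYTES


# Placeholder binary payload standing in for avatar image data (1024 bytes).
avatar = b"\x89PNG" + bytes(1020)
print(fits_in_string_value(avatar))  # True
```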

Lists

Lists in Redis are ordered lists of binary-safe strings, implemented on the idea of a linked list. This means that while getting an element by a specific index is a slow operation, adding to the head or tail of the data structure is extremely fast, as it should be in a database. You might want to use lists in order to implement structures such as queues, a recipe for which we’ll look into later in the book.
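To get a feel for this behavior without a running server, the sketch below uses Python's collections.deque as a stand-in for a Redis list. It only models the semantics: the comments name the Redis commands (RPUSH, LPUSH, LPOP) that play the same role, and the job names are made up for illustration.

```python
from collections import deque

# A deque stands in for a Redis list: cheap pushes and pops at both ends.
jobs = deque()

jobs.append("resize:avatar:1")     # like RPUSH: add at the tail
jobs.append("resize:avatar:2")
jobs.appendleft("urgent:email:9")  # like LPUSH: add at the head

first = jobs.popleft()             # like LPOP: take from the head
print(first)                       # urgent:email:9
print(list(jobs))                  # ['resize:avatar:1', 'resize:avatar:2']
```

Pushing at the tail and popping from the head gives first-in, first-out ordering, which is the basis of a simple queue.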

Hashes

Much like traditional hashtables, hashes in Redis store several fields and their values inside a specific key. Hashes are a perfect option to map complex objects inside Redis, by using fields for object attributes (example fields for a car object might be “color”, “brand”, “license plate”).
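The mapping from an object to a hash can be sketched as follows. The car values and the key car:42 are invented for illustration, and to_field_value_args is a hypothetical helper. The resulting list mirrors the arguments of an HSET command with several field/value pairs, a form accepted by recent Redis versions (older versions used HMSET for this).

```python
# Example object; the attribute values are made up for illustration.
car = {"color": "red", "brand": "Fiat", "license plate": "AB-12-CD"}


def to_field_value_args(obj):
    """Flatten a mapping into a field, value, field, value, ... sequence."""
    args = []
    for field, value in obj.items():
        args.extend([field, str(value)])
    return args


key = "car:42"
command = ["HSET", key] + to_field_value_args(car)
print(command)
```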

Sets and Sorted Sets

Sets in Redis are an unordered collection of binary-safe strings. Elements in a given set can have no duplicates. For instance, if you try to add an element wheel to a set twice, Redis will ignore the second operation. Sets allow you to perform typical set operations such as intersections and unions.

While these might look similar to lists, their implementation is quite different and they are suited to different needs due to the different operations they make available. Memory usage should be higher than when using lists.
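Python's built-in set type behaves in a similar way, so it can serve as a rough model of these semantics. The sketch below is not Redis itself; the comments name the Redis commands (SADD, SINTER, SUNION) that correspond to each step, and the element names are illustrative.

```python
parts_a = set()
parts_a.add("wheel")
parts_a.add("wheel")   # duplicate: ignored, like a second SADD of "wheel"
parts_a.add("engine")

parts_b = {"wheel", "door"}

print(len(parts_a))               # 2
print(sorted(parts_a & parts_b))  # ['wheel'], like SINTER
print(sorted(parts_a | parts_b))  # ['door', 'engine', 'wheel'], like SUNION
```

The results are sorted only to make the printed output stable, since sets have no inherent order.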

Sorted sets are a particular case of the set implementation in which each element is defined by a score in addition to the typical binary-safe string. This score allows you to retrieve an ordered list of elements by using the ZRANGE command. We’ll look at some example applications for both sets and sorted sets later in this book.
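The role of the score can be modeled with a plain dictionary. This is a simplified sketch rather than how Redis implements sorted sets: the zadd and zrange functions are hypothetical Python helpers named after the corresponding commands, and this zrange handles only non-negative indexes plus -1 for "up to the last element".

```python
# member -> score, mirroring how a sorted set attaches a score to each string
scores = {}


def zadd(member, score):
    """Add a member, or update its score if it is already present."""
    scores[member] = float(score)


def zrange(start, stop):
    """Return members ordered by score; stop is inclusive, -1 means the end."""
    # Members with equal scores are ordered by their string value.
    ordered = sorted(scores, key=lambda member: (scores[member], member))
    end = len(ordered) if stop == -1 else stop + 1
    return ordered[start:end]


zadd("alice", 120)
zadd("bob", 95)
zadd("carol", 150)
zadd("bob", 200)      # bob is not duplicated; only his score changes

print(zrange(0, -1))  # ['alice', 'carol', 'bob']
print(zrange(0, 0))   # ['alice']
```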
