Chapter 4. Swift Basics
In Chapter 3 you learned about the architecture of Swift and how it stores data. In this final chapter of Part I, we’ll introduce you to the basics of accessing data in Swift. We’ll show you how to use cURL, a command-line tool, to perform simple operations on your Swift cluster. In Part II of this book we will go more deeply into the API, introduce a few client libraries, explore some of Swift’s advanced features, and learn how to build middleware with Swift.
Talking to the Cluster: The Swift API
Swift, like its animal totem, spends much of its time in motion. There are incoming requests to put data into the cluster, which result in multiple copies of the data being written to different nodes. There are also requests to get data out of the cluster, perhaps to restore a backup or to serve up content to websites or online games. For each request, Swift has to check who is making the request and whether it is allowed, before handling the request itself and responding. In addition to all that, there are the different server and consistency processes swarming behind the scenes, looking after the data. All this activity requires communication and coordination.
As mentioned in the previous chapter, the proxy server process is the only part of the Swift cluster that communicates with external clients. This is because only the proxy server process implements the Swift API. At this point, we can simply say that the Swift API is an HTTP-based set of rules and vocabulary that the proxy server process uses when communicating with external clients. We will cover the Swift API in greater detail in Chapter 5; for our purposes in this chapter, the salient point is that the proxy server process is the only process that can communicate outside the cluster, and that when it does, it listens for and speaks HTTP.
Using HTTP means that communication is done in a request-response format. Each request has a desired action (for example, uploading an object) that is expressed with an HTTP verb. Each response has a response code (for example, “200 OK”) that indicates what happened to the request. Because the proxy server process communicates in HTTP, so should users that want to communicate with it. Let’s look at (1) sending requests, (2) authorization and action, and (3) responses.
Sending a Request
A request to a Swift cluster includes the following:
- Storage URL
- Authentication information
- HTTP verb
- Optional: any data or metadata to be written
The format of the storage URL should seem familiar to anyone who has visited a website. Storage URLs have two basic parts:
- Cluster location: The first part of the storage URL is an endpoint into the cluster. It is used by the network to route your request to a node with a proxy server process running so your request can be handled.
- Storage location: The storage location is composed of one or more identifiers that make up the unique location of the data. It takes one of three formats, depending on whether you are trying to reach an account, a container within an account, or an object within a container.
Object names might contain a slash character (“/”), so pseudo-nested directories are possible.
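As a rough illustration of the two parts of a storage URL, the following Python sketch splits a URL into its cluster location and storage location. The function name `split_storage_url` is hypothetical, and the sketch assumes the `/v1/` layout used in this chapter's examples; it is not code from Swift itself.

```python
from urllib.parse import urlsplit


def split_storage_url(url, api_version="v1"):
    """Split a storage URL into a cluster location and a storage location.

    Assumes the path has the form /v1/<account>[/<container>[/<object>]].
    """
    parts = urlsplit(url)
    # Split into at most four pieces so that slashes inside the object
    # name (pseudo-nested directories) stay part of the object name.
    segments = parts.path.lstrip("/").split("/", 3)
    if len(segments) < 2 or segments[0] != api_version or not segments[1]:
        raise ValueError("not a storage URL: %r" % url)
    return {
        "cluster": "%s://%s/%s" % (parts.scheme, parts.netloc, api_version),
        "account": segments[1],
        "container": segments[2] if len(segments) > 2 and segments[2] else None,
        "object": segments[3] if len(segments) > 3 and segments[3] else None,
    }
```

Calling it on an account URL leaves the container and object as `None`, while an object name such as `photos/2014/object.jpg` comes back whole, slashes included.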
Tools using Swift offer one of two ways to handle authentication:
- Passing the authentication credentials in with the request each time
- Passing in an authentication token that you obtain by making a special authentication request before making any storage requests
This is similar to visiting a company and having the option to present your ID at the check-in desk each time you enter the building, or get a daily visitor’s badge that identifies you as someone who can come and go for the day without showing your ID again.
As Swift confirms that you are who you say you are, it also notes what data you are allowed to have access to. Swift uses this information once it knows what the request is for.
There is much more to say about both authentication and authorization, which is why we go into greater detail about them in Chapter 13. For now, the important thing to understand is that each storage request must include authentication information (either credentials or a token) because Swift will verify authentication each time.
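To make the two approaches concrete, here is a minimal Python sketch of the headers involved. It assumes a cluster that uses v1.0-style authentication, where credentials travel in `X-Auth-User` and `X-Auth-Key` headers and the authentication response returns `X-Auth-Token` and `X-Storage-Url`; your cluster's auth system may differ. The function names are hypothetical.

```python
def credential_headers(user, key):
    """Headers for the special authentication request (v1.0-style auth assumed)."""
    return {"X-Auth-User": user, "X-Auth-Key": key}


def parse_auth_response(headers):
    """Pull the token and storage URL out of an authentication response.

    HTTP header names are case-insensitive, so compare them in lowercase.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered["x-auth-token"], lowered["x-storage-url"]


def token_headers(token):
    """Headers to attach to every storage request once a token is in hand."""
    return {"X-Auth-Token": token}
```

In the visitor's badge analogy, `credential_headers` is showing your ID at the desk, and `token_headers` is wearing the badge for the rest of the day.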
The HTTP verb tells Swift which action you want to take. The verbs used with Swift are:
- GET: Downloads objects (with metadata), or lists the contents of containers or accounts.
- PUT: Uploads objects, creates containers, or overwrites metadata headers.
- POST: Updates metadata (accounts or containers), overwrites metadata (objects), or creates containers if they don’t exist.
- DELETE: Deletes objects or empty containers.
- HEAD: Retrieves header information, including the metadata, for the account, container, or object.
After successful authentication, Swift will examine a request to determine the storage location and action that is being requested, so it can check whether the request is authorized. Let’s look a little closer at authorization.
Authorization and Taking Action
Although a user might have valid credentials to access a Swift cluster, that user might not be authorized to take the action (HTTP verb) sent in the request. The proxy server will need to confirm authorization before allowing the request to be fulfilled.
For example, if you send in a request with your valid authentication information, but try to add an object to someone else’s account, you will be authenticated by the system but the request will be rejected because you are not authorized to write to someone else’s account.
If the action is authorized, the proxy server process will call on the correct nodes to fulfill the request. The nodes will return the results of the request, which the proxy server process then sends back to you as an HTTP response.
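The difference between authentication and authorization can be sketched as a toy decision function. Everything here (the token table, the ownership table, the read grants, and the function name) is hypothetical illustration data, not how Swift's auth middleware is implemented. The sketch only shows the order of the checks and the usual response codes: 401 when the caller cannot be identified, 403 when the caller is known but not allowed.

```python
# Hypothetical data: which user each token belongs to, who owns each
# account, and which users have been granted read access to an account.
TOKENS = {"token-bob": "bob", "token-alice": "alice"}
ACCOUNT_OWNERS = {"AUTH_bob": "bob", "AUTH_alice": "alice"}
READ_GRANTS = {("alice", "AUTH_bob")}


def check_request(token, verb, account):
    """Return an HTTP error code, or None if the request may proceed."""
    user = TOKENS.get(token)
    if user is None:
        return 401  # authentication failed: we don't know who this is
    if ACCOUNT_OWNERS.get(account) == user:
        return None  # owners may take any action in their own account
    if verb in ("GET", "HEAD") and (user, account) in READ_GRANTS:
        return None  # read-only access was granted to this user
    return 403  # authenticated, but not authorized for this action
```

With this data, Bob's attempt to `PUT` into `AUTH_alice` is authenticated but rejected with 403, which matches the example in the text.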
Getting a Response
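A response from Swift includes the following:
- Response code and description
- Data (optional, depending on request type)
Response codes fall into five classes, identified by their first digit:
- 1xx (Informational)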
- Example: 100 Continue
- 2xx (Success)
- Example: 200 OK
- Example: 201 Created
- 3xx (Redirection)
- Example: 300 Multiple Choices
- Example: 301 Moved Permanently
- 4xx (Client error)
- Example: 400 Bad Request
- Example: 401 Unauthorized
- 5xx (Server error)
- Example: 500 Internal Server Error
From this list you can tell that the 2xx codes are a good sign that your request was fulfilled. The headers that accompany the response might provide further information. If the request was to GET data, the data should be returned as well.
As you can see, although the request-response format is fairly simple (200 OK or 401 Unauthorized), there are a lot of moving parts that surround it to ensure that you can communicate with the cluster.
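Because the class of a response code is determined by its first digit, a client can sort responses with a few lines of code. The following Python sketch uses the standard library's `http.HTTPStatus` for the description; the function name `describe_status` is hypothetical.

```python
from http import HTTPStatus

# The five classes of HTTP response codes, keyed by first digit.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe_status(code):
    """Return a string such as '201 Created (Success)' for a response code."""
    status_class = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Not a code the standard library knows about.
        phrase = "Unknown"
    return "%d %s (%s)" % (code, phrase, status_class)
```

A client would typically treat any 2xx result as success and inspect the headers or body for anything else.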
Now that we have gotten an overview of requests and responses, let’s look at some of the tools you can use to make and receive them.
Users and administrators often carry out the HTTP request-response communication using a client application. Swift client applications can take many forms, from tools such as command-line interfaces (CLIs) to sophisticated graphical user interfaces (GUIs) and web-based software.
A CLI is all you need to perform simple operations on a Swift cluster. Because the native language of the Swift system is HTTP, a command-line tool such as cURL (shipped with most Unix-like operating systems) is the best way to communicate with Swift at a low level.
However, sending HTTP requests one at a time, and extracting all the relevant information from the results, can be a bit tedious. For this reason, most people prefer to use Swift at a higher level. For example, developers use Swift client libraries (discussed in Chapter 6) instead of making the underlying HTTP calls themselves.
We take an in-depth look at those higher level communications in the next part of this book; first, however, let’s look at basic communication with Swift.
We will show commands from two CLIs, cURL and Swift. Both of these allow a user to send requests, one line at a time, to a Swift cluster. We’ll discuss cURL first to show what the HTTP communication between a client and a Swift cluster looks like. Then we’ll show you the Swift CLI, which trades away some of the functionality that cURL provides, in order to offer a smaller but more human-readable set of commands.
Client for URLs (cURL) is a popular command-line tool for transferring data to and from a server using the URL syntax. It is often preinstalled on systems or is easily installed on the command line. cURL provides detailed control over HTTP requests, so it can handle all possible Swift requests. Because cURL takes the HTTP verbs explicitly, it is often used to provide examples for Swift.
cURL requests include the following:
- The -X option to provide the HTTP verb
- Authentication information (for now, we’ll just represent that with [...], but see Chapter 5 for the basics of authentication with cURL)
- Storage URL
- Data and metadata (optional)
curl -X <HTTP-verb> [...] <Storage-URL> <object.ext>
Let’s look at some sample HTTP requests for a user named Bob to see how cURL would be used for objects, containers, or accounts. One common way to use Swift is to give every user exactly one account. We will use that model here, so the storage URL for Bob’s Swift account might be http://swift.example.com/v1/AUTH_bob. For this example, here are the ways Bob would perform some common tasks in Swift:
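- Create a new container
curl -X PUT [...] http://swift.example.com/v1/AUTH_bob/container2
- List all containers in an account
curl -X GET [...] http://swift.example.com/v1/AUTH_bob
- Upload an object into a container (the trailing slash tells cURL to append the filename to the URL)
curl -X PUT [...] http://swift.example.com/v1/AUTH_bob/container1/ -T object.jpg
- List all the objects in a container
curl -X GET [...] http://swift.example.com/v1/AUTH_bob/container1
- Download an object
curl -X GET [...] http://swift.example.com/v1/AUTH_bob/container1/object.jpg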
We’ll go into much greater detail about how to use the cURL command with Swift in Chapter 5.
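For readers who think in code rather than command lines, the same requests can be built with Python's standard `urllib.request` module. This is only a sketch: the helper name `swift_request` and the token value are hypothetical placeholders, and the requests are constructed but not sent, because swift.example.com is not a real cluster.

```python
import urllib.request

STORAGE_URL = "http://swift.example.com/v1/AUTH_bob"
TOKEN = "placeholder-token"  # hypothetical; a real token comes from authenticating


def swift_request(verb, path="", data=None):
    """Build (but do not send) an HTTP request against Bob's account."""
    return urllib.request.Request(
        STORAGE_URL + path,
        data=data,
        method=verb,
        headers={"X-Auth-Token": TOKEN},
    )


create_container = swift_request("PUT", "/container2")
list_containers = swift_request("GET")
upload_object = swift_request("PUT", "/container1/object.jpg", data=b"image bytes")
list_objects = swift_request("GET", "/container1")
download_object = swift_request("GET", "/container1/object.jpg")

# To actually send one of these against a real cluster:
#     with urllib.request.urlopen(download_object) as response:
#         body = response.read()
```

Each `Request` carries the same three ingredients as the cURL commands: a verb, authentication information, and a storage URL, plus data when uploading.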
The Swift CLI is part of the python-swiftclient package and can be installed on any computer running Python 2.6 or 2.7. Detailed installation instructions can be found on the OpenStack website.
Just as the cURL CLI uses the curl command, the Swift CLI uses the swift command. The swift command simplifies things for users by saving some typing and making several common types of requests easy. However, this simplification comes at a cost: the Swift CLI (the command-line tool) is not able to do everything that Swift (the storage system) can. There are some types of HTTP requests that the Swift CLI does not yet know how to send.
One reason the swift command is popular is that it provides users with human-friendly verbs (upload instead of PUT) to use when communicating with a cluster. It then translates the commands into the appropriate HTTP verbs.
One drawback of the Swift CLI is that it requires you to pass in your authentication information with each command.
Swift CLI subcommands
The commands you type in to the Swift CLI and the corresponding HTTP requests it issues are:
- delete (DELETE): Delete a container or objects within a container
- download (GET): Download objects from containers
- list (GET): List containers in an account or objects in a container
- post (POST): Update metadata for account, container, or object (may also be used to create a container)
- stat (HEAD): Display header information for account, container, or object
- upload (PUT): Upload files or directories to a container
Let’s look again at the HTTP GET requests from the cURL section, this time using the swift command. Because the swift command requires the username, password, and authentication URL to be passed in with each request, a complete command would look like this:
swift download -U myusername -K mysecretpassword \
    -A https://swift.example.com/auth/v1.0 \
    container1 object.jpg
We will represent that authentication information with [...] to make the commands easier to read.
- List all containers in an account
swift list [...]
- List all the objects in a container
swift list [...] container1
- Download an object
The download subcommand sends a GET request to the object’s storage location:
swift download [...] container1 object.jpg
Custom Client Applications
Application developers can construct HTTP requests and parse HTTP responses using their programming language’s HTTP library, or they may choose to use open source Swift libraries to abstract away the details of the HTTP interface.
Open source client libraries for Swift are available for most modern programming languages.
A client uses the Swift API to make an HTTP request to PUT an object into an existing container. After receiving the PUT request, the proxy server process determines where the data is going to go. The account name, container name, and object name are all used to determine the partition where this object will live. A lookup in the appropriate ring is used to map the storage location (/account/container/object) to a partition, and to the set of storage nodes where each replica of the partition is assigned.
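The mapping from a storage location to a partition can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Swift's ring code: it hashes the path with MD5 and keeps the top bits of the hash as the partition number, but it omits the per-cluster hash salt that a real cluster mixes in, and the partition power, node names, and round-robin node assignment are all hypothetical stand-ins for the ring's real lookup tables.

```python
import hashlib
import struct

PART_POWER = 10  # hypothetical: 2**10 = 1024 partitions
NODES = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical nodes


def partition_for(account, container=None, obj=None, part_power=PART_POWER):
    """Map /account[/container[/object]] to a partition number."""
    names = [name for name in (account, container, obj) if name is not None]
    path = "/" + "/".join(names)
    digest = hashlib.md5(path.encode("utf-8")).digest()
    # Read the first four bytes as a big-endian integer and keep only
    # the top `part_power` bits.
    return struct.unpack_from(">I", digest)[0] >> (32 - part_power)


def nodes_for(partition, replicas=3):
    """Toy stand-in for the ring: pick `replicas` distinct nodes."""
    return [NODES[(partition + i) % len(NODES)] for i in range(replicas)]
```

Because the partition depends only on the hashed path, every proxy server process that shares the same ring arrives at the same partition and the same set of nodes for a given object.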
The data is then sent to each storage node, where it is placed in the appropriate partition. Once the majority of the writes have succeeded, the proxy server process can then notify the client that the upload request succeeded. For example, if you are using three replicas, at least two of the three writes must be successful. Afterward, the container database is updated asynchronously to reflect the new object in the container.
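The majority rule described above is simple arithmetic, shown here as a short Python sketch. The function names are hypothetical, and the formula is the plain "more than half" majority the text describes.

```python
def quorum_size(replicas):
    """Smallest number of successful writes that forms a majority."""
    return replicas // 2 + 1


def upload_succeeded(write_results):
    """Given one True/False result per replica write, decide the outcome."""
    successes = sum(1 for ok in write_results if ok)
    return successes >= quorum_size(len(write_results))
```

With three replicas, `quorum_size(3)` is 2, so an upload where one of the three writes fails is still reported to the client as a success.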
A request comes in to the proxy server process for /account/container/object. Using a lookup in the appropriate ring, the partition for the request is determined, along with the set of storage nodes that contains that partition. A request is sent to the storage nodes to fetch the requested resource. Once the request returns the object, the proxy server process can return it to the client.
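One way to picture the download path is as a loop over the nodes that hold the partition. The sketch below is an assumption-laden toy: the text does not say in what order or how many nodes are contacted, so trying the nodes one at a time is a choice made here for illustration, and `read_from_node` is a hypothetical callable standing in for the network request to a storage node.

```python
def fetch_object(nodes, read_from_node):
    """Return the object from the first node that has it, or None.

    `read_from_node(node)` is expected to return the object's bytes,
    or None if that node cannot supply the object.
    """
    for node in nodes:
        data = read_from_node(node)
        if data is not None:
            return data
    return None


# Usage with an in-memory stand-in for three storage nodes, one of
# which is missing its replica.
replicas = {"node1": None, "node2": b"jpeg bytes", "node3": b"jpeg bytes"}
body = fetch_object(["node1", "node2", "node3"], replicas.get)
```

Because each replica holds the same data, the client gets its object as long as at least one of the nodes can return it.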
In this chapter, we covered how to access a Swift cluster via the Swift API using command-line client tools (cURL and the Swift CLI). We also mentioned that custom client applications can be developed for the API with client libraries for popular programming languages. This introduction will serve you well as we explore using Swift in more depth in the following chapters.