Mediated Access to S3 with JetS3t

The S3 service can be a very effective platform for sharing information when its simple access control mechanisms meet your needs, but the level of control possible with the service’s ACL settings may not always be sufficient. Some scenarios are difficult or impossible to achieve with ACL settings alone; for example, you may wish to make your S3 storage available to customers or colleagues who do not have their own AWS accounts. In such cases you may need to provide your own intermediate service to mediate access to your S3 storage.

In this section we will demonstrate how to use tools available in the JetS3t Java library to mediate third-party access to your S3 storage. These tools include a client-side application, for interacting with S3 to upload and download files, and a server-side Gatekeeper component that decides whether the client, or user, should be authorized to perform these operations.

Note

Disclaimer: The JetS3t project was created by the author of this book.

There are a number of ways you could share your S3 storage with others. Let us take a look at a few of the options to see why we think the JetS3t tools are worth considering.

Public write permission via an ACL

The simplest way to allow third parties to upload files to your S3 buckets is to grant write permission to the general public. If you apply this ACL setting, anyone with S3 client software can upload files into the bucket and replace or delete existing objects. This makes it easy to grant access to others, but the disadvantages of this approach should be clear: anyone can upload, replace, or delete objects in your bucket.

If you grant public write access to a bucket, you cede a great deal of control over what happens in your S3 account. You make it possible for malevolent individuals to use your account to store and distribute their own files, and you make yourself vulnerable to the risk that such individuals will overwrite the files stored in your bucket by legitimate users.

You could make your bucket less of a target by not granting read access to the public. In that case, the bucket acts as a drop-box into which anyone can upload their files, but they cannot obtain a listing of the files that are stored there. Those who know the bucket name could still store and distribute their own files by making the objects they create publicly accessible. This approach is less risky than allowing complete public access, but it is still far from safe.

Intermediate Relay Server

A better way to share your S3 storage with others, while maintaining control over how it is used, is to provide your own server and software to act as a middleman between the client and S3. In this arrangement your server would allow trusted clients to log in and upload files, which the server would then relay to S3 for long-term storage.

This approach has a number of advantages. You can exert a great deal of control over who is able to access the server, using whatever authentication mechanism you prefer. You can use the server’s disk as a short- or long-term cache for files, so you are not wholly dependent on the S3 service being available. And your server can provide a simpler interface than that offered by S3; for example, it could accept file uploads via protocols not supported by S3 such as FTP or WebDAV.

There are also disadvantages to this approach. If your server is likely to handle a large amount of traffic, it will need enough processing power to receive and relay all the data, enough bandwidth to cope with the traffic, and enough disk space to store the files until they have been written to S3. This can lead to exactly the kind of infrastructure problems that the S3 service was designed to avoid. Worst of all, you will be paying double for bandwidth, because files must be uploaded twice: first to your intermediate server, then to S3; though you could minimize these fees by running your server in Amazon’s EC2 service.

Gatekeeper server

The third option for providing mediated access to your S3 storage, and the one we will pursue here, is to provide your own authorization server that acts as a gatekeeper to your S3 account. This server will receive requests from clients who wish to perform an operation on your account, and it will allow or deny this request based on criteria you define. If the client’s request is allowed, the server will send the client a preapproved, signed URI that the client’s software can use to interact with S3 to perform the operation. The Gatekeeper server only authorizes operations; it does not act as a mediator for the actual data transfer.
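
To make the idea of a preapproved, signed URI more concrete, here is a minimal sketch of the legacy S3 query-string signing scheme, in which an HMAC-SHA1 signature is computed over the HTTP method, the expiry time, and the resource path. This is not JetS3t's code. The class name SignedUrlSketch, the path-style https://s3.amazonaws.com host, and the parameter names are assumptions made for illustration. The sketch also assumes that the object key needs no URL escaping and that the request carries no Content-Type or x-amz- headers, which would otherwise have to be included in the signed string.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only: legacy S3 query-string authentication (HMAC-SHA1). */
final class SignedUrlSketch {

  /**
   * Builds a URL that lets its holder perform one HTTP method on one
   * object until the given expiry time (seconds since the Unix epoch).
   */
  static String sign(String method, String bucket, String objectKey,
      long expiresEpochSeconds, String accessKey, String secretKey) {
    String resource = "/" + bucket + "/" + objectKey;
    // Signed string: method, empty Content-MD5, empty Content-Type,
    // expiry time, then the resource path.
    String stringToSign =
        method + "\n\n\n" + expiresEpochSeconds + "\n" + resource;
    try {
      Mac mac = Mac.getInstance("HmacSHA1");
      mac.init(new SecretKeySpec(
          secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
      String signature = Base64.getEncoder().encodeToString(
          mac.doFinal(stringToSign.getBytes(StandardCharsets.UTF_8)));
      // The secret key never appears in the URL, only the signature does.
      return "https://s3.amazonaws.com" + resource
          + "?AWSAccessKeyId=" + URLEncoder.encode(accessKey, "UTF-8")
          + "&Expires=" + expiresEpochSeconds
          + "&Signature=" + URLEncoder.encode(signature, "UTF-8");
    } catch (Exception e) {
      throw new IllegalStateException("Could not sign URL", e);
    }
  }
}
```

Because the method is part of the signed string, a URL signed for GET cannot be reused for DELETE. This is what allows a gatekeeper to approve individual operations without ever revealing its AWS credentials.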

Like the other approaches, this option has some disadvantages. It requires that you run your own gatekeeper authorization server to generate signed URIs for clients, and it also requires that your clients use specialized software that can communicate with the gatekeeper and interact with S3, using signed URIs instead of AWS credentials.

Despite these drawbacks, this approach offers compelling advantages. Because the client’s software interacts directly with S3 to upload or download files, your bandwidth expenses are less than they would be with an intermediate server, and you do not have to worry about the intermediate server running out of space or bandwidth. Also, because the gatekeeper server merely receives client requests and responds with signed URIs, it does not need to be very powerful.

Finally, this approach really highlights the power of S3’s URI-signing feature and demonstrates how it can be used in a nonobvious way, which makes it an interesting example in its own right. The fact that much of the work has already been done in an open-source toolkit means there is little reason not to try it out.

JetS3t Gatekeeper

The JetS3t project (http://www.jets3t.dev.java.net/) is an open-source suite of Java tools for working with S3 that includes an API implementation and a few applications. The examples in this section are based on version 0.6.0 of the JetS3t project. The applications we are most interested in are the Gatekeeper servlet, which acts as an authorization service to generate signed URIs for clients, and the Cockpit Lite application, which is an S3 client program that interacts with S3 using signed URIs received from the gatekeeper. Figure 4-2 shows the interaction between the Gatekeeper servlet and the Cockpit Lite client application.

Figure 4-2. The gatekeeper mediating clients’ access to S3

To begin, you must download the JetS3t distribution from the project’s web site and unzip it. You must also have Java version 1.4.2 or later installed on both the server where the gatekeeper will run and on client computers that will run the Cockpit Lite client application.

Deploy the Gatekeeper servlet

The Gatekeeper authorization application is a standard Java servlet. To run the servlet, you must first install a servlet container. In this example we will use the open-source Apache Tomcat servlet container version 5.5 (http://tomcat.apache.org/). To install Tomcat, download the core installation archive appropriate for your computer system from the web site and install or decompress it.

Note

In this example we will assume you are running the Linux operating system and installing all the software manually. If you are using Windows, you can use Tomcat’s setup.exe installer to do the hard work and take advantage of the extra graphical user interface (GUI) tools available on that platform.

Once the Tomcat core server is installed, start it up and confirm that you can visit the default Tomcat welcome page at http://localhost:8080/.

$ cd apache-tomcat-5.5.25/bin/
$ ./startup.sh

With Tomcat running, you can deploy the Gatekeeper servlet by copying the pre-prepared Gatekeeper web archive (WAR) file from the JetS3t distribution directly into Tomcat’s webapps directory. After a short delay, Tomcat should notice that the new file is present and will automatically decompress and run the servlet.

# Deploy the Gatekeeper WAR file to Tomcat's webapps directory
$ cp jets3t-0.6.0/servlets/gatekeeper/gatekeeper-0.6.0.war \
     apache-tomcat-5.5.25/webapps/

# Confirm that Tomcat has noticed the new servlet and started running it.
$ ls apache-tomcat-5.5.25/webapps/gatekeeper-0.6.0
META-INF 
WEB-INF

By default, the pre-prepared Gatekeeper servlet is configured to make testing easy by authorizing all client requests. We will tighten up the security settings after we have tested the servlet. Confirm that the gatekeeper is running by visiting the servlet’s URL, http://localhost:8080/gatekeeper-0.6.0/GatekeeperServlet, in your web browser. You should see a brief welcome page stating that the servlet is running. If the servlet is not available, try stopping Tomcat and starting it again to allow it to recognize the servlet.

Note

The Gatekeeper servlet writes log messages to Tomcat’s default log files, especially logs/catalina.out. If you are experiencing problems, check this log file to see detailed debugging information.

Although the gatekeeper claims it is ready, you will have to configure it to know your AWS credentials and to tell it which bucket you will be making available through the servlet. Edit the gatekeeper’s web.xml configuration file stored in apache-tomcat-5.5.25/webapps/gatekeeper-0.6.0/WEB-INF/web.xml and set the appropriate values for the initialization parameters AwsAccessKey, AwsSecretKey, and S3BucketName. Here is the portion of the configuration file that contains the initialization parameters:

<init-param>
    <param-name>AwsAccessKey</param-name>
    <param-value>YOUR_AWS_ACCESS_KEY</param-value>
</init-param>
<init-param>
    <param-name>AwsSecretKey</param-name>
    <param-value>YOUR_AWS_SECRET_KEY</param-value>
</init-param>
<init-param>
    <param-name>S3BucketName</param-name>
    <param-value>YOUR_BUCKET_NAME</param-value>
</init-param>

Once this configuration is complete, the gatekeeper will be ready to respond to authorization requests made by the Cockpit Lite application. All of these requests will be allowed, and the users of Cockpit Lite will be able to do anything they wish, so do not make your gatekeeper publicly available until you read the authorization options in “Authorization with HTTP Basic” later in this chapter.

Configure and test Cockpit Lite

Cockpit Lite is an application that allows users to interact with S3 without requiring access to the account holder’s AWS credentials. Whenever the user performs an operation in Cockpit Lite, the application asks the gatekeeper to approve the operation and issue it with a signed URI corresponding to the task the user wishes to perform. The application can be run as a stand-alone program, or it can be made available in a web page as a Java applet.
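
The following sketch illustrates why a client holding a signed URI needs no AWS credentials: the request is an ordinary HTTP request to the signed address. It is not taken from Cockpit Lite. The class and method names are hypothetical, and the sketch only prepares an upload request without sending it.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Illustrative only: a client-side upload that relies on a signed URL. */
final class SignedUrlClientSketch {

  /** Prepares, but does not send, a PUT request to the signed URL. */
  static HttpURLConnection prepareUpload(String signedUrl) {
    try {
      HttpURLConnection conn =
          (HttpURLConnection) URI.create(signedUrl).toURL().openConnection();
      // The method must match the one the URL was signed for.
      conn.setRequestMethod("PUT");
      // A request body (the file contents) will be written by the caller.
      conn.setDoOutput(true);
      return conn;
    } catch (IOException e) {
      throw new IllegalStateException("Could not prepare request", e);
    }
  }
}
```

A caller would then write the file's bytes to conn.getOutputStream() and check conn.getResponseCode(). No Authorization header is added, because the signature in the query string is the only proof of permission S3 needs.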

To test Cockpit Lite, we will start by running it in standalone mode to make it easier to manage. Once we have confirmed it can communicate with the Gatekeeper servlet we have just deployed, we will demonstrate how to make it available as an applet on a web site.

The first step in configuring Cockpit Lite is to ensure that it knows where to find the Gatekeeper servlet, so it can request authorizations. Edit the application’s configuration file configs/cockpitlite.properties and make sure the gatekeeperUrl property refers to the URI of the Gatekeeper servlet you deployed, such as http://localhost:8080/gatekeeper-0.6.0/GatekeeperServlet.
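
A mistyped gatekeeperUrl value is an easy error to make at this step. As a rough sanity check, the sketch below parses properties text and confirms that the gatekeeperUrl property holds an HTTP or HTTPS address. Only the property name comes from the configuration file described above. The class and method names are hypothetical, and this is not how Cockpit Lite itself reads its configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
import java.util.Properties;

/** Illustrative only: checks the gatekeeperUrl property in properties text. */
final class GatekeeperUrlCheck {

  static URI readGatekeeperUrl(String propertiesText) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(propertiesText));
    } catch (IOException e) {
      throw new IllegalStateException("Could not parse properties", e);
    }
    String value = props.getProperty("gatekeeperUrl");
    if (value == null || value.trim().isEmpty()) {
      throw new IllegalArgumentException("gatekeeperUrl is not set");
    }
    URI uri = URI.create(value.trim());
    String scheme = uri.getScheme();
    if (!"http".equals(scheme) && !"https".equals(scheme)) {
      throw new IllegalArgumentException("gatekeeperUrl must be HTTP(S)");
    }
    return uri;
  }
}
```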

Run Cockpit Lite by invoking the startup script appropriate for your platform, either bin/cockpitlite.sh or bin/cockpitlite.bat. If all goes well, you will be presented with a graphical application for interacting with the S3 bucket you have made available via the gatekeeper. For more detailed instructions about using this application, please refer to the documentation available on the JetS3t web site.

If you intend to make the Cockpit Lite application available to other people, it will be much easier to direct them to a web page, rather than expecting them to obtain and install the JetS3t distribution. Fortunately, the distribution includes pre-prepared applet versions of Cockpit Lite, among other applications, in the directory applets. To deploy a browser version of the application, you can simply copy this applets directory and its contents to Tomcat’s root folder.

$ cp -R jets3t-0.6.0/applets apache-tomcat-5.5.25/webapps/ROOT

You must configure the applet version of Cockpit Lite to know the Gatekeeper URL in the same way as you did in the standalone version. Do this by editing the webapps/ROOT/applets/cockpitlite.properties file you copied into Tomcat.

Launch the applet by loading the pre-prepared web page now available at http://localhost:8080/applets/jets3t-cockpitlite.html. Because the applet needs to be able to read and write files on your computer, you will be prompted to respond that you wish to trust it. Answer “yes,” and the application should start up in your browser ready for work.

Authorization with HTTP Basic

The system we have just set up is interesting but is not really an improvement on using public ACL settings on your S3 bucket. Because the Cockpit Lite client application is communicating with a default gatekeeper, anyone who runs the Cockpit Lite application is granted full access to the contents of your bucket. Now that we have the basics in place, it is time to look at how we can control third-party access to your S3 account.

At the simplest level, you can control who has access to your bucket by controlling who can access the Gatekeeper servlet. Because the gatekeeper is provided by a web server, you can use the authentication mechanisms offered by the server to require Cockpit Lite users to authenticate themselves before they can access the gatekeeper. This approach is relatively easy to implement and uses commonly available and well-understood techniques, but it results in an all-or-nothing situation in which every authorized user has full access to the bucket. If this is all the control you need, you can implement simple authentication by turning on HTTP Basic authorization.

To require Cockpit Lite users to provide login information to access the gatekeeper, we will activate HTTP Basic authorization for the servlet. While we are doing this, we will also define two distinct access roles, “gatekeeper” and “gatekeeper-admin,” for normal users and administrators. We will take advantage of these two access roles to provide custom role-based authorizations in a later example.

Let us configure Tomcat to create two login users who will belong to the normal and administrative access roles. Edit the Tomcat users’ XML file apache-tomcat-5.5.25/conf/tomcat-users.xml to include the two user elements defined below. You can set the username and password values to any values you like.

<?xml version='1.0' encoding='utf-8'?>
<tomcat-users>
  . . .
  <user roles="gatekeeper" username="user" password="secret"/>
  <user roles="gatekeeper-admin" username="admin" password="secret"/>
</tomcat-users>

Restart the Tomcat server to force it to re-read this file and recognize the new users.

$ cd apache-tomcat-5.5.25/bin/
$ ./shutdown.sh
$ ./startup.sh

Now that we have user accounts, we can configure the Gatekeeper servlet to refuse requests from users who cannot authenticate themselves. Edit the servlet’s configuration file apache-tomcat-5.5.25/webapps/gatekeeper-0.6.0/WEB-INF/web.xml to include the additional security and login configuration elements defined below.

<web-app>
  . . .
  <security-constraint>
    <display-name>Gatekeeper Authorization</display-name>

    <web-resource-collection>
      <web-resource-name>Protected Area</web-resource-name>
      <url-pattern>/*</url-pattern>
    </web-resource-collection>

    <auth-constraint>
      <role-name>gatekeeper</role-name>
      <role-name>gatekeeper-admin</role-name>
    </auth-constraint>
  </security-constraint>

  <login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>Gatekeeper Authorization Required</realm-name>
  </login-config>

  <!-- Each security-role element declares exactly one role name. -->
  <security-role>
    <role-name>gatekeeper</role-name>
  </security-role>
  <security-role>
    <role-name>gatekeeper-admin</role-name>
  </security-role>

</web-app>

Once you have made these changes and Tomcat has recognized them (you may have to restart the server before Tomcat will notice), you can test the login requirements by visiting the Gatekeeper servlet’s URL in your web browser. You should be prompted for a username and password and, unless you enter the login credentials you configured in Tomcat’s users’ file, you will not be able to view the status web page. The results will be similar when you run the Cockpit Lite application; you will be prompted to enter your credentials to allow the application to access the gatekeeper.
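
It helps to see what HTTP Basic actually sends over the wire. In this minimal sketch, which uses a hypothetical class name, the client joins the username and password with a colon, Base64-encodes the result, and places it in an Authorization request header. Base64 is an encoding rather than encryption, so anyone who can observe the request can recover the password.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Illustrative only: how an HTTP Basic Authorization header is formed. */
final class BasicAuthSketch {

  /** Returns the value a client sends in the Authorization header. */
  static String headerValue(String username, String password) {
    String pair = username + ":" + password;
    return "Basic " + Base64.getEncoder()
        .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
  }

  /** Reverses the encoding, as any eavesdropper could. */
  static String decode(String headerValue) {
    String encoded = headerValue.substring("Basic ".length());
    return new String(
        Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
  }
}
```

For the example user defined in tomcat-users.xml, headerValue("user", "secret") yields "Basic dXNlcjpzZWNyZXQ=", and decode turns that straight back into "user:secret".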

Customizable authorization modules

The JetS3t Gatekeeper servlet is intended to provide an extensible authorization framework that allows you to implement an S3 authorization service as powerful as you need through some configuration and Java coding. The servlet performs the request authorization process by calling on a number of code modules, each of which is responsible for a different part of the authorization process. By implementing your own modules and configuring the gatekeeper to use your version instead of the default one, you can take complete control over any aspect of this process.

The Gatekeeper servlet uses four replaceable modules to authorize client requests and return results. Each of these modules is implemented as a Java class that extends one of the following four base classes:

BucketLister

Provides the client application with a list of the objects stored in an S3 bucket and informs it of the operations the user is allowed to perform. In the default implementation, the object listing includes the complete contents of an S3 bucket, though an alternative implementation may list only a subset of a bucket’s contents, depending on the identity of the client. The default implementation also tells the Cockpit Lite client that the user can perform any S3 operation, but an alternative might allow only a restricted set of operations.

Authorizer

Allows or denies the client’s requests to perform S3 operations. Methods in this class are provided with information they can use to make authorization decisions, such as details about the requested operation and information about the client making the request, including her IP address and username, if server authorization is turned on.

The gatekeeper runs each authorization request through the Authorizer to determine whether it should be passed on to the UrlSigner module. The default Authorizer implementation allows all requests. Alternative implementations might perform one or more of the following advanced functions:

  • Perform user authorization by comparing a user’s login credentials or point of origin information against a user database or directory service, like Lightweight Directory Access Protocol (LDAP).

  • Allow fine-grained access control by organizing users into roles with differing permissions. Some users may only be allowed to perform a limited set of S3 operations, but others could have full privileges.

  • Evaluate authorization requests based on the specific S3 object being accessed. Some users may have restricted access to some portions of a bucket’s object hierarchy.

  • Evaluate file upload (PUT) operations based on properties of the file that will be uploaded. File uploads could be restricted based on the name, content type, or size of the file.
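
The last two ideas in this list can be sketched as a small, self-contained policy class. Everything here is hypothetical and independent of the JetS3t API: the class name, the choice of an allowed key prefix, the image-only content type rule, and the size limit are assumptions made purely for illustration.

```java
/** Illustrative only: a hypothetical upload policy, not part of JetS3t. */
final class UploadPolicySketch {
  private final String allowedPrefix; // for example "incoming/"
  private final long maxSizeBytes;

  UploadPolicySketch(String allowedPrefix, long maxSizeBytes) {
    this.allowedPrefix = allowedPrefix;
    this.maxSizeBytes = maxSizeBytes;
  }

  /** Returns true only if every rule accepts the proposed upload. */
  boolean allowPut(String objectKey, String contentType, long sizeBytes) {
    // Restrict uploads to one portion of the bucket's object hierarchy.
    boolean inAllowedArea = objectKey.startsWith(allowedPrefix);
    // Restrict uploads by declared content type.
    boolean isImage = contentType != null && contentType.startsWith("image/");
    // Restrict uploads by declared size.
    boolean smallEnough = sizeBytes >= 0 && sizeBytes <= maxSizeBytes;
    return inAllowedArea && isImage && smallEnough;
  }
}
```

In a real Authorizer these values would come from the request details supplied by the client, so they are best treated as claims rather than verified facts.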

UrlSigner

Generates the signed URL strings that the Cockpit Lite application uses to interact with S3. Signed URLs can be created for the full set of S3 REST operations, including GET, PUT, HEAD, and DELETE. This module is invoked after the Authorizer module has authorized a request, so its only responsibility is to generate the signed URL.

The default implementation of this module can generate all the signed URLs necessary for Cockpit Lite to work, so it will be sufficient for most users. However, a customized version of this module could provide some very powerful features under the right circumstances.

A customized UrlSigner module could be used to remap object names on the fly, presenting a different view of S3 resources to the user than is actually stored in the service. This feature is possible because the signed URLs generated by the gatekeeper need not correspond to the object structure shown in Cockpit Lite. By remapping object names, you could partition a single S3 bucket into a number of logical pieces using hierarchical object names. You could then share this bucket among many users, each of whom would see and access only the portion of the object hierarchy that belongs to them.
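
One way to picture this remapping is as a pair of functions that translate between the object name a user sees and the name actually stored in the bucket. The sketch below is an assumption-laden illustration, not JetS3t code: the class name and the users/<username>/ naming scheme are invented for the example.

```java
/** Illustrative only: maps user-visible object names to stored names. */
final class KeyRemapSketch {

  private static String prefixFor(String username) {
    // Reject usernames that could escape their own partition.
    if (username.isEmpty() || username.contains("/")) {
      throw new IllegalArgumentException("Invalid username: " + username);
    }
    return "users/" + username + "/";
  }

  /** Name to sign a URL for, given the name the user asked for. */
  static String toStoredKey(String username, String visibleKey) {
    return prefixFor(username) + visibleKey;
  }

  /** Name to show the user, given a name stored in the bucket. */
  static String toVisibleKey(String username, String storedKey) {
    String prefix = prefixFor(username);
    if (!storedKey.startsWith(prefix)) {
      throw new IllegalArgumentException("Object belongs to another user");
    }
    return storedKey.substring(prefix.length());
  }
}
```

With this scheme, a request from the user alice for docs/report.txt would be signed for users/alice/docs/report.txt, and objects stored under another user's prefix would never be shown to her.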

TransactionIdProvider

Generates a transaction identifier that is meaningful for a specific application. This value may uniquely identify each authorization request message received from a client, or it may be used to group multiple request messages together into a single, logical transaction that shares a common identifier. The default implementation generates a random, globally unique identifier (GUID) for each authorization request. A custom implementation might obtain a more reliable value, such as a database sequence number, from an external system.
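
The two strategies described here can be contrasted in a few lines. This sketch is independent of the JetS3t API, and its names are hypothetical. An in-memory counter stands in for the external sequence source, such as a database, that a real implementation might consult.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: two ways to produce transaction identifiers. */
final class TransactionIdSketch {
  // Stand-in for an external sequence such as a database sequence number.
  private final AtomicLong nextSequence;

  TransactionIdSketch(long firstSequence) {
    this.nextSequence = new AtomicLong(firstSequence);
  }

  /** A random identifier, one per authorization request. */
  static String randomId() {
    return UUID.randomUUID().toString();
  }

  /** An ordered identifier drawn from the sequence. */
  String sequentialId() {
    return "txn-" + nextSequence.getAndIncrement();
  }
}
```

Random identifiers need no coordination between servers, whereas sequential identifiers are ordered and easier to correlate with records in other systems.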

Implement a custom authorization module

It is not possible to provide example alternative implementations for all the gatekeeper modules in this book. To explore these more fully, you will have to refer to the JetS3t project documentation. We will stick to demonstrating the functionality most likely to be useful to many readers: a customized Authorizer module that gives different permissions to different users, depending on whether they belong to the normal user or administrative user access role. Our module will prevent nonadministrator users from deleting objects from S3. (Although these users will not be able to delete objects, they will still be able to overwrite them with new files. A more sophisticated Authorizer module would disallow PUT operations that overwrite existing objects.)

To implement this customized behavior, we will need to do the following:

  • Define two Tomcat user accounts with the access roles “gatekeeper” and “gatekeeper-admin” (see “Authorization with HTTP Basic” for instructions).

  • Write a custom Authorizer implementation class.

  • Configure the gatekeeper to use our Authorizer class instead of the default one.

  • Build and deploy our customized Gatekeeper servlet.

To build your own version of the Gatekeeper servlet, you will need to decompress the source files provided with the JetS3t project, and you will need to have Sun’s Java Development Kit (JDK) version 1.4 or later installed on your system. To take advantage of the scripts provided with JetS3t, you will also need to install the Apache ANT build tool, which may be found at http://ant.apache.org/.

With these requirements in place, you are ready to create a new implementation of the gatekeeper’s Authorizer module. Create a new file in the JetS3t project’s source code directory, src/org/jets3t/servlets/gatekeeper/impl/ExampleAuthorizer.java. Edit this file to contain the code listed in Example 4-11.

Example 4-11. Custom gatekeeper Authorizer implementation: ExampleAuthorizer.java

package org.jets3t.servlets.gatekeeper.impl;

import org.jets3t.servlets.gatekeeper.Authorizer;
import org.jets3t.service.utils.gatekeeper.GatekeeperMessage;
import org.jets3t.service.utils.gatekeeper.SignatureRequest;
import org.jets3t.servlets.gatekeeper.ClientInformation;

/**
 * Authorizer implementation to disallow DELETE requests from 
 * users not in the 'gatekeeper-admin' role. 
 */
public class ExampleAuthorizer extends Authorizer {

  /**
   * Constructor - passes the servlet configuration to the base class;
   * no configuration parameters are required.
   */
  public ExampleAuthorizer(javax.servlet.ServletConfig servletConfig) 
      throws javax.servlet.ServletException 
  {
    super(servletConfig);
  }

  /**
   * Control which users can perform DELETE requests.
   */
  public boolean allowSignatureRequest(GatekeeperMessage requestMessage,
    ClientInformation clientInformation, SignatureRequest signatureRequest)
  {
    // Apply custom rules if this is a DELETE request.
    if (SignatureRequest.SIGNATURE_TYPE_DELETE.equals(
        signatureRequest.getSignatureType()))
    {            
      // Return true if the user is a member of the "gatekeeper-admin"
      // access role, false otherwise.
      return clientInformation.getHttpServletRequest()
          .isUserInRole("gatekeeper-admin");
    } else {
      // Requests for operations other than DELETE are always allowed.
      return true;            
    }
  }

  /**
   * Allow any user to obtain a listing of a bucket's contents.
   */
  public boolean allowBucketListingRequest(
    GatekeeperMessage requestMessage, ClientInformation clientInformation)
  {
    return true;
  }
  
}

This implementation code extends the Authorizer abstract class and implements two mandatory methods: allowSignatureRequest and allowBucketListingRequest. The only portion of this example code that does any real work is the allowSignatureRequest method. It checks whether a user has requested a DELETE operation and, if so, ensures that this user is a member of the access role called “gatekeeper-admin.” If a user is not a member of this access role, she will not be permitted to perform the delete operation.
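
The decision made by allowSignatureRequest can also be expressed as a plain function, which makes it easy to reason about and to extend. The sketch below is a hypothetical, self-contained restatement that does not use the JetS3t classes. It adds one rule that Example 4-11 does not implement: refusing PUT requests from nonadministrators when the object already exists, which closes the overwrite loophole mentioned earlier. The set of existing object names is assumed to be supplied by the caller.

```java
import java.util.Set;

/** Illustrative only: role-based rules for DELETE and overwriting PUTs. */
final class RoleRulesSketch {

  static boolean allow(String operation, boolean isAdmin,
      String objectKey, Set<String> existingKeys) {
    if ("DELETE".equals(operation)) {
      // Only administrators may delete objects.
      return isAdmin;
    }
    if ("PUT".equals(operation) && existingKeys.contains(objectKey)) {
      // Only administrators may replace an existing object.
      return isAdmin;
    }
    // All other operations, including PUTs of new objects, are allowed.
    return true;
  }
}
```

The existence check is only as current as the listing it relies on, so two users uploading the same new object name at nearly the same time could both be approved.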

To test this implementation, you must rebuild the Gatekeeper servlet application to include the implementation class, and you must configure it to use this implementation instead of the default one. To configure the Gatekeeper servlet to use an alternative Authorizer implementation class you must edit the servlet’s web.xml file. Rather than editing this file in the live Tomcat deployment, as we have previously, we will instead modify the predeployment file that is referenced by the ANT build scripts.

Copy the gatekeeper configuration file you edited previously in “Authorization with HTTP Basic” to the JetS3t directory.

$ cp apache-tomcat-5.5.25/webapps/gatekeeper-0.6.0/WEB-INF/web.xml \
     jets3t-0.6.0/servlets/gatekeeper-web.xml

Edit the copied file jets3t-0.6.0/servlets/gatekeeper-web.xml to change the AuthorizerClass initialization parameter to refer to your new Authorizer implementation class, following the example below.

<init-param>
  <param-name>AuthorizerClass</param-name>
  <param-value>
    org.jets3t.servlets.gatekeeper.impl.ExampleAuthorizer
  </param-value>
</init-param>
. . .

Run the ANT build script included with JetS3t to build a new Gatekeeper WAR file that includes the new Authorizer implementation and your modified configuration file.

$ cd jets3t-0.6.0
$ ant rebuild-gatekeeper

These commands will run the ANT build script build.xml to build the gatekeeper, and update the servlet WAR file with your changes. Copy the updated archive file servlets/gatekeeper/gatekeeper-0.6.0.war to Tomcat’s web application deployment directory.

$ cp jets3t-0.6.0/servlets/gatekeeper/gatekeeper-0.6.0.war \
     apache-tomcat-5.5.25/webapps/

If all goes well, the gatekeeper will reload and the new configuration will be applied. The next time you run Cockpit Lite, you will be unable to delete objects from S3 if you log in as the user who belongs in the “gatekeeper” access role. Log in as the user in the “gatekeeper-admin” access role instead, and you will once again be able to delete objects.

Next steps

In this chapter we have tried to keep the examples brief and simple to follow, but to make them so, we have avoided raising potential security or performance issues until now. If you intend to use the Gatekeeper servlet in a production system, or if you will expose it to the dangers of the Internet in any way, you will first have to secure it properly. The topic of securing web services is well beyond the scope of this book, but as a minimum you should consider taking the following steps:

  • Require all communication between Cockpit Lite and the Gatekeeper servlet to be transmitted using secure HTTP (HTTPS) instead of standard HTTP to prevent anyone from snooping on the transmissions.

  • Ensure that Tomcat is configured with security in mind by disabling any unnecessary servlets that are installed by default.

  • Consider protecting the Tomcat server by making it accessible only through a more hardened web server, like Apache.
