Cloud computing is a funny business. The term itself is sometimes derided by technical professionals as meaningless. Yet, companies of all sizes, from the largest enterprises to small start-ups, are taking a serious look at it. Even the luddites are taking notice. So what is cloud computing? Why is it here? Perhaps most importantly, how much does it cost? These are all topics covered in this book. The focus is on how to utilize the cloud as a tool, not how to create and operate a private cloud.
At a high level, cloud computing provides a way for developers to focus exclusively on coding. It provides a way for systems engineers to finally offload some of the projects that have crept up over the years and continue to eat into their time. With cloud technologies, architects can quickly prototype new technologies with minimum commitment and cost. These same technologies allow executives to better control and predict cost as well as remove much of the waste created by traditional computing. The differences between traditional computing and cloud computing are nuanced but many.
The rest of this chapter discusses the differences between the three primary cloud computing areas: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), pronounced “sass,” “pass,” and “i-a-a-s,” respectively. Keep in mind that the field is new and changing. Rapid innovation of this kind causes the lines between IaaS, PaaS, and SaaS to blur, sometimes significantly. It won’t always be obvious whether something is PaaS or IaaS, but the difference between them isn’t nearly as important as understanding how to properly use these technologies.
The idea of “Software as a Service” isn’t new, but the term SaaS is. SaaS simply refers to software that is provided on-demand for use. Traditionally, when someone wanted to use software they’d go to the store, pick up some disks, take them home, and install them on a computer. With SaaS, they just use hosted software. There’s no installation, no updates, no mess. There’s no magic to it. Anyone who has used web mail of any kind has been using SaaS.
SaaS has really come into maturity over the last decade. Some modern SaaS providers do a lot of fancy work behind the scenes to make things function properly. Compare what engineers had to do to run web mail in the late 90s to what the team at Google does to run Gmail today.
If one of those web mail admins from the 90s were brought forward in time and told to use Gmail, they’d be impressed for sure, but the basic workflow and usage would be very familiar to them. If that same sysadmin were told to start running Gmail, they’d likely be completely lost. That’s a common theme in cloud computing. Some aspects are so familiar, yet others are quite foreign when compared to traditional computing environments.
Infrastructure as a Service (IaaS) isn’t conceptually new. People have been colocating in data centers since data centers have been around. What is different with IaaS is the tooling behind it and where the lines of responsibility get drawn. Proper IaaS provides a mechanism for people to replace all of their data center hardware needs. Common IaaS services include:
Public and private network connectivity
Additionally, all of the dependencies for these services are also provided. This includes monitoring, power, cooling, repair, security, inventory tracking, and perhaps most importantly, people. Some IaaS providers even have convenient solutions to geographically diversify computing resources. All of this is provided at a cost that just isn’t possible with traditional computing. Typical rates for a host are pennies an hour.
In practical terms, this means that the time between deciding a host is needed and actually logging in to it has been reduced to a couple of minutes. Developers don’t have to put a large proposal together that includes servers, storage, network, rack space, installation, configuration, and so on. An entire proof of concept can be put together for the cost of a typical lunch. They don’t have to wait hours for sysadmins to provision a host. They don’t have to wait days for an order to get delivered and installed. Instead, with IaaS, they just need a little cash and a few minutes to pick what host they want.
The current front-runner for IaaS is Amazon Web Services (AWS), which is the brand Amazon.com has given to its cloud computing offerings. To give users an idea of the state of the art in IaaS, just look at AWS. The example below is a walk-through to create a new host via AWS’s Elastic Compute Cloud (EC2), which is the most common way users of AWS create hosts. This host is a virtual machine created on-demand with the parameters set via a wizard. Without getting too deep into IaaS, it is important to understand these core concepts, as almost all of cloud computing is based on an underlying IaaS layer.
This example illustrates the steps to create a new host. First, log in to https://aws.amazon.com/ (we’re assuming here an account has already been created).
Next, select the EC2 tab. This tab brings up the display shown in Figure 1-1.
The Launch button shown in Figure 1-1 starts the wizard to provision a new host. Notice in the lefthand navigation bar that there is a drop-down option for Region; in this example, US East (Virginia) is selected. This means the work done during this example will be in this region. Building hosts in Ireland or Tokyo is as simple as changing the drop-down.
As shown in Figure 1-2, select a Red Hat Enterprise Linux image. AWS calls such an image an Amazon Machine Image, or AMI. In its simplest terms, an AMI is a set of default configurations for a virtual machine as well as the underlying operating system. An AMI is somewhat analogous to a physical machine that is powered off.
Next, select the virtual machine’s details, as shown in Figure 1-3. Notice the bullet in the wizard refers to this as an “instance detail.” AWS refers to virtual machines as instances. Pick a micro instance for this demo.
The screen shown in Figure 1-4 allows users to create key and value pairs called tags. A default key called “name” is provided to identify the instance. By default these tags have no impact on the actual running host. They’re typically used to provide an easier way to track and manage AWS resources. Common tags might include owner, environment, and so on. This instance is called Demo1.
The screen shown in Figure 1-5 is important, and the keys it deals with are different from the tag keys mentioned on the previous screen. The key selected in this portion is the private SSH key used to authenticate with the virtual machine. If a new key is created, this is the step in which it would be downloaded. These are SSH public/private key pairs and it is considered best practice in AWS to use them to access a newly created instance.
Once booted, users can choose whatever authentication mechanisms they want. The public half of the key simply gets downloaded from a special AWS web address (from an instance it’s http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key) and then placed in /root/.ssh/authorized_keys. There are lots of handy tricks available at that address for customization of newly booted instances. See the EC2 User Guide and forums for more information.
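The key-installation step just described can be sketched in a few lines of Python. This is only an illustration, not the mechanism an AMI actually ships with: the function names fetch_public_key and install_key are hypothetical, the URL is the one quoted above and is reachable only from inside a running instance, and instances configured to require session tokens for metadata requests would need an extra step that is not shown here.

```python
import os
import urllib.request

# URL quoted in the text; it only answers from inside an EC2 instance.
METADATA_KEY_URL = (
    "http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key"
)


def fetch_public_key(url=METADATA_KEY_URL, timeout=2.0):
    """Return the public key text served at the metadata address."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def install_key(public_key, authorized_keys_path="/root/.ssh/authorized_keys"):
    """Append public_key to authorized_keys unless it is already listed.

    Returns True if the key was added and False if it was already present.
    """
    directory = os.path.dirname(authorized_keys_path)
    os.makedirs(directory, mode=0o700, exist_ok=True)

    existing = []
    if os.path.exists(authorized_keys_path):
        with open(authorized_keys_path) as handle:
            existing = [line.strip() for line in handle]
    if public_key in existing:
        return False

    with open(authorized_keys_path, "a") as handle:
        handle.write(public_key + "\n")
    # SSH refuses key files that other users can write to.
    os.chmod(authorized_keys_path, 0o600)
    return True
```

On an instance, calling install_key(fetch_public_key()) as root would perform both steps. Keeping the download and the file handling separate makes the second half easy to try out against a scratch directory.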
Figure 1-6 shows the Configure Firewall screen. This screen allows the user to select external firewall settings for the newly created host. These are not IPTABLES rules, but are instead enforced from outside of the actual host. AWS refers to a list of firewall rules as security groups. In this example a new security group is chosen, allowing port 22 (SSH) to all hosts.
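To make the idea of a security group concrete, here is a small Python model of one. It is a teaching sketch and not AWS code: the Rule class and the is_allowed function are invented for this illustration, and real security groups carry more detail, such as port ranges and outbound rules. The demo group mirrors the one chosen in the wizard, with port 22 open to all hosts.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One inbound rule: a protocol, a port, and the source range allowed."""
    protocol: str
    port: int
    source_cidr: str


def is_allowed(rules, protocol, port, source_ip):
    """Return True if any rule in the group admits this connection.

    Traffic that matches no rule is rejected, the same default-deny
    behavior inbound traffic gets from a security group.
    """
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        if (rule.protocol == protocol
                and rule.port == port
                and address in ipaddress.ip_network(rule.source_cidr)):
            return True
    return False


# The group from the walk-through: SSH (port 22) open to every address.
DEMO_GROUP = [Rule("tcp", 22, "0.0.0.0/0")]
```

With this group, is_allowed(DEMO_GROUP, "tcp", 22, "203.0.113.7") is True, while the same call with port 80 is False because no rule mentions that port.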
The last screen seen before launching a new instance is a simple review of all of the settings picked during the wizard. The Launch button shown in Figure 1-7 tells AWS to actually create and boot this new instance.
After launching the virtual machine, the provisioning process can be monitored on the instances page seen in Figure 1-8. In this case, it took less than a minute. Clicking on the instance reveals instance details, in particular the Public DNS entry. This is the entry point used to gain access to our newly created virtual machine.
It’s highly recommended to always refer to the public DNS entry when possible. When creating custom DNS names for these hosts, try to use a CNAME that points at the public DNS entry. Requests to the host from outside of AWS then resolve to the external IP address, while hosts internal to AWS resolve the same name to the internal address and use the internal AWS network. This private network uses addresses in the class A (10.x.x.x) range. All IaaS providers have some personality about things. Handy little tricks like this can be a lifesaver, but only to those who read the docs and know them.
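One quick way to see which side of the network a name resolves to is to check the answer against the 10.x.x.x range. The following Python sketch assumes the private range is exactly 10.0.0.0/8, as described above; the helper names is_internal and resolve are made up for this example.

```python
import ipaddress
import socket

# The private range described in the text (class A, 10.x.x.x).
INTERNAL_NETWORK = ipaddress.ip_network("10.0.0.0/8")


def is_internal(ip):
    """Return True if ip falls inside the 10.x.x.x private range."""
    return ipaddress.ip_address(ip) in INTERNAL_NETWORK


def resolve(hostname):
    """Look up hostname and return its IPv4 address as a string."""
    return socket.gethostbyname(hostname)
```

Running is_internal(resolve(name)) against an instance's public DNS entry from a laptop and again from another instance shows whether the same name is handing out the external or the internal address.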
Using the key pair downloaded from the “Create Key Pair” step, logging in to the newly created host is easy and can be seen in Figure 1-9.
Using APIs and scripts, this entire process can be automated, allowing users to create multiple instances at once. Understanding the concepts behind IaaS better illustrates how products and platforms layered on top of IaaS behave.
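As a rough idea of what that automation can look like, the sketch below uses boto3, the AWS SDK for Python, which is a third-party package the walk-through itself does not use. The helper names build_launch_request and launch are hypothetical, and every identifier a caller passes in (AMI ID, key pair name, security group ID) has to come from the caller's own account. The request mirrors the wizard's choices: an image, an instance type, a key pair, a security group, and a Name tag.

```python
def build_launch_request(ami_id, instance_type, key_name,
                         security_group_ids, name, count=1):
    """Build the keyword arguments for boto3's EC2 run_instances call."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        "SecurityGroupIds": list(security_group_ids),
        # Asking for the same minimum and maximum launches exactly `count`.
        "MinCount": count,
        "MaxCount": count,
        "TagSpecifications": [{
            "ResourceType": "instance",
            "Tags": [{"Key": "Name", "Value": name}],
        }],
    }


def launch(request, region="us-east-1"):
    """Send the request to EC2 and return the IDs of the new instances."""
    # Imported here so the request builder works without boto3 installed.
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**request)
    return [instance["InstanceId"] for instance in response["Instances"]]
```

Passing count=3 to build_launch_request and handing the result to launch would start three identical hosts in the US East (Virginia) region in a single call, which is the "multiple instances at once" case mentioned above. Running launch requires AWS credentials and incurs charges.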
Unlike IaaS and SaaS, PaaS is a much more abstract concept. Looking at cloud computing as an entire stack, PaaS would be in the middle of that stack, with IaaS at the bottom and SaaS at the top to interface with the end users and consumers. That’s not to imply all layers of the stack are required at once to consider yourself to be using cloud computing.
PaaS providers offer a platform for others to use. What is being provided is part operating system and part middleware. A proper PaaS provider takes care of everything needed to run some specific language or technology stack. Let’s take a look at what it would take to provide a Python development interface and what that means for the PaaS user.
In order for Python code to be run, developers need the Python runtime and some sort of interface to expose the Python code while it’s running. Some PaaS providers that support Python do it via a WSGI interface. Apache, with the mod_wsgi module, is one such way to run Python applications. Developers or PaaS providers need a WSGI script somewhere so the code can be loaded and exposed via a web address.
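For readers who have not seen one, a WSGI script can be very small. The following is a minimal sketch rather than what any particular PaaS provider ships; the response text is arbitrary. The callable is named application because that is the name mod_wsgi looks for by default.

```python
def application(environ, start_response):
    """A minimal WSGI application that answers every request with text."""
    body = b"Hello from a WSGI application\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # The server supplies start_response; calling it sends status and headers.
    start_response("200 OK", headers)
    # The return value is an iterable of byte strings forming the body.
    return [body]
```

To try it locally without Apache, the standard library's wsgiref.simple_server.make_server("", 8000, application).serve_forever() serves the same callable on port 8000. On a PaaS, the provider runs the equivalent of that server, which is exactly the work being handed off.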
To visualize this, think about Apache and Python at a public web address, with some storage on the back end and maybe a load balancer. That seems pretty simple to do. Don’t forget, though, that there’s a whole set of dependencies in order to get to that point. Apache needs some sort of operating system to function. It needs to be configured, maintained, and monitored. In cloud computing, this OS runs inside virtual machines.
This gets into the details of the IaaS layer mentioned earlier. PaaS providers may do this themselves, or partner with an IaaS vendor to get it done. The important point here is that everything from the running Apache/WSGI interface all the way down to the power and cooling in a data center is someone else’s responsibility. It is not the responsibility of the PaaS consumer.
Many PaaS providers offer multi-tenant solutions. This means that not only is the physical hardware shared among multiple virtual machines, but the virtual machines themselves may have several different applications from several different customers on them.
PaaS today focuses almost entirely on web solutions. The components an end user interacts with are all web-based and, because of this, most PaaS providers excel when it comes to large numbers of short-lived requests. PaaS providers have less polish when it comes to longer-running, more resource-intensive jobs that cannot be broken down into smaller jobs. For example, a large batch processing job is likely best placed a level down at the IaaS layer because of the finer control over memory it provides. Scale out, not up, is a common theme used throughout this book. It’s not a good or bad thing, but it’s a common architectural limitation. As PaaS matures, expect to see more offerings beyond web services.
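The difference between a job that can be broken down and one that cannot is easier to see in code. In this Python sketch the workload is a plain sum, chosen only because it splits cleanly; the function names are invented for the illustration, and on a real platform each chunk would be handled by a separate short-lived request or worker rather than a loop in one process.

```python
def split_into_chunks(items, chunk_size):
    """Split a list into consecutive pieces of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[start:start + chunk_size]
            for start in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    """Stand-in for one small unit of work; here it just sums the chunk."""
    return sum(chunk)


def process_all(items, chunk_size):
    """Combine the per-chunk results into the overall answer."""
    return sum(process_chunk(chunk)
               for chunk in split_into_chunks(items, chunk_size))
```

A job shaped like this scales out naturally, because adding more workers gets through the chunks faster. A job whose steps all need the full data set in memory at once has no equivalent of split_into_chunks, and that is the kind of work better suited to a large host at the IaaS layer.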