All day, every day, the Internet gets larger and larger, and recent estimates show simply phenomenal growth. Every minute, the Web is growing at a rate of over 500 new sites, while popular video hosting service YouTube is said to be receiving over 48 hours of new video content. Twitter users create over 100,000 new tweets, while Facebook users share over 600,000 new posts.
To the common observer, these numbers are amazing. To the developer, they are simply incredible, not only because of the sheer size of the applications required to support this growth, but also because of the infrastructure, in the form of servers and bandwidth, required to keep these services ticking over happily 24/7.
No more than 10 years ago, the way a developer such as yourself would bring a new website or application onto the Web was to go out and purchase a new server, configure it, and find a data center that would host the server in a rack for you. This data center would provide your server with the life support that it needed to carry out its task: bandwidth, power, cooling, and the list goes on. Only once you had all these items lined up would users on the Web be able to type in your URL and find themselves looking at your website.
Now fast-forward a few years. Developers increasingly have the option of virtualization. Virtualization is the practice of renting a “virtual” server from a third party (be it another company or an internal IT team) and using that in place of your own physical machine. A virtual server appears to the outside world to be just like a regular physical server; however, it exists only within the processes of a parent server, sharing that server’s resources and saving costs.
Because a physical server can contain multiple virtual servers, an industry sprang up in which companies would create data centers filled with machines that were sliced up and rented out to developers to use for their applications. This brought great gains for the developer: there was no longer a need to tie up capital in purchasing a physical server; there was no need to worry about maintaining the machine if things went wrong; and what’s more, it was available within hours rather than the days that it would have taken to procure and install a physical server.
The first decade of this century brought a new form of hosting: cloud computing. It emerged as demand for hosting grew and applications became larger. Pioneered at a massive scale by Amazon’s Elastic Compute Cloud (EC2), cloud computing allowed developers to “spin up” virtual instances of servers on the Web and interact with them in the same way as normal machines. The key difference between these servers and traditional virtual servers is that they are generally billed on a usage basis (usually by the hour), and at no point does the developer have any real idea of where the server is physically.
Cloud computing vendors such as Amazon do not rent out virtual servers; rather, they rent out a certain amount of computing capacity. It is up to the vendor where this capacity is provided from, and it is up to the vendor to provide and manage all the ancillary services that surround it.
Because the vendor is in complete control of the capacity and the ancillary services, new possibilities are available. For instance, suppose your website is mentioned on the Hacker News home page and you have 10 times the normal amount of traffic visiting your site. In the good old days of physical servers, your site would go down under the load and you’d be powerless to stop it, as you would first have to purchase new hardware and then configure and install it in your data center. With virtualization, you’d be better off: you’d be able to buy more servers from your vendor and have them up and running within a couple of hours, but until then your site would be down. With modern cloud-based hosts, you’d simply add more capacity to your application and instantly scale to meet the demand. What’s more, you wouldn’t be locked into this larger infrastructure, as you would with the other options; you simply scale up or down as necessary and pay only for the resources you have used.
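To make the "pay only for what you use" point concrete, here is a rough back-of-the-envelope sketch. Every figure in it is hypothetical (the price, the month length, the size and duration of the spike), and the method names are ours, chosen purely for illustration. It compares keeping peak capacity running all month against paying for peak capacity only during the hours of a traffic spike.

```ruby
# All figures below are hypothetical and exist only to illustrate the arithmetic.
HOURS_IN_MONTH    = 30 * 24 # 720 hours in a 30-day month
CENTS_PER_UNIT_HR = 5       # made-up price of one unit of capacity for one hour

# Cost when capacity is fixed at the peak level for the whole month.
def fixed_peak_cost_cents(peak_units)
  peak_units * HOURS_IN_MONTH * CENTS_PER_UNIT_HR
end

# Cost when you pay only for what runs: baseline capacity for most of the
# month, and peak capacity only during the hours of the traffic spike.
def usage_billed_cost_cents(baseline_units, peak_units, spike_hours)
  normal_hours = HOURS_IN_MONTH - spike_hours
  (baseline_units * normal_hours + peak_units * spike_hours) * CENTS_PER_UNIT_HR
end

# One unit of capacity normally, ten units during a six-hour spike.
fixed = fixed_peak_cost_cents(10)
usage = usage_billed_cost_cents(1, 10, 6)
puts "Fixed at peak: #{fixed} cents"
puts "Usage-billed:  #{usage} cents"
```

Real pricing models are more involved than this, but the shape of the comparison is the same: the shorter the spike relative to the billing period, the more usage-based billing saves over provisioning for the peak.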
Never used Heroku before? Jump on over to http://www.heroku.com and give it a spin. We won’t duplicate all of the site’s getting started material. Instead, we’ll focus on giving you in-depth insight into how Heroku works and how to get the most out of the platform.
If you’ve deployed to VPS or shared servers before, there are some differences (which we’ll cover later), but here is a quick list to check out before getting started:
- Ephemeral file system
  - You can write to and read from disk, but as soon as your server restarts (and it will) that’s all gone. Instead, use a shared file-storage system such as Amazon Simple Storage Service (Amazon S3). This also makes running on multiple machines easier.
- Shared state
  - If you want to store session data on your server, you’ll need to find a way to persist it across multiple machines if you want to scale out. To do this, you can use secure cookies and a distributed store such as Memcached.
- Dependency management
  - If you want to install external code libraries for your app, you’ll need to do it using a dependency management tool like Bundler for Ruby or Ivy for Java.
- Scale out, not up
  - Heroku currently offers only one server size (called a dyno); if you need more horsepower, use more dynos. If one dyno isn’t big enough to get your app to run, you should consider splitting your app into smaller services, all talking over HTTP. It works for companies like Google, Facebook, and even Heroku; maybe it can work for you.
- Logs
  - Once you get your app on Heroku, you might need to debug your application code by looking at the logs. Because Heroku is different from a VPS or a shared host, you can’t SSH or FTP into your box to see your logs. Instead, use the Heroku command line interface to run the following:

    $ heroku logs --tail

    This saves you from having to SSH into multiple machines at the same time. See The Logplex for more information.
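The "secure cookies" idea from the shared-state item in the list above can be sketched with nothing but Ruby's standard library. The point is that the session travels with the request and carries a signature, so any server holding the same secret can verify it without consulting local state. This is a minimal illustration under our own assumptions, not how any particular framework implements sessions: `SECRET`, `sign_session`, `verify_session`, and `secure_compare` are hypothetical names, and in practice you would more likely rely on your web framework's own cookie-session support.

```ruby
require "openssl"
require "json"

# Hypothetical secret. In a real app every server would read the same value
# from configuration rather than hard-coding it in source.
SECRET = "change-me-and-keep-me-out-of-source-control"

# Serialize the session hash and append an HMAC so tampering is detectable.
# Note: the payload is signed, not encrypted, so the client can still read it.
def sign_session(session, secret = SECRET)
  payload = [JSON.generate(session)].pack("m0") # strict Base64, no newlines
  digest  = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, payload)
  "#{payload}--#{digest}"
end

# Return the session hash if the signature matches, otherwise nil.
def verify_session(cookie, secret = SECRET)
  payload, digest = cookie.to_s.split("--", 2)
  return nil if payload.nil? || digest.nil?

  expected = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, payload)
  return nil unless secure_compare(expected, digest)

  JSON.parse(payload.unpack1("m0"))
rescue ArgumentError, JSON::ParserError
  nil
end

# Compare two strings without stopping at the first differing byte,
# so the comparison time does not reveal where they diverge.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize

  a.bytes.zip(b.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

cookie = sign_session("user_id" => 42)
puts verify_session(cookie).inspect
puts verify_session(cookie + "0").inspect
```

Because verification needs only the shared secret, it does not matter which server handles a given request. Larger or more sensitive session data is where a distributed store such as Memcached comes in, with the cookie holding only a session identifier.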
Unlike anything else you’ve probably used before, Heroku is a platform-as-a-service (PaaS) that has plenty of opinions on how you should run your code. Although these opinions may at first seem severe, you’ll get a flexible, scalable, fault-tolerant app that is a pleasure to run. If you don’t want to stick to the rules, your app will not be able to run on Heroku, and it probably won’t run well anywhere else.
At this point, if you haven’t already, it is worth quickly playing with Heroku so that you can get an application up and running. By doing this, you can get a quick overview of the deploy and build process, plus actually get an application out there on the Web in no time at all!
The first step, if you haven’t done this already, is to sign up for a Heroku account. Note that nothing we are going to do here will cost you anything, so you don’t need to worry about credit cards or charges for this exercise.
Once you’ve got an account, make sure you’ve got the Heroku Toolbelt installed. This toolbelt will make sure that you’ve installed everything necessary for getting an application up and running on Heroku.
Once installed, log in via your toolbelt at the command line:
$ heroku login
Enter your Heroku credentials.
Email: adam@example.com
Password:
Could not find an existing public key.
Would you like to generate one? [Yn]
Generating new SSH public key.
Uploading ssh public key /Users/adam/.ssh/id_rsa.pub
For the purposes of this exercise, we’ll be deploying a sample application that we’ve already put together, the code for which can be found on GitHub.
To get started, we need to clone this code to our local machine:
$ git clone https://github.com/neilmiddleton/ruby-sample
$ cd ruby-sample
Now that we’ve got the code, we can create an application to contain it:
$ heroku create
Creating blazing-galaxy-997... done, stack is cedar
http://blazing-galaxy-997.herokuapp.com/ | git@heroku.com:blazing-galaxy-997.git
Git remote heroku added
This creates the application on Heroku, ready and waiting for our code, and also attaches a git remote to our local codebase.
Now we can deploy:
$ git push heroku master
Counting objects: 6, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (5/5), done.
Writing objects: 100% (6/6), 660 bytes, done.
Total 6 (delta 0), reused 0 (delta 0)
-----> Ruby/Rack app detected
-----> Using Ruby version: ruby-2.0.0
-----> Installing dependencies using Bundler version 1.3.2
       Running: bundle install --without development:test --path vendor/bundle --binstubs vendor/bundle/bin --deployment
       Fetching gem metadata from https://rubygems.org/..........
       Fetching gem metadata from https://rubygems.org/..
       Installing rack (1.2.2)
       Installing tilt (1.3)
       Installing sinatra (1.1.0)
       Using bundler (1.3.2)
       Your bundle is complete! It was installed into ./vendor/bundle
       Cleaning up the bundler cache.
-----> Discovering process types
       Procfile declares types -> web
       Default types for Ruby/Rack -> console, rake
-----> Compiled slug size: 25.1MB
-----> Launching... done, v3
       http://blazing-galaxy-997.herokuapp.com deployed to Heroku
To git@heroku.com:blazing-galaxy-997.git
 * [new branch] master -> master
This has taken our code, pushed it to Heroku, identified it, and run a build process against it, making it ready for deployment.
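The build output above reports "Ruby/Rack app detected". For readers who have not met Rack before, the contract is small: a Rack application is any Ruby object that responds to `call(env)` and returns a status, a hash of headers, and a body that yields strings. The sketch below illustrates that contract only. `HelloApp` is a hypothetical stand-in and not the code in the sample repository, and it calls the app directly with a hand-made `env` instead of starting a web server, so it runs with no gems installed.

```ruby
# A Rack application is any object that responds to `call(env)` and returns
# a three-element array: [status, headers, body].
# HelloApp is a hypothetical example, not the sample repository's code.
class HelloApp
  def call(env)
    path = env.fetch("PATH_INFO", "/")
    body = "Hello from #{path}\n"
    headers = {
      "content-type"   => "text/plain",
      "content-length" => body.bytesize.to_s
    }
    [200, headers, [body]]
  end
end

# In a real project, a config.ru file would typically end with
# `run HelloApp.new`, and a web server would build `env` from each incoming
# HTTP request. Here we call the app by hand to show the contract.
status, headers, body = HelloApp.new.call("REQUEST_METHOD" => "GET", "PATH_INFO" => "/")
puts status
puts headers["content-type"]
puts body.join
```

Frameworks such as Sinatra, which appears in the dependency list in the build output, are built on top of this same interface.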
Now our application is live on the Internet! To verify this, open it now:
$ heroku open
Opening blazing-galaxy-997... done
Simple, huh?
Now that we’ve finished basking in the glory of our genius, read on to find out more about how Heroku works in the next chapter.