Chapter 4. Hacking Elastic Beanstalk
The purpose of this chapter is to explore where Elastic Beanstalk ends, and where we can begin to adjust the system ourselves. We’ll start by delving into the way the Elastic Beanstalk instances integrate with the Elastic Beanstalk Service. If we understand this, we can slowly start to customize the image that runs our application (and the instances that are launched from it); we’ll change the logging, replace the OpenJDK with the Sun JDK, and replace Apache with Nginx. An interesting way to change an infrastructure is to take things out, which is exactly what we’ll do in the end: we’ll make the Elastic Load Balancer bypass Apache or Nginx altogether.
So, we understand how Elastic Beanstalk works and have sort of mastered the fundamentals. It is time to go a little bit further. Why not get our hands dirty and change those fundamentals? Chapter 2 introduced the concepts underlying the AWS services Elastic Beanstalk uses. There is no time to get into the details of working with every one of those services. If you want more details on how to create an AMI, for example, we suggest you read Programming Amazon EC2.
Building on top of Elastic Beanstalk, we can do all sorts of interesting things. Perhaps you want to use Nginx instead of Apache. Or you are contemplating just bypassing Apache for the Tomcat traffic. You might rely on features of the Sun JDK that are not (yet) implemented in the OpenJDK.
Well, the good news is, we can do all these things. And, even better, the hostmanager (the part of the image that AWS added) is available under the Amazon Software License. However, you have no control over the actual communication between the hostmanager and Beanstalk, so doing these things comes at a cost: you will have to keep an eye on changes to this service, and maintain your custom images yourself.
The Instance
Elastic Beanstalk comes with default AMIs, for 32-bit and 64-bit instances. These images launch into instances that basically contain two things:
Everything related to running Tomcat (6 or 7)
The hostmanager, which is used to communicate with the Beanstalk environment
The hostmanager is a Ruby application. It takes care of starting and stopping necessary applications, handling deploys, and other Beanstalk-related tasks, like restarting the application servers. The Tomcat environment is a standard install, on top of the OpenJDK.
If you know your way around Linux (CentOS/RH in particular), you can inspect the instances. If you launch a separate instance, you can make your changes and create a custom AMI. The difficulty is that you can’t easily deploy your WAR and test if your changes work.
We launched an environment with one instance and made our changes there, so we could test them immediately. Sometimes, when we broke the instance, Beanstalk automatically replaced it, so you have to work quickly. Once we were happy with the changes, we replayed them on the separate instance and created our AMI. Changing the environment configuration to use the new AMI will show you whether you were successful.
Note
It is a bit difficult to work with Elastic Beanstalk instances like this. One way to prevent accidental termination by an eager Elastic Beanstalk is to use Termination Protection, which you can enable in the Console.
If you use Termination Protection, Elastic Beanstalk will still replace your instance if it does not show up healthy. But it can’t terminate it, so you don’t lose changes while working on it.
Logging
The Tomcat logging is verbose in the default images. If you want to change this, you’ll have to create a custom image in which you have edited /opt/tomcat7/conf/logging.properties. We changed this file to this:
    # ElasticBeanstalk Tomcat Logging
    handlers = 1monitor.java.util.logging.FileHandler, 2tail.java.util.logging.FileHandler

    # catalina.log for logrotate
    1monitor.java.util.logging.FileHandler.level = WARNING
    1monitor.java.util.logging.FileHandler.count = 1
    1monitor.java.util.logging.FileHandler.pattern = ${catalina.base}/logs/monitor_catalina.log
    1monitor.java.util.logging.FileHandler.append = true
    1monitor.java.util.logging.FileHandler.formatter=java.util.logging.XMLFormatter

    2tail.java.util.logging.FileHandler.level = WARNING
    2tail.java.util.logging.FileHandler.count = 1
    2tail.java.util.logging.FileHandler.pattern = ${catalina.base}/logs/tail_catalina.log
    2tail.java.util.logging.FileHandler.append = true
    2tail.java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter
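If you would rather script the level change than edit the file by hand, something along these lines should work. It is a minimal sketch: the function name set_tomcat_log_level is hypothetical, and it assumes the handlers follow the "<prefix>.java.util.logging.FileHandler.level = LEVEL" pattern of the file above.

```shell
#!/bin/sh
# set_tomcat_log_level FILE LEVEL
# Rewrites every uncommented "<prefix>...FileHandler.level = ..." line in
# FILE so that it uses LEVEL. The original file is kept as FILE.bak.
# (Hypothetical helper, not part of Elastic Beanstalk.)
set_tomcat_log_level() {
    file=$1
    level=$2
    sed -i.bak \
        "s/^\([^#]*FileHandler\.level\)[[:space:]]*=.*/\1 = ${level}/" \
        "$file"
}

# Example (on the instance, as root):
#   set_tomcat_log_level /opt/tomcat7/conf/logging.properties SEVERE
```

Only the level lines are touched; the count, pattern, append, and formatter settings stay as they are.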
Now, create the image and change the environment configuration.
Note
Since we are hanging around in /opt/tomcat7/conf anyway, you can see many other files. If you are familiar with Tomcat, you can find your way easily.
One thing we noticed is that there are only two AMIs, one 32-bit and one 64-bit. If you want to use other instance types for your particular app, you will definitely want to have a look at server.xml, to change the number of threads, for example.
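As an illustration of the thread tuning mentioned here, the following sketch inspects and rewrites Tomcat's maxThreads Connector attribute. The helper names are hypothetical, and it assumes the Connector in server.xml already carries an explicit maxThreads attribute; if it does not, you would have to add one by hand.

```shell
#!/bin/sh
# show_max_threads FILE
# Prints every maxThreads="N" attribute found in FILE.
show_max_threads() {
    grep -o 'maxThreads="[0-9]*"' "$1"
}

# set_max_threads FILE N
# Replaces the value of every existing maxThreads attribute with N,
# keeping the original file as FILE.bak.
set_max_threads() {
    sed -i.bak "s/maxThreads=\"[0-9]*\"/maxThreads=\"$2\"/g" "$1"
}

# Example (on the instance, as root):
#   show_max_threads /opt/tomcat7/conf/server.xml
#   set_max_threads  /opt/tomcat7/conf/server.xml 400
```

The value 400 is only a placeholder; a sensible number depends on the instance type and on your application.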
Sun JDK
There are things we can’t see from outside the instances. We can’t see what the memory usage is, for example. By logging in, we can see the instances themselves are fine. The memory appears to be OK as well. But we have a very limited test suite, so it doesn’t tell us very much.
There is an interesting tool we often use in other Tomcat environments, and that is VisualVM. In theory, it would be pretty straightforward to enable this type of profiling information in Tomcat, but it is not that easy.
First, it is available in the Sun JDK, and experimental in the OpenJDK that powers Beanstalk. But we still wanted to give this a try. Launching a separate medium instance from the Beanstalk AMI gives us something to work on. And after some time we were able to replace the OpenJDK with the Sun JDK by doing the following:
    $ rpm -e --nodeps java-1.6.0-openjdk.i686
    $ wget -O jdk-6u25-linux-i586-rpm.bin \
        http://download.oracle.com/otn-pub/java/jdk/6u25-b06/jdk-6u25-linux-i586-rpm.bin
    $ sh jdk-6u25-linux-i586-rpm.bin
    $ sed -i 's/\/usr\/lib\/jvm\(\/jre\)*/\/usr\/java\/jdk1.6.0_25/g' \
        /etc/java/java.conf \
        /etc/profile.d/aws-apitools-common.sh
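The sed expression above is hard to read because of all the escaped slashes. The sketch below applies the same substitution with "|" as the delimiter and runs it on two sample lines, so you can see what it does before pointing it at real files. The function name and the sample lines are hypothetical; the actual contents of the two files may look different.

```shell
#!/bin/sh
# rewrite_java_home
# Reads lines on stdin and replaces /usr/lib/jvm, optionally followed by
# /jre, with the Sun JDK directory. Same substitution as in the text, but
# with "|" as the sed delimiter so the slashes need no escaping.
rewrite_java_home() {
    sed 's|/usr/lib/jvm\(/jre\)*|/usr/java/jdk1.6.0_25|g'
}

# Hypothetical sample input, for illustration only.
printf '%s\n' \
    'JAVA_HOME=/usr/lib/jvm/jre' \
    'JVM_ROOT=/usr/lib/jvm' | rewrite_java_home
```

With the two sample lines, this prints JAVA_HOME=/usr/java/jdk1.6.0_25 and JVM_ROOT=/usr/java/jdk1.6.0_25.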
Note
We posted this solution in the AWS forum, and one of our colleagues (dhavala, going by the name of Kris) came up with an alternative way of installing the Sun JDK:
Re: installing Sun JDK
Posted by: Kris
Posted on: May 30, 2011 3:45 AM, in response to: truthtrap

When you remove OpenJDK using rpm -e --nodeps, you end up removing some symbolic links that are not created upon installing Sun JDK bin. Here are the commands for Tomcat, 64bit (similar to truthtrap's):

    cd ~
    wget -O jdk-6u25-linux-x64-rpm.bin http://download.oracle.com/otn-pub/java/jdk/6u25-b06/jdk-6u25-linux-x64-rpm.bin
    sudo chmod +x jdk-6u25-linux-x64-rpm.bin
    sudo ./jdk-6u25-linux-x64-rpm.bin
    sudo alternatives --install /usr/bin/java java /usr/java/default/bin/java 20000
    sudo update-alternatives --config java
    sudo ln -s /usr/java/default/jre /usr/lib/jvm/jre
    sudo ln -s /usr/share/java /usr/lib/jvm-exports/jre

(Optional) While you are at it, install PSI-Probe in the Dev environments to monitor your JVMs. Just copy probe.war to /usr/share/tomcat6/webapps, and start Tomcat using:

    /etc/init.d/tomcat6 start
We created the image, and it works perfectly. Now, supposedly, adding the following JVM Command-Line Options should make Tomcat ready for VisualVM style scrutiny, but we did not get this to work. This is not a “we leave this to the reader”; we have a book to finish. But if you know how to make this work, let us know:
    -Dcom.sun.management.jmxremote=true \
    -Dcom.sun.management.jmxremote.port=8086 \
    -Dcom.sun.management.jmxremote.ssl=false \
    -Dcom.sun.management.jmxremote.authenticate=false
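For reference, here is a sketch that collects these flags into a single line, ready to paste into the JVM command-line options. The function name jmx_opts is hypothetical. The optional second argument adds java.rmi.server.hostname, a standard RMI property that remote JMX clients sometimes need; whether it resolves the problem described here is untested and purely an assumption.

```shell
#!/bin/sh
# jmx_opts PORT [HOSTNAME]
# Prints the JMX flags from the text on one line. If HOSTNAME is given,
# -Djava.rmi.server.hostname is appended (an untested assumption, see above).
jmx_opts() {
    port=$1
    host=${2:-}
    opts="-Dcom.sun.management.jmxremote=true"
    opts="$opts -Dcom.sun.management.jmxremote.port=$port"
    opts="$opts -Dcom.sun.management.jmxremote.ssl=false"
    opts="$opts -Dcom.sun.management.jmxremote.authenticate=false"
    if [ -n "$host" ]; then
        opts="$opts -Djava.rmi.server.hostname=$host"
    fi
    printf '%s\n' "$opts"
}

jmx_opts 8086
```

Keep in mind that these flags turn off both SSL and authentication, so the JMX port should never be reachable from the outside world.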
Nginx
Apache is quite a beast. It does everything, basically, but at a cost. For heavy lifting (like our heystaq API calls), this is fine, but for other calls, the overhead of Apache is not always necessary. So, why not replace Apache with Nginx?
Replacing Apache with Nginx is a little bit more difficult than replacing the JDK. To do this we not only have to change the OS installation, but we also have to change the hostmanager.
We compiled most of the changes you have to make into this script:
    #!/bin/sh

    # install Nginx
    yum -y install nginx
    sed -i 's/ 1;/ 4;/g' /etc/nginx/nginx.conf
    echo 'MAKE SURE TO REMOVE THE server ENTRY FROM /etc/nginx/nginx.conf'

    # add Nginx to the infamous Beanstalk hostmanager
    cd /opt/elasticbeanstalk/srv/hostmanager/lib/elasticbeanstalk/hostmanager
    cp utils/apacheutil.rb utils/nginxutil.rb
    sed -i 's/Apache/Nginx/g' utils/nginxutil.rb
    sed -i 's/apache/nginx/g' utils/nginxutil.rb
    sed -i 's/httpd/nginx/g' utils/nginxutil.rb
    cp init-tomcat.rb init-tomcat.rb.orig
    sed -i 's/Apache/Nginx/g' init-tomcat.rb
    sed -i 's/apache/nginx/g' init-tomcat.rb

    # create the right proxies (Beanstalk and hostmanager)
    echo 'proxy_redirect off;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    client_max_body_size 10m;
    client_body_buffer_size 128k;
    client_header_buffer_size 64k;
    proxy_connect_timeout 90;
    proxy_send_timeout 90;
    proxy_read_timeout 90;
    proxy_buffer_size 16k;
    proxy_buffers 32 16k;
    proxy_busy_buffers_size 64k;' > /etc/nginx/conf.d/proxy.conf

    echo 'server {
        listen 80;
        server_name _;
        access_log /var/log/httpd/elasticbeanstalk-access_log;
        error_log /var/log/httpd/elasticbeanstalk-error_log;

        #set the default location
        location / {
            proxy_pass http://127.0.0.1:8080/;
        }

        # make sure the hostmanager works
        location /_hostmanager/ {
            proxy_pass http://127.0.0.1:8999/;
        }
    }' > /etc/nginx/conf.d/beanstalk.conf
Make sure to remove the server entry in /etc/nginx/nginx.conf; otherwise it won’t work.
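Because this manual step is easy to forget, a small check can help before you create the image. The sketch below is a hypothetical helper that reports whether a file still contains an uncommented server entry; it only recognizes entries where "server {" starts a line, so treat it as a reminder rather than a real parser.

```shell
#!/bin/sh
# has_server_block FILE
# Succeeds (exit status 0) if FILE contains a line that starts with
# "server {", ignoring leading whitespace. Commented-out lines and
# directives such as server_name do not match.
has_server_block() {
    grep -Eq '^[[:space:]]*server[[:space:]]*\{' "$1"
}

# Example (on the instance):
#   if has_server_block /etc/nginx/nginx.conf; then
#       echo 'server entry still present in /etc/nginx/nginx.conf' >&2
#   fi
```

Once the entry is gone, running nginx -t checks the syntax of the combined configuration.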
The Infrastructure
Not everything happens on the instances, of course. We can also hack the infrastructure. In the previous section we replaced Apache with Nginx, for example. We can easily ignore Apache altogether, and tell the load balancer to connect its port 80 to the Tomcat ports on the instances.
There is one thing we need to do for that, and that is make sure the Tomcat instances accept incoming connections on that port. Until a few weeks ago, we had to open up the security group to the world, but Amazon released a feature that allows us to open it up to a special security group that each Elastic Load Balancer has. You can find this security group in the Console, and it has a form like amazon-elb/amazon-elb-sg, where amazon-elb is the owner alias.
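With the EC2 API command-line tools, opening the instances' security group to the load balancer's group might look like the sketch below. The helper only prints the command, so you can review it before running it. The helper name and the group name elasticbeanstalk-default are placeholders; substitute the security group your environment actually uses.

```shell
#!/bin/sh
# elb_ingress_cmd GROUP PORT
# Prints (does not run) an ec2-authorize command that opens TCP port PORT
# on security group GROUP to the amazon-elb/amazon-elb-sg source group.
elb_ingress_cmd() {
    printf 'ec2-authorize %s -P tcp -p %s -u amazon-elb -o amazon-elb-sg\n' \
        "$1" "$2"
}

# "elasticbeanstalk-default" is a placeholder group name.
elb_ingress_cmd elasticbeanstalk-default 8080
```

Here -u is the owner alias and -o the group name, matching the amazon-elb/amazon-elb-sg form shown in the Console.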
And now, you can change the ELB from the command line to connect port 80 to 8080 (for easy testing you can make two distinct connections, 80 to 80 and 8080 to 8080):
    # first remove the old listener
    $ elb-delete-lb-listeners awseb-staging --lb-ports 80

    # and then bypass Apache by pointing port 80 directly at 8080
    $ elb-create-lb-listeners awseb-staging \
        --listener "lb-port=80,instance-port=8080,protocol=http"
Note
In general it is best to hide as much of your instances as possible, but this particular feature just wasn’t there yet. We expect the Elastic Beanstalk product team to implement these features in updates.
We could have removed Apache altogether, but that would have broken the hostmanager. This app runs on port 8999, but Beanstalk talks to /_hostmanager. Apache proxies this traffic as well. We chose to ignore Apache, and leave it be.
Conclusion
This chapter shows that, with a few recipes, you can get inside Elastic Beanstalk and customize—or hack—the provided images to your needs. It’s not so straightforward, but it’s possible. The next thing would be to create your own AMIs for Beanstalk, but that goes beyond being a user to being a contributor to Beanstalk.
We hope that by reading this book you learned how to use Elastic Beanstalk in standard and more advanced ways. If you understand what is happening underneath the surface of your application, the potential of what you can do with Elastic Beanstalk and AWS is big!