While evaluating hosting scenarios for our Ruby on Rails-based web application, I finally had a good reason to dig deeper into “cloud computing”. I had already been watching the evolution of Amazon’s web services like S3 and EC2, but a lot of the offerings now labelled as “cloud computing” were quite new to me. To better understand the range of offerings (and which ones might be suitable for hosting an application like ours), I tried to find a suitable categorization of cloud computing. What are the differences between all the offerings? What problems do they try to solve, and how do they fit my requirements for hosting our Ruby on Rails web application in a cloud?
Of course, I wasn’t the first one trying to categorize cloud computing. My initial research came up with quite a few examples: Demystifying Clouds, Cloud Computing Ectropy and yet another attempt to define cloud computing. I even found a dedicated research site.
Cloud Computing is a Very Hyped Topic
None of these sites completely cleared the cloudy image 😉 from my mind: too many different approaches, ranging from pure virtual servers like Amazon EC2 to complete applications in the cloud like Google Docs, are hyped as “cloud computing”. My goal was to find a hosting service for our application to replace a bunch of self-managed root servers. So I focused my categorization of cloud computing purely on one question: which service would best run our Ruby on Rails web application? To answer that question, I came up with five discrete levels of computing service, which I then mapped to existing cloud computing offerings.
When you rent physical “root” servers, you no longer have to deal with the “plumbing” of your physical servers. The hosting company takes care of things like power, cooling, connectivity, physical access security, rack space, and ordering and maintaining servers. Operating the data center is already outsourced, but everything else has to be done by you. This is our current level of operations.
We have to deal with physical boxes and make sure they can talk to each other by setting up routing between them. We also use them as the physical infrastructure for running our application in virtual servers. That means we have to build all the infrastructure for virtualization on our physical layer ourselves: storage, connectivity, Xen-enabled kernels and everything else required to run a Virtual Infrastructure (see below) on top of our physical servers. For example, we use disk images connected to loop devices stuffed into LVM volumes as storage for our Xen-based virtual machines. And we use bridged routing over a VPN between the physical hosts to enable the Xen-based virtual machines to talk to each other. All of that has to be managed and kept running all the time.
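To make the loop-device-in-LVM layout a bit more concrete, here is a minimal Ruby sketch of the commands involved in adding one more disk image to a volume group. It is not our actual tooling: the helper name, the paths, the loop device and the volume names are all made up for illustration, and the sketch only builds the command list instead of running anything.

```ruby
# Hypothetical helper: returns the shell commands that add a new disk image
# to an LVM volume group via a loop device. Nothing is executed here.
# All paths, device names and volume names are illustrative assumptions.
def extend_storage_commands(image:, size_mb:, loop_device:, volume_group:, logical_volume:)
  [
    # Create a sparse image file of the requested size
    "dd if=/dev/zero of=#{image} bs=1M count=0 seek=#{size_mb}",
    # Attach the image to a loop device so it looks like a block device
    "losetup #{loop_device} #{image}",
    # Turn the loop device into an LVM physical volume
    "pvcreate #{loop_device}",
    # Add the new physical volume to the volume group
    "vgextend #{volume_group} #{loop_device}",
    # Grow the logical volume into the newly available space
    "lvextend -l +100%FREE /dev/#{volume_group}/#{logical_volume}"
  ]
end

commands = extend_storage_commands(
  image: "/srv/xen/app1-disk2.img",
  size_mb: 10_240,
  loop_device: "/dev/loop3",
  volume_group: "vg_app1",
  logical_volume: "root"
)
puts commands
```

After these steps the file system inside the logical volume would still have to be grown with the tool that matches it, which is left out of the sketch.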
To me, Virtual Infrastructure is everything concerned with managing your virtual servers. You need to start, stop and restart them. You need to migrate them to different physical hosts, and you might even want to scale up automatically (i.e. start new instances) under higher load. I came up with a set of Capistrano tasks for managing our Virtual Infrastructure. One task extends the storage of a virtual machine by adding a new disk image to the LVM volume. Another migrates a virtual machine from one physical host to another by stopping it, copying over all disk images and starting it again. Lacking real network storage, our setup is eventually destined to suffer from I/O issues. This makes me long for a solution that would help me manage the Virtual Infrastructure, e.g. by providing shared storage, a VLAN for networking and some way of starting, stopping and restarting our virtual servers.
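The stop, copy, start migration described above can be sketched in plain Ruby. This is only an illustration of the sequence, not the Capistrano task itself: the helper name, host names and paths are hypothetical, it assumes the Xen `xm` toolstack and `scp`, and it assumes the VM's config file already exists on the target host.

```ruby
# Hypothetical helper: builds the ordered steps for migrating a Xen VM
# from one physical host to another by stopping it, copying all disk
# images and starting it again. Each step names the host it would run on.
# Host names, paths and the config directory are illustrative assumptions.
def migration_plan(vm:, source:, target:, images:, config_dir: "/etc/xen")
  copy_steps = images.map do |image|
    { host: source, command: "scp #{image} #{target}:#{image}" }
  end

  [
    # Shut the VM down on the source host and wait until it has stopped
    { host: source, command: "xm shutdown -w #{vm}" },
    # Copy every disk image over to the target host
    *copy_steps,
    # Start the VM on the target host from its (pre-existing) config file
    { host: target, command: "xm create #{config_dir}/#{vm}.cfg" }
  ]
end

plan = migration_plan(
  vm: "app1",
  source: "host-a",
  target: "host-b",
  images: ["/srv/xen/app1-disk1.img", "/srv/xen/app1-disk2.img"]
)
plan.each { |step| puts "#{step[:host]}: #{step[:command]}" }
```

The sketch also shows why missing shared storage hurts: the VM is down for as long as the copy steps take, and that time grows with every disk image added.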
In our setup, every virtual server has a dedicated role: Web server, application server, memcached server, db server, etc. These dedicated virtual servers act as appliances for their specific tasks. Encapsulating every role like that enables us to e.g. switch from a single MySQL instance to a Master-Master Cluster without having to change anything on the other appliances. This greatly reduces the complexity and risk of upgrading any appliance. Our development team is also used to thinking about and working with appliances. They don’t care about the virtual or underlying physical infrastructure. And they care even less about the “plumbing” of the physical servers. All they have to deal with are web, app, mem and db appliances.
At the uppermost level in our production stack lives the application. This level includes basic tasks such as deploying new releases, making sure the required RubyGems are available and ensuring that cron jobs run properly. All these things only have to know about the configuration of the appliances: where to find the database, from where to where to synchronize images, etc. This is our core business: building a great application for our users! Everything else is necessary but not our main business focus.
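As a rough illustration of “only knowing the configuration of the appliances”, the following Ruby sketch derives a Rails-style database configuration from a map of appliance roles to addresses. The role names follow the web, app, mem and db appliances mentioned above; the addresses, the database name and the helper itself are invented for this example.

```ruby
require "yaml"

# Hypothetical helper: derives a Rails-style database configuration from a
# map of appliance roles to addresses. The application only needs to know
# where the "db" appliance lives, not what runs behind that address.
def database_config(appliances, environment: "production")
  {
    environment => {
      "adapter"  => "mysql",
      "host"     => appliances.fetch("db"),
      "database" => "app_#{environment}"
    }
  }
end

# Illustrative addresses, one per appliance role
appliances = {
  "web" => "10.0.0.10",
  "app" => "10.0.0.20",
  "mem" => "10.0.0.30",
  "db"  => "10.0.0.40"
}

puts database_config(appliances).to_yaml
```

If the db appliance is later replaced by something else behind the same address, nothing at the application level would have to change in this sketch.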
Mapping Cloud Computing Offerings
Getting a very clear picture of the different levels of abstraction enabled me to categorize the offerings for cloud computing I came across. My mapping looks like this:
What are your experiences with hosting your applications “in the cloud”? Leave a comment or send a tweet to @webops.