Saturday, May 25, 2013

CloudStack's Architecture

(Warning: This post is purely technical and has nothing to do with Japan other than being related to the project I'm going to do there. For all non-Computer Science readers out there, I promise I'll post more "normal" posts later when I arrive in Japan, so bear with me for the moment!)

Even before going to Japan, I have already started discussing how we will implement our project with our Japanese mentor, Ryousei Takano-san, via video conferencing. So in this post, I will share the relevant parts of CloudStack I've studied in case they are useful for someone else in the future.

In a nutshell, CloudStack is software for managing virtual machines across many physical hosts or even physical data centers! It can be separated into two parts: management and agent. Management is the part that directly interfaces with the users, allowing them to start, stop or manipulate virtual machines. On each machine that you want to use as a host for virtual machines, you must install the CloudStack agent, which is the layer between Management and the underlying hypervisor (virtualization software such as KVM, Xen or VMware). One management server can manage multiple agents. In the simplest setup, the management server and the agent can be on the same machine.

CloudStack's Web Interface

CloudStack's user interface is web-based and is written entirely in HTML and JavaScript. If you download CloudStack's source code, you can find the user interface code in the ui folder. All communication between the user interface and CloudStack is done via the API.

The CloudStack API is a REST-based API accessible via HTTP, and it returns responses in either XML or JSON. The API serves two purposes: it allows external applications to interact with CloudStack, and it powers the web interface as mentioned above. Because the web interface itself depends on it, you can be sure that the API is very complete and that whatever can be done with the web interface can also be done with the API.
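To make this concrete, here is a minimal Python sketch of how an external application might build a request to the API. It assumes the commonly documented /client/api endpoint path, the listVirtualMachines command, and the request-signing scheme as I understand it from the CloudStack API documentation (sort the parameters, lowercase the query string, sign it with HMAC-SHA1 using the secret key, then Base64-encode the result). The host, API key and secret key are placeholders, and build_signed_url is a name I made up for illustration.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def build_signed_url(endpoint, params, api_key, secret_key):
    """Build a signed API URL (illustrative helper, not part of CloudStack)."""
    query = dict(params, apiKey=api_key)
    # Sort parameters by lowercased name and URL-encode each value.
    pairs = sorted(query.items(), key=lambda kv: kv[0].lower())
    encoded = "&".join(f"{key}={quote(str(value), safe='')}" for key, value in pairs)
    # Assumed signing scheme: HMAC-SHA1 over the lowercased query string,
    # Base64-encoded and then URL-encoded.
    digest = hmac.new(secret_key.encode(), encoded.lower().encode(), hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="")
    return f"{endpoint}?{encoded}&signature={signature}"


if __name__ == "__main__":
    # Placeholder host and credentials; response=json asks for JSON instead of XML.
    print(build_signed_url(
        "http://localhost:8080/client/api",
        {"command": "listVirtualMachines", "response": "json"},
        api_key="my-api-key",
        secret_key="my-secret-key",
    ))
```

Changing the response parameter to xml should switch the output format, which is how the same API can serve both the web interface and external tools.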

Next, let us see how the management server, once it receives a request via the API, communicates with the agents. In addition to the API mentioned before (called the Platform API), CloudStack also contains another, internal API used for passing data between the management server and the agent. The internal API is JSON-based and uses GSON to convert between the Java objects (whose classes are defined in core/src/com/cloud/agent/api) and JSON.
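The idea of turning command objects into JSON and back can be sketched in a few lines. CloudStack does this in Java with GSON; the Python version below is only an analogy, and the StartVmCommand class, its fields, and the envelope that wraps the payload in the class name are all assumptions of mine rather than CloudStack's actual wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class StartVmCommand:
    """Hypothetical command object; the fields are invented for illustration."""
    vm_name: str
    cpus: int
    memory_mb: int


# Registry the receiving side uses to map a type name back to a class.
COMMAND_TYPES = {"StartVmCommand": StartVmCommand}


def to_wire(command):
    """Serialize a command to JSON, tagged with its class name (assumed envelope)."""
    return json.dumps({type(command).__name__: asdict(command)})


def from_wire(payload):
    """Rebuild the command object from its JSON form."""
    envelope = json.loads(payload)
    name, fields = next(iter(envelope.items()))
    return COMMAND_TYPES[name](**fields)


if __name__ == "__main__":
    message = to_wire(StartVmCommand(vm_name="demo-vm", cpus=2, memory_mb=1024))
    print(message)
    print(from_wire(message))
```

Tagging the payload with a type name lets the receiver pick the right class to rebuild, which is the part a serialization library has to solve one way or another.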

The last part of the command chain is how the agents communicate with the hypervisors. The code for this can be found in the plugins/hypervisors folder in the source code. Different techniques are used for different hypervisors. For KVM, libvirt's Java API, as well as shell scripts, are used to communicate with the hypervisor.
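As a rough illustration of this last hop, the sketch below shows an agent-side backend that maps each incoming command type to a hypervisor-specific action. It is not how CloudStack's KVM plugin is written (that uses libvirt's Java API and shell scripts); the class names are hypothetical, and I use virsh command lines, which are only planned and never executed, as a stand-in for the real calls.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StartVm:
    """Hypothetical command: boot the named VM."""
    vm_name: str


@dataclass
class StopVm:
    """Hypothetical command: shut down the named VM."""
    vm_name: str


class KvmBackend:
    """Hypothetical KVM backend that translates commands into virsh invocations."""

    def __init__(self):
        # One handler per command type; another hypervisor would get its own backend.
        self._handlers = {StartVm: self._start, StopVm: self._stop}

    def plan(self, command) -> List[str]:
        """Return the command line that would be run; nothing is executed here."""
        handler = self._handlers.get(type(command))
        if handler is None:
            raise ValueError(f"unsupported command: {type(command).__name__}")
        return handler(command)

    def _start(self, command):
        return ["virsh", "start", command.vm_name]

    def _stop(self, command):
        return ["virsh", "shutdown", command.vm_name]


if __name__ == "__main__":
    backend = KvmBackend()
    print(backend.plan(StartVm("demo-vm")))
    print(backend.plan(StopVm("demo-vm")))
```

Keeping the translation inside a per-hypervisor backend means the management side can send the same command regardless of which hypervisor the host runs.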

The flow of command from user to hypervisor

Finally, I'd like to briefly explain the System VMs. In order to simplify installation of CloudStack on the host operating system, many functions are delegated to virtual machines running pre-built images for specific purposes. The following System VMs are used by CloudStack.

  • Console Proxy: provides web-browser based access to the user VMs' screen, mouse and keyboard for initial installation or when SSH access is not working.
  • Secondary Storage: manages the VM templates and ISO images used for creation of new VMs.
  • Virtual Router: provides networking services to user VMs such as DHCP and DNS. 

Screen access provided by Console Proxy System VM

To recap, I've discussed the management/agent separation, the flow of commands from the user to the hypervisor, and the various System VMs used by CloudStack. Thank you to everyone who stayed with me until the end! This is one of the most technical posts, if not the most technical, on this blog, and future posts will be more interesting and suitable for readers of all ages and interests.
