Saturday, May 25, 2013

CloudStack's Architecture

(Warning: This post is purely technical and has nothing to do with Japan itself; it is about the project I'm going to do there. For all non-Computer Science readers out there, I promise I'll post more "normal" posts once I arrive in Japan, so bear with me for the moment!)

Even before going to Japan, I have already started discussing how we will implement our project with our Japanese mentor, Ryousei Takano-san, via video conferencing. So in this post, I will share the relevant parts of CloudStack I've studied, in case they are useful for someone else in the future.

In a nutshell, CloudStack is software for managing virtual machines across many physical hosts or even physical data centers! It can be separated into 2 parts: management and agent. Management is the part that directly interfaces with the users, allowing them to start, stop or otherwise manipulate virtual machines. On each machine that you want to use as a host for virtual machines, you must install the CloudStack agent, which is the layer between Management and the underlying hypervisor (virtualization software such as KVM, Xen or VMware). One management server can manage multiple agents. In the simplest setup, the management server and the agent can be on the same machine.

CloudStack's Web Interface

CloudStack's user interface is web-based and is written entirely in HTML and JavaScript. If you download CloudStack's source code, you can find the user interface code in the ui folder. All communication between the web interface and CloudStack is done via the API.

The CloudStack API is a REST-based API that is accessible via HTTP and returns responses in either XML or JSON. The API serves 2 purposes: first, it allows external applications to interact with CloudStack, and second, it powers the web interface as mentioned above. Because of this, you can be sure that the API is very complete: whatever can be done with the web interface can also be done with the API.
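To make the idea of "calling the API over HTTP" more concrete, here is a minimal sketch of how an external application might build a signed request URL. It follows the signing scheme as I understand it from CloudStack's developer documentation (sort the parameters, URL-encode the values, lowercase the string, then HMAC-SHA1 with the secret key). The endpoint URL and both keys are placeholders I made up, and the helper names sign_request and build_url are my own, not part of CloudStack.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote, urlencode


def sign_request(params, secret_key):
    """Return the base64 HMAC-SHA1 signature for a set of API parameters."""
    # Sort by parameter name, URL-encode each value, then lowercase the
    # whole string before signing.
    ordered = sorted(params.items(), key=lambda item: item[0].lower())
    query = "&".join(f"{name}={quote(str(value), safe='')}" for name, value in ordered)
    digest = hmac.new(
        secret_key.encode("utf-8"),
        query.lower().encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def build_url(endpoint, command, api_key, secret_key, **extra):
    """Build a full request URL that asks for a JSON response."""
    params = {"command": command, "response": "json", "apikey": api_key}
    params.update(extra)
    signature = sign_request(params, secret_key)
    # The signature itself must also be URL-encoded ('+', '/' and '=').
    return f"{endpoint}?{urlencode(params, quote_via=quote)}&signature={quote(signature, safe='')}"


if __name__ == "__main__":
    # Placeholder endpoint and keys: replace them with real values.
    print(build_url(
        "http://localhost:8080/client/api",
        "listVirtualMachines",
        "my-api-key",
        "my-secret-key",
    ))
```

The resulting URL can be opened with any HTTP client; the web interface goes through the same API, just from JavaScript.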

Next, let us see how the management server, once it receives a request via the API, communicates with the agents. In addition to the API mentioned before (called the Platform API), CloudStack also contains another, internal API used for passing data between the management server and the agents. The internal API is JSON-based and uses Gson to convert between the Java objects (stored in core/src/com/cloud/agent/api) and JSON.
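The object-to-JSON round trip described above can be illustrated with a small sketch. This is not CloudStack's actual wire format, and StartVmCommand with its fields is a hypothetical class I invented for the example; the real code is Java and uses Gson. The point is only to show the idea: the sender turns a command object into JSON tagged with its type, and the receiver uses that tag to rebuild the same object.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class StartVmCommand:
    """Hypothetical command; not a real CloudStack class."""
    vm_name: str
    cpus: int
    memory_mb: int


# The receiver needs to know which class each type tag maps to.
COMMAND_TYPES = {"StartVmCommand": StartVmCommand}


def to_wire(command):
    """Serialize a command as {"<ClassName>": {...fields...}}."""
    return json.dumps({type(command).__name__: asdict(command)})


def from_wire(payload):
    """Rebuild the command object from its JSON form."""
    decoded = json.loads(payload)
    if len(decoded) != 1:
        raise ValueError("expected exactly one command per message")
    (type_name, fields), = decoded.items()
    return COMMAND_TYPES[type_name](**fields)


if __name__ == "__main__":
    message = to_wire(StartVmCommand(vm_name="demo-vm", cpus=2, memory_mb=1024))
    print(message)
    print(from_wire(message))
```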

The last part of the command chain is how the agents communicate with the hypervisors. The code for this can be found in the plugins/hypervisors folder in the source code. Different techniques are used for different hypervisors. For KVM, libvirt's Java API, as well as shell scripts, are used to communicate with the hypervisor.
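The "different techniques for different hypervisors" idea boils down to picking a backend based on the hypervisor type. Below is a toy sketch of that dispatch. Everything in it is hypothetical: the backend functions only return a description of what they would do, and none of the names come from CloudStack's plugin code.

```python
def start_on_kvm(vm_name):
    """Hypothetical KVM backend: would go through libvirt and shell scripts."""
    return f"KVM: start {vm_name} via libvirt"


def start_on_xen(vm_name):
    """Hypothetical Xen backend: would use a Xen-specific mechanism."""
    return f"Xen: start {vm_name} via a Xen-specific call"


# One entry per supported hypervisor; adding a hypervisor means adding a backend.
BACKENDS = {
    "KVM": start_on_kvm,
    "Xen": start_on_xen,
}


def start_vm(hypervisor, vm_name):
    """Route a 'start VM' request to the backend for the given hypervisor."""
    backend = BACKENDS.get(hypervisor)
    if backend is None:
        raise ValueError(f"unsupported hypervisor: {hypervisor}")
    return backend(vm_name)


if __name__ == "__main__":
    print(start_vm("KVM", "demo-vm"))
```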

The flow of command from user to hypervisor

Finally, I'd like to briefly explain the System VMs. In order to simplify the installation of CloudStack on the host operating system, many functions are delegated to virtual machines running pre-built images made for those specific purposes. The following System VMs are used by CloudStack.

  • Console Proxy: provides web-browser based access to the user VMs' screen, mouse and keyboard for initial installation or when SSH access is not working.
  • Secondary Storage: manages the VM templates and ISO images used for creation of new VMs.
  • Virtual Router: provides networking services to user VMs such as DHCP and DNS. 

Screen access provided by Console Proxy System VM

To recap, I've discussed the management/agent separation, the flow of commands from the user to the hypervisor and the various System VMs used by CloudStack. Thank you to everyone who stayed with me until the end! This is one of the most technical posts, if not the most technical, on this blog, and future posts will be more interesting and suitable for readers of all ages and interests.

Thursday, May 23, 2013

Where I'll be going and what I'll be doing

Hi everyone!

This post gives a little background on my internship in Japan, covering my workplace and my role.

First, let's start off with the workplace. I'll be going to the Grid Infraware Research Group at the Information Technology Research Institute (ITRI) of the National Institute of Advanced Industrial Science and Technology (AIST). AIST is a public research institute largely funded by the Japanese Government and has many campuses located throughout Japan, with the headquarters being in Tsukuba and Tokyo. The ITRI is located on the Tsukuba campus.

Tsukuba (つくば市) is about 50 kilometers north-east of Tokyo. It is very easy to travel from Tokyo to Tsukuba thanks to the Tsukuba Express (TX), which only takes 50 - 55 minutes. From Narita Airport, there is also a bus service, which takes longer but can be more convenient if you have luggage with you.

Location of Tsukuba on Google Maps

Next, let's talk about what I'll be doing at AIST. The Grid Infraware Research Group, as the name suggests, deals with the lower layers of grid or cloud computing, also known as Infrastructure as a Service (IaaS). For example, the group researches the best way to assign virtual machines to users from a pool of computers, or how best to move virtual machines from one computer to another to reduce downtime. Ninja Migration is an example of recent research done by the ITRI.

(Warning: Skip the following paragraph if you're afraid of technical details.)

The project I've been assigned to is to add PCI Passthrough support to CloudStack, and the main goal is to be able to use InfiniBand on the virtual machines. I'm sure many of you are confused right now (like I was when I was first informed about the project), so I'll explain it step-by-step. CloudStack is open-source software for managing virtual machines. Users can log in to CloudStack and ask the system to provide them with a virtual machine they can use for whatever purpose they need it for. Now, the virtual machines usually connect to the outside world via a virtualized network card. However, these are (relatively) slow and not suited for High Performance Computing (HPC), where InfiniBand is very popular. To fully use InfiniBand, one needs to allow the virtual machines to directly access the InfiniBand card, which can be done using PCI Passthrough. PCI Passthrough is already supported by Linux and KVM, the underlying hypervisor, so only the configuration and provisioning part needs to be added to CloudStack.
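For the curious, here is a rough sketch of what the "configuration" part can look like on the KVM side. With libvirt, a PCI device is handed to a virtual machine by adding a hostdev element to the VM's definition. The helper below builds that element from a PCI address such as 0000:06:00.0 (the address is just an example, and the function name hostdev_xml is my own). This only illustrates the libvirt format; it is not how the CloudStack support itself will be implemented.

```python
import re
import xml.etree.ElementTree as ET

# PCI addresses look like "domain:bus:slot.function", e.g. "0000:06:00.0".
PCI_ADDRESS = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def hostdev_xml(pci_address):
    """Build a libvirt <hostdev> element for passing through one PCI device."""
    match = PCI_ADDRESS.match(pci_address)
    if match is None:
        raise ValueError(f"not a valid PCI address: {pci_address!r}")
    domain, bus, slot, function = match.groups()

    # managed='yes' asks libvirt to detach the device from the host driver
    # before the VM starts and to reattach it afterwards.
    hostdev = ET.Element(
        "hostdev", {"mode": "subsystem", "type": "pci", "managed": "yes"}
    )
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        {
            "domain": f"0x{domain}",
            "bus": f"0x{bus}",
            "slot": f"0x{slot}",
            "function": f"0x{function}",
        },
    )
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    print(hostdev_xml("0000:06:00.0"))
```

The generated element goes inside the devices section of the VM's libvirt definition.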

(End of technical part: everyone can come back now)

And that's it for this post. If there's anything you'd like to know, please feel free to leave a comment!


Wednesday, May 22, 2013

About this blog

Hi! My name is Pawit Pornkitprasan. As of 2013, I am a student at the Faculty of ICT, Mahidol University, Thailand. I am lucky enough to have been offered an internship at AIST in Japan during June - July, and I have started this blog to share my experiences with my readers, whether it is about the work I am doing there or life in Japan.

As of this post, I am still in Thailand, but I'll be posting about my internship and my host institution in the meantime. I will arrive in Japan on June 3rd, and that's when the real fun will begin!

Please feel free to comment if you have suggestions, notice an error or just want to say hi.
よろしくおねがいします! (Nice to meet you, and thank you in advance!)