My first encounter with Ubuntu Enterprise Cloud was while throwing together a quick dev server at work. I booted from an Ubuntu server ISO and saw the words “Enterprise” and “Cloud” together; I promptly dismissed it as some form of bloaty-buzzwordy-junk. It turns out the word “Ubuntu” trumps the “Enterprise Cloud” bit, and it’s quite awesome.
When I want to make someone’s eyes glaze over, I describe it as an Open Source Software Infrastructure-as-a-Service Solution, or OSSIaaSS. When I actually want to talk sensibly about it, I describe it as “Amazon Web Services for your own hardware”. At its core, it’s a software platform called Eucalyptus which gives you your own private IaaS setup. Crucially, it exposes this via the AWS API, so there’s a wealth of existing tools that work with it, and it makes later migration to real AWS easier.
(I should clarify: I’m doing AWS a huge disservice here. Really “all” I’m talking about is EC2 and S3, but they’re my favourite and most-used subsystems, and I’d guess they’re the two things people think of first when someone mentions AWS.)
Generally it seemed remarkably easy to set up. I did manage to make a few silly mistakes along the way which, if I’m honest, took an embarrassing amount of time to identify despite being relatively obvious:
Don't forget to add a Node Controller
Node controllers run your VM instances, so you’ll want at least one. Obvious, right? Well, I managed to forget (multitasking at the time) and spent a little too long wondering why I couldn’t start any VMs.
$ sudo euca_conf --list-nodes
registered nodes:
   10.250.59.29 llama i-2B980609 i-38940671 i-3B250746 i-3D2B07A3 i-44F408EA i-4A7B08DD i-555C09E0
   10.250.59.30 llama
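If you'd rather have a script catch the "no nodes" case than spot it by eye, something like the following might do. It's only a sketch: `count_registered_nodes` is a made-up helper name, and it assumes each node appears on its own line starting with an IPv4 address, which is how the listing looked on my setup.

```shell
#!/bin/sh
# Hypothetical helper (not part of Eucalyptus): counts node lines in
# `euca_conf --list-nodes` output read from stdin.
# Assumption: each node line starts with an IPv4 address.
count_registered_nodes() {
    # grep -c exits non-zero when the count is 0, so mask that with `|| true`.
    grep -cE '^[[:space:]]*[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+([[:space:]]|$)' || true
}
```

Used as `sudo euca_conf --list-nodes | count_registered_nodes`, a result of 0 would be the hint that no Node Controller has been registered yet.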
$ euca-describe-availability-zones verbose
AVAILABILITYZONE  llama         10.250.59.211
AVAILABILITYZONE  |- vm types   free / max   cpu   ram  disk
AVAILABILITYZONE  |- m1.small   0025 / 0032   1    192     2
AVAILABILITYZONE  |- c1.medium  0025 / 0032   1    256     5
AVAILABILITYZONE  |- m1.large   0012 / 0016   2    512    10
AVAILABILITYZONE  |- m1.xlarge  0012 / 0016   2   1024    20
AVAILABILITYZONE  |- c1.xlarge  0006 / 0008   4   2048    20
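The "free / max" columns are the quickest way to see whether the cloud thinks it has anywhere to run instances; with no Node Controllers I'd expect them all to read zero. Here is a rough sketch for pulling those numbers out with awk. `free_slots` is a hypothetical name, and the parsing assumes the column layout shown in my output, which may differ between Eucalyptus versions.

```shell
#!/bin/sh
# Hypothetical helper: reads `euca-describe-availability-zones verbose`
# output on stdin and prints "<vm type> <free> <max>" per VM type.
# Assumption: data rows look like
#   AVAILABILITYZONE |- m1.small 0025 / 0032 1 192 2
free_slots() {
    # Keep only rows whose 5th field is the "/" between free and max;
    # this skips the zone line and the column-header line.
    awk '$2 == "|-" && $5 == "/" { printf "%s %d %d\n", $3, $4, $6 }'
}
```

For example, `euca-describe-availability-zones verbose | free_slots` would give one line per VM type, with the leading zeros stripped so the numbers are easier to compare in a script.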
Don't forget to enable virtualisation in the BIOS
Once I’d added my Node Controllers, I went to start some instances, only to watch them sit in “pending” mode before terminating. Once I found my way to the Node Controller logs, I was presented with:
libvirt: internal error no supported architecture for os type 'hvm' (code=1)
At this point, kvm-ok (for Ubuntu Lucid, in the qemu-kvm package) is your friend. The machines I was using as Node Controllers (Dell R710s) all have a “Virtualization Technology” setting in the BIOS (under “Processor Settings”). On all of our machines (and I gather this is standard) this was set to Disabled. Rebooting and editing the BIOS to enable it was all that was needed:
$ kvm-ok
INFO: Your CPU supports KVM extensions
INFO: /dev/kvm exists
KVM acceleration can be used
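If kvm-ok isn't installed, a cruder check can be done by hand. The sketch below is my own approximation, not what kvm-ok actually runs: it looks for the `vmx` (Intel) or `svm` (AMD) CPU flag and then for the `/dev/kvm` device. The function name and the status strings are invented for illustration, and the two paths are parameters purely so the function can be tried against test files.

```shell
#!/bin/sh
# Hypothetical helper: rough stand-in for the kind of check kvm-ok makes.
# Usage: check_virt [CPUINFO_PATH] [KVM_DEVICE_PATH]
check_virt() {
    cpuinfo="${1:-/proc/cpuinfo}"
    kvm_dev="${2:-/dev/kvm}"
    if ! grep -Eqw 'vmx|svm' "$cpuinfo"; then
        # CPU does not advertise Intel VT-x (vmx) or AMD-V (svm).
        echo "no-cpu-support"
    elif [ ! -e "$kvm_dev" ]; then
        # Flag present but no device: possibly disabled in the BIOS,
        # or the kvm kernel module simply is not loaded.
        echo "disabled-or-module-missing"
    else
        echo "ok"
    fi
}
```

Running `check_virt` with no arguments on a Node Controller and getting `disabled-or-module-missing` would be a reasonable prompt to go and look at the BIOS setting, though it can't tell the two causes apart.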
As an aside, I fully support this “bug” entry about disassociating kvm-ok from kvm and putting a “You have virtualisation support but it is disabled” pseudo-warning into the motd - bring on the next Ubuntu LTS release!
Potentially important "footnote"
One bit of weirdness I did find immediately after installing (once I’d remembered to create a Node Controller) was that, despite the Node Controller existing and claiming to have detected the Cluster Controller at install time, the Cluster Controller couldn’t find it. Prodding euca_conf gave nothing in --list-nodes, and --discover-nodes found nothing. I was sort of able to cajole things manually with --register-nodes; at least, keys were copied to the Node Controller, but there was not a lot of success beyond that.
I then discovered this thread on the Eucalyptus forum from a user with an essentially identical issue (no NC discovery, manual NC registration appearing to work but not, et cetera), with a follow-up post saying that a solution reported in another (slightly longer-winded) thread had fixed things.
To repeat for posterity and archiving / Google reasons, the solution, with my own notes, was:
- Deregister all Node Controllers - euca_conf --deregister-nodes
- Deregister the Cluster - I used the WebUI for this; I don't believe that using the CLI is necessary
- Restart the Cluster Controller - I'm afraid I forget whether or not I actually did this
- Register the cloud again - At this point, using the CLI is important
- Discover the Node Controllers - euca_conf --discover-nodes
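For anyone who wants the steps above as commands, here is a dry-run sketch that only prints what I'd expect to type, rather than running anything. Treat it with suspicion: `print_reregister_plan` is a made-up helper, I'm reading "register the cloud again" as re-registering the cluster, and the `--deregister-cluster` / `--register-cluster` flags and the upstart restart line are my assumptions about this version of euca_conf and UEC. Check `euca_conf --help` before trusting any of it.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the recovery commands, runs nothing.
# Usage: print_reregister_plan CLUSTER_NAME CLUSTER_HOST NODE_IP...
# Assumed (not confirmed by the post): the cluster flags and the
# `restart eucalyptus-cc` upstart job name.
print_reregister_plan() {
    if [ "$#" -lt 2 ]; then
        echo "usage: print_reregister_plan CLUSTER_NAME CLUSTER_HOST NODE_IP..." >&2
        return 1
    fi
    cluster_name="$1"
    cluster_host="$2"
    shift 2
    nodes="$*"   # remaining arguments: Node Controller addresses
    echo "sudo euca_conf --deregister-nodes \"$nodes\""
    echo "sudo euca_conf --deregister-cluster $cluster_name"
    echo "sudo restart eucalyptus-cc CLEAN=1"
    echo "sudo euca_conf --register-cluster $cluster_name $cluster_host"
    echo "sudo euca_conf --discover-nodes"
}
```

With the names from my own setup this would be called as `print_reregister_plan llama 10.250.59.211 10.250.59.29 10.250.59.30`. Printing rather than executing is deliberate, since deregistering a cluster is not something to do by accident.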
For whatever reason, the Cluster created at install time, and subsequent ones via the WebUI / GUI, had some sort of issue. I haven’t yet been able to diagnose much, nor find a canonical bug report, but this seems a potentially rather significant issue that may hamper people!
The above issue aside, my current experiences with it have been great. Now to get boto talking to it!