(Note, if you haven’t read it already, I recommend my previous article on Django and Static Files to get an understanding of the fundamentals)
Pretty much every Django project I deploy uses Amazon’s Simple Storage Service (S3) for hosting my static files. If you aren’t particularly familiar with it, the salient point is that it’s cheap, pay-as-you-go storage for arbitrary files, all served directly over HTTP, which makes it a natural fit for static assets.
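For context, pointing Django at it is mostly a settings exercise. A minimal sketch using the django-storages S3 backend follows; the bucket name and keys are placeholders, and the exact backend path may vary with your django-storages version:

# settings.py (sketch; assumes django-storages and boto are installed)
INSTALLED_APPS += ('storages',)

# Placeholder credentials and bucket name; substitute your own.
AWS_ACCESS_KEY_ID = 'YOUR-ACCESS-KEY'
AWS_SECRET_ACCESS_KEY = 'YOUR-SECRET-KEY'
AWS_STORAGE_BUCKET_NAME = 'my-static-bucket'

# collectstatic now uploads to the bucket, and templates link to it.
STATICFILES_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
STATIC_URL = 'https://%s.s3.amazonaws.com/' % AWS_STORAGE_BUCKET_NAME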
I’ve been using Heroku a lot recently for deploying code. For those who’ve missed it, it’s a rather nice Platform as a Service (PaaS) offering; think Google App Engine, but actually usable: no long list of forbidden or broken frameworks and libraries, you can use a relational database, and other such niceties.
I’ve had a number of people ask in conversation why I use (or indeed trust) Heroku and don’t just deploy on top of AWS. Then I came across this question asking “Why do people use Heroku when AWS is present?” on Stack Overflow, which I thought did a pretty good job of explicitly covering some of the aspects of the choice.
Maybe I should have flagged the question as Not Constructive (it’s now closed because others did), but instead I found myself using it as a bit of a dumping ground for my thoughts on the matter; if you’re interested, check it out. I hope it’s informative.
It wasn’t very long before I ran into an unfortunate issue. Every time I tried to add or delete a DNS entry, I’d be presented with a shiny happy “Success!” message and sent back to the domain list. I’d then see no evidence of my change.
Sometimes, it emerges, this is just because they lie: “Success” doesn’t mean “Success”, it means “I’m sorry, but we won’t let you point a CNAME at a CNAME” (not a wholly unreasonable position to take, but not one the standards require, and one seemingly now viewed as outdated). And sometimes they merely sort-of-lie: “Success” means “Success, but our UI won’t update until you’ve logged out and back in again”, so the UI displays stale data; less of a lie, but about as irritating, especially since it’s indistinguishable from the lying sort of “Success”.
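The moral is not to trust the control panel, but to ask a nameserver. A quick check with dig (a hypothetical domain standing in for mine) shows what the record actually looks like, whatever the UI claims:

$ dig +short www.example.com CNAME
some-target.example.net.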
This is perhaps slightly unfair, but roughly: after following the Example Usage instructions from the Eucalyptus site for a good little while, then re-following, tweaking, and checking, I had no joy.
I then googled a bit harder and found HybridFox, which worked first time.
Perhaps I could have gotten ElasticFox working. HybridFox meant I didn’t have to try.
My first encounter with Ubuntu Enterprise Cloud was while throwing together a quick dev server at work. I booted from an Ubuntu server ISO and saw the words “Enterprise” and “Cloud” together; I promptly dismissed it as some form of bloaty-buzzwordy-junk. It turns out the word “Ubuntu” trumps the “Enterprise Cloud” bit, and it’s quite awesome.
When I want to make someone’s eyes glaze over, I describe it as an Open Source Software Infrastructure-as-a-Service Solution, or OSSIaaSS. When I actually want to talk sensibly about it, I describe it as “Amazon Web Services for your own hardware”. At its core, it’s a software platform called Eucalyptus which gives you your own private IaaS setup. Crucially, it exposes this via the AWS API, so there’s a wealth of existing tooling that works with it, and it makes later migration to real AWS easier.
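To make that concrete: after downloading credentials from the Eucalyptus WebUI and sourcing the eucarc they include (which, among other things, points EC2_URL at your own cloud rather than Amazon’s), the standard euca2ools commands talk to your own hardware. The path below is simply where I unpacked mine, so adjust to taste:

$ . ~/.euca/eucarc
$ euca-describe-availability-zones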
(I should clarify: I’m doing AWS a huge disservice here; really “all” I’m talking about is EC2 and S3, but they’re my favourite and most-used subsystems, and I’d guess they’re the two things people think of first when someone mentions AWS.)
Generally it seemed remarkably easy to set up. I did manage to make a few silly mistakes along the way which, if I’m honest, took an embarrassing amount of time to identify despite being relatively obvious:
Don’t forget to add a Node Controller
Node Controllers run your VM instances, so you’ll want at least one. Obvious, right? Well, I managed to forget (I was multitasking at the time) and spent a little too long wondering why I couldn’t start any VMs.
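For reference, with the eucalyptus-nc package installed and running on the machine in question, something like this on the Cluster Controller should find and register it (this matches my memory of the UEC docs of the time, so treat the exact flags as an assumption):

$ sudo euca_conf --discover-nodes

After which the node should show up when you list nodes: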
$ sudo euca_conf --list-nodes
registered nodes:
    10.250.59.29    llama    i-2B980609 i-38940671 i-3B250746 i-3D2B07A3 i-44F408EA i-4A7B08DD i-555C09E0
    10.250.59.30    llama
$ euca-describe-availability-zones verbose
AVAILABILITYZONE    llama    10.250.59.211
AVAILABILITYZONE    |- vm types    free / max   cpu   ram  disk
AVAILABILITYZONE    |- m1.small    0025 / 0032    1    192     2
AVAILABILITYZONE    |- c1.medium   0025 / 0032    1    256     5
AVAILABILITYZONE    |- m1.large    0012 / 0016    2    512    10
AVAILABILITYZONE    |- m1.xlarge   0012 / 0016    2   1024    20
AVAILABILITYZONE    |- c1.xlarge   0006 / 0008    4   2048    20
Don’t forget to enable virtualisation in the BIOS
Once I’d added my Node Controllers, I went to start some instances, only to watch them sit in “pending” before terminating. When I eventually found my way to the Node Controller logs, I was presented with:
libvirt: internal error no supported architecture for os type 'hvm' (code=1)
At this point, kvm-ok (for Ubuntu Lucid, in the qemu-kvm package) is your friend. The machines I was using as Node Controllers (Dell R710s) all have a “Virtualization Technology” setting in the BIOS (under “Processor Settings”). On all of our machines (and I gather this is standard) this was set to Disabled. Rebooting and editing the BIOS to enable it was all that was needed:
$ kvm-ok
INFO: Your CPU supports KVM extensions
INFO: /dev/kvm exists
KVM acceleration can be used
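(As a cruder alternative on machines without kvm-ok, counting the virtualisation flags in /proc/cpuinfo tells you whether the CPU has the extensions at all, although, unlike kvm-ok, it won’t tell you whether the BIOS has disabled them:)

$ egrep -c '(vmx|svm)' /proc/cpuinfo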
As an aside, I fully support this “bug” entry about disassociating kvm-ok from kvm and putting a “You have virtualisation support but it is disabled” pseudo-warning into the motd – bring on the next Ubuntu LTS release!
Potentially important “footnote”
One bit of weirdness I did find immediately after installing (once I’d remembered to create a Node Controller): despite the Node Controller existing, and claiming to have detected the Cluster Controller at install time, the Cluster Controller couldn’t find it. Prodding euca_conf gave nothing in --list-nodes, and --discover-nodes found nothing. I was sort of able to cajole things manually with --register-nodes; at least, keys were copied to the Node Controller, but there was not a lot of success beyond that.
I then discovered this thread on the Eucalyptus forum from a user having an essentially identical issue (no NC discovery, manual NC registration appearing to work but not actually working, et cetera), with a follow-up post reporting that a solution from another (slightly longer-winded) thread had fixed things.
To repeat for posterity and archiving / Google reasons, the solution, with my own notes, was as follows (a command-level sketch of the same steps appears after the list):
- Deregister all Node Controllers (euca_conf --deregister-nodes)
- Deregister the Cluster (I used the WebUI for this; I don’t believe the CLI is necessary here)
- Restart the Cluster Controller (I’m afraid I forget whether or not I did this)
- Register the Cluster again (at this point, using the CLI is important)
- Discover the Node Controllers (euca_conf --discover-nodes)
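In concrete terms, using the names and addresses from the outputs above, the recipe would look roughly like the following; the exact restart incantation and the Cluster name / address are assumptions from my own setup, so substitute your own:

$ sudo euca_conf --deregister-nodes "10.250.59.29 10.250.59.30"
(now deregister the Cluster via the WebUI)
$ sudo service eucalyptus-cc restart
$ sudo euca_conf --register-cluster llama 10.250.59.211
$ sudo euca_conf --discover-nodes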
For whatever reason, the Cluster created at install time (and subsequent ones created via the WebUI / GUI) had some sort of issue. I haven’t yet been able to diagnose much, nor find a canonical bug report, but this seems a potentially significant issue that may well trip other people up!
The above issue aside, my current experiences with it have been great. Now to get boto talking to it!
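(For anyone wanting to do the same, my understanding is that boto mostly just needs pointing at the right endpoint. A minimal sketch, assuming Eucalyptus’s default port of 8773 and using my Cluster Controller’s address from above; the keys come from the eucarc you download from the WebUI:)

# Sketch: connecting boto to a Eucalyptus (UEC) cloud rather than real EC2.
from boto.ec2.connection import EC2Connection
from boto.ec2.regioninfo import RegionInfo

# Endpoint, port, and path are my assumptions for a default UEC install.
region = RegionInfo(name='eucalyptus', endpoint='10.250.59.211')
conn = EC2Connection(
    aws_access_key_id='YOUR-EC2-ACCESS-KEY',      # from eucarc
    aws_secret_access_key='YOUR-EC2-SECRET-KEY',  # from eucarc
    is_secure=False,            # UEC speaks plain HTTP by default
    region=region,
    port=8773,
    path='/services/Eucalyptus',
)

# If it's talking, this mirrors euca-describe-availability-zones.
print conn.get_all_zones()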