Kristian Glass - Do I Smell Burning?

Django Settings Environments

So you’ve started your Django project. You’ve done some work on it, and all’s going well. You’re about to make your first deployment somewhere, and you realise: “Hey, I’m going to need different settings for each environment - what’s the best way to do that?”. It turns out people ask this a fair bit. I’ve read through a bunch of the answers, and none quite does it for me. So here’s what I use - if it works for you, I’d love to hear it; if you think you can make it better, I’d love to know.

UPDATE

I now use the method described by The Twelve-Factor App, via django12factor. Stop reading this article and go read more about django12factor.

Splitting up the settings

First, some context. I want the side-effects of my changes to be minimal. I want to be able to easily tell which environment I’m in. I want switching environments to be painless. I want my changes to add no friction for anyone else coming to my code.

To that end, let’s replace the settings module with a package:

$ ls -1 myproject/settings/
__init__.py
base.py
live.py
local.py
staging.py

We’ll migrate the contents of our original settings.py into settings/base.py. Any configuration overrides for the live environment go in settings/live.py (with the equivalent for the staging environment in settings/staging.py), and any machine-specific local overrides go in settings/local.py (which we’ll also want to add to .gitignore or similar).
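As a rough illustration, a hypothetical settings/local.py (the values here are illustrative assumptions, not prescriptive) might contain nothing more than personal development conveniences:

DEBUG = True
TEMPLATE_DEBUG = DEBUG

# A throwaway local database, overriding whatever base.py declares
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': 'local.db',
    }
}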

Now, the bit where the magic happens - settings/__init__.py:

import os
from base import *

ENVIRONMENT = os.getenv("DJANGO_ENVIRONMENT")

if ENVIRONMENT == "live":
    from live import *
elif ENVIRONMENT == "staging":
    from staging import *

try:
    from local import *
except ImportError:
    pass

Hopefully this is fairly self-explanatory, but in summary:

  1. Load in our core settings from base.py
  2. Grab the environment variable DJANGO_ENVIRONMENT and store it in settings.ENVIRONMENT
  3. Load any environment-specific settings overrides
  4. If we have local overrides, pull those in

With just that, you’re ready to go. No further modifications necessary.
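For example, switching environment for a local run is just a matter of setting the variable (assuming the standard manage.py layout):

$ DJANGO_ENVIRONMENT=staging python manage.py runserver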

Why do this?

So why is this better than, say, DJANGO_SETTINGS_MODULE or some random imports?

  • Environment settings are clean. Want to add another environment? Just create a file of that name, and start putting settings in
  • Easy to change environments for local development
  • I use Heroku (their free tier is awesome for small initial deployments) - unless I want to start hacking buildpacks, this seems the cleanest way

The final issue is how environment-specific settings should be populated. If you haven’t read The Twelve-Factor App, go and do so. That said, I disagree slightly with some of its points about configuration - specifically, unsurprisingly, the dislike of environments. Even so, I feel that environment files should be kept as minimal as possible, so as an example, here’s my settings/live.py:

(UPDATE: Nope, they had a point about scaling, so now I use django12factor)

AWS_STORAGE_BUCKET_NAME = PROJECT_NAME
STATICFILES_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
S3_URL = 'http://%s.s3.amazonaws.com/' % AWS_STORAGE_BUCKET_NAME
STATIC_URL = S3_URL
ADMIN_MEDIA_PREFIX = S3_URL + 'admin/'

This is nothing more than staticfiles configuration for storages using Amazon S3 and boto. You’ll note that I set nothing else in here - no DATABASES, or DEBUG, or AWS_ACCESS_KEY_ID et cetera - all of those come from environment variables. Static file handling is the main thing that, for me, is a function of environment.
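As for where those environment variables come from: on Heroku, for instance, they’re set once per app - a sketch with placeholder values (config:add being the subcommand of the era):

$ heroku config:add DJANGO_ENVIRONMENT=live AWS_ACCESS_KEY_ID=... AWS_SECRET_ACCESS_KEY=...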

Warning to Heroku users

Any Heroku users reading the above and planning on implementing it, be warned: the current Heroku Python buildpack has the neat feature of autogenerating a Procfile for you if it detects that you’re pushing a Django app. However, this detection is done by looking for a settings.py file, so if you implement what I describe above, you’ll need to create your own Procfile. This is no arduous task - you shouldn’t be using the autogenerated one anyway - but it threw me a bit when I encountered it!
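For reference, a minimal Procfile for this sort of app can be a single line - a sketch, where gunicorn and the project name myproject are my assumptions rather than requirements:

web: gunicorn myproject.wsgi:application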

Goodbye Easyspace...

I recently acquired control of a number of domains managed with Easyspace. While the user interface isn’t as clean as my current service of choice (Retrosnub), it seemed good enough.

It wasn’t very long before I ran into an unfortunate issue. Every time I tried to add or delete a DNS entry, I’d be presented with a shiny happy “Success!” message and sent back to the domain list. I’d then see no evidence of my change.

Sometimes, it emerges, this is just because they lie: “Success” doesn’t mean “Success”, it means “I’m sorry, but we won’t let you point a CNAME to a CNAME” (not a wholly unreasonable position to take, but not one required of them, and seemingly viewed as outdated). Sometimes, it just means they sort-of-lie: “Success” means “Success, but our UI won’t update until you’ve logged out and back in again”, so the UI will display stale data - less of a lie, but about as irritating, especially as it’s indistinguishable from the lying “Success”.

Unsurprisingly, I got quite frustrated by this, and tweeted my feelings. I swiftly received a reply from the Easyspace Twitter account promising investigation and asking me to “reply to the DM with account credentials”. This set off some alarm bells in my head, but I assumed they just meant my username. Some faff then ensued, as it seemed they weren’t quite clear on the workings of Twitter with respect to Direct Messages (specifically, you can’t send one to someone who isn’t following you, and neither of us was following the other), until, to my consternation, I finally received their DM:

No, I am not emailing my password to socialmedia@yourdomain. I don’t really care if you are a legitimate representative of Easyspace, I’m not sending you my password, and for you to ask me to is pretty awful.

I tweeted my shock at this, and the ensuing micro-conversation was even more baffling:

Apparently they need my password “for DPA reasons”. Insane. By all means, “please go away and file a support request”, but asking for my password? I feel Tony sums it up perfectly.

Well, in December 2011, Amazon released a GUI for Route 53, their highly-available and scalable DNS service. It looks like an excellent candidate to migrate to - after the above, I have no intention of dealing with Easyspace any further.

Where HAM I?

The Maidenhead locator system is an encoding of latitude and longitude commonly used in amateur radio - a hobby for which I recently passed my Foundation and Intermediate licenses, thanks to the teaching and kind assistance of Martin Atherton (G3ZAY) of the Cambridge University Wireless Society. The encoding itself is trivial, but still tedious to do by hand, so I built Where HAM I?, a simple web app using geolocation, Google Maps, and geocoding to perform the necessary calculations and determine a user’s locator.

First, some background on coordinate systems. Latitude is the North / South positional coordinate, ranging from +90° at the North Pole to -90° at the South Pole, with 0° at the Equator. Longitude is the East / West positional coordinate, ranging over ±180°, centred on the Greenwich Meridian. The Maidenhead system offsets these to avoid negative numbers, then divides the globe into ‘fields’ of 10° latitude by 20° longitude. It is then just a matter of some simple arithmetic to construct the character pairs - the entire function is included below:

function maidenhead(_latitude, _longitude)
{
    var longitude = _longitude;
    longitude += 180;
    longitude /= 2;

    var latitude = _latitude;
    latitude += 90;

    function char_shift(start, offset)
    {
        return String.fromCharCode(start.charCodeAt() + offset);
    }

    var long1 = char_shift('A', Math.floor(longitude / 10));
    var lat1 = char_shift('A', Math.floor(latitude / 10));

    var long2 = char_shift('0', Math.floor(longitude % 10));
    var lat2 = char_shift('0', Math.floor(latitude % 10));

    var long3 = char_shift('a', Math.floor((longitude % 1) * 24));
    var lat3 = char_shift('a', Math.floor((latitude % 1) * 24));

    return {
        maidenhead: long1 + lat1 + long2 + lat2 + long3 + lat3,
        maidenhead_coordinates: {latitude: latitude, longitude: longitude},
        coordinates: {latitude: _latitude, longitude: _longitude},
    };
}
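As a quick sanity check (a worked example of my own, not from the original page), rough coordinates for Cambridge produce the expected locator:

var result = maidenhead(52.2053, 0.1218);
console.log(result.maidenhead);  // "JO02be" - Cambridge sits in JO02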

With the encoder function written, it was time to integrate HTML geolocation. It turns out this is beautifully simple:

if (navigator.geolocation) {
    navigator.geolocation.getCurrentPosition(successCallback, failureCallback);
} else {
    alert("Your browser does not support Geolocation");
}

In the above, successCallback is a function taking a Position object, and failureCallback a function taking a PositionError object. That’s it; the browser will then present a permission request to the user, and if all is well, we get a Position object containing latitude and longitude.

Now, the geolocation isn’t perfect. I haven’t looked into implementations, but I assume it’s very much like the iOS CoreLocation API: a nice interface behind which lies an implementation of “use every method we can and return the best data we get”, from GPS to mapping nearby wireless APs to the classic fallback of GeoIP. During development, my laptop (on the WLAN) located the house, but the desktop (on the wired network) gave a location of the town centre. Making this clear to the user is vital, hence the natural next step seemed to be using Google Maps to display the given location - at least if it’s not correct, it should be obvious.

The last time I looked at the Google Maps API was about four years ago - I remembered little more than “write some JavaScript” - but it turned out to be very simple:

var latlng = new google.maps.LatLng(latitude, longitude);

var myOptions = {
    zoom: MAP_ZOOM,
    center: latlng,
    mapTypeId: google.maps.MapTypeId.ROADMAP,
};
map = new google.maps.Map(document.getElementById("map_canvas"), myOptions);

var marker = new google.maps.Marker({
    map: map,
    position: latlng,
    title: location.maidenhead,
});

Now it’s vital to show the user the location data provided in the event of it being inaccurate, but better would be to let the user manually provide more accurate data. Fortunately, Google provide the very nice Places Autocomplete service, which just attaches to a text input tag, and provides suggested locations in a drop-down. It’s also beautifully simple to use (in the code shown, show_location is my function to update the UI):

var geocoder_search_input = document.getElementById("geocoder_search_input");
var options = {types: ["geocode"]};

autocomplete = new google.maps.places.Autocomplete(geocoder_search_input, options);

google.maps.event.addListener(autocomplete, 'place_changed', function() {
    var place = autocomplete.getPlace();
    show_location(place.geometry.location.lat(), place.geometry.location.lng());
});

With that, I’ve covered Where HAM I?’s functionality and core components. Of course, nothing goes perfectly - two problems in particular had me somewhat flummoxed. First, I couldn’t seem to get Places Autocomplete working - all I got was the somewhat unhelpful message:

Uncaught TypeError: Cannot read property 'Autocomplete' of undefined

Apparently this stemmed from the line

autocomplete = new google.maps.places.Autocomplete(geocoder_search_input, options);

Now why would google.maps.places be undefined? Well the Places service is a library that must be explicitly loaded via the libraries URL parameter when sourcing the Maps API. Spot the bug:

src="//maps.googleapis.com/maps/api/js?key=elided&sensor=false&v=3.7&libraries=places'"

If you noticed that extra ‘ before the “, well done. It took me a painfully long time.

The second major issue was a much weirder and more concerning bug - occasionally and inconsistently, the Google map would appear with some bizarre internal offset, making it impractical to use. A few searches led me to an answered StackOverflow question from someone who’d experienced the same issue - it turns out that initialising the map while its container div is hidden makes for general unhappiness. A quick tweak later, and success: a consistently working map.
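For anyone hitting the same thing, the commonly suggested remedy (a sketch - my original tweak isn’t preserved here) is to prod the map once its container is actually visible:

// Once the container div is shown, make the map re-measure itself,
// then re-centre, since the centre was computed against the hidden size
google.maps.event.trigger(map, 'resize');
map.setCenter(latlng);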

Hosting-wise, it all lives nicely in three static files, one HTML, one CSS, one JavaScript (plus externally-hosted Twitter Bootstrap and jQuery, the CSS and JS frameworks I find myself using for absolutely everything nowadays) so deployment is a cinch. Remarkably simple and pain-free.

So, I hope this rough sketch of Where HAM I? proves a useful insight for some. As projects go, I feel it turned out rather nicely - a kind bunch over at Reddit provided some very useful feedback, and seemed to generally find it useful, and I hope others do too. All feedback welcome, and the source is available on GitHub for the interested.

Yahoo! Signup

I found myself wanting to create a flickr account today. This involved signing up for a Yahoo (sorry, Yahoo!) account. While not the most painful signup procedure I’ve encountered, I have a cold and it’s making me grumpy, and for a Big Company like Yahoo, I expect better, so they can be the target of today’s mini rant (I say that like they’re limited to once per day).

So I need to create my “Yahoo! ID and Email”, which seems to be username@ one of yahoo.com, ymail.com or rocketmail.com.

I do not want an email address thank you very much. I have plenty. I certainly don’t want one at some random domain about which I know nothing, but is presumably some historical acquisition that they are exposing for legacy reasons, the details of which I could care more about. I was about to delete that last sentence as it was pure unfounded conjecture, but it turns out it’s basically correct. Oh well, I’ll get over this.

I need to give them a postal code. Gosh. Are they going to send me nice shiny things? Are wise men going to turn up at my door bearing gifts? What’s that? No? Then go away.

Alternate email, ok, good good. Apparently it can’t be yahoo@mydomain.example.com though. Nor can it be yahoosucks@ or yahoosignupsucks@ - I thus guess anything containing “yahoo”. This is somewhat tedious, given that that’s my usual addressing strategy.

Now, apparently, I need two secret questions. Not just one; and they also have to be unique (not unreasonable given the premise, to be fair). They do at least allow me to provide my own secret question - not that that’s especially relevant, as I treat the answer as a secondary password that will, alas, inevitably be stored as plaintext. Still, as others have pointed out, the concept of the secret question is somewhat poor, and requiring me to generate a pseudopassword twice just adds an extra level of irritation.

I won’t get into the subject of why “First name” and “Last name” fields are bad (if you’re inclined to dismiss Wookey as an oddity, then consider the rest of the world). I’ll skip over the issue of binary gender, or even requiring gender information. Don’t get me wrong, I think those areas should be addressed too, but (alas) most other places don’t do well in those regards either, and that’s not the main target of this rant, er, highly reasoned and informed discourse (ahem).

Now, to be fair, this isn’t the worst form I’ve encountered. No completely idiotic password requirements, after all. Seriously though, isn’t this a fairly solved problem? Go copy Google or something?

(Addendum: The post-signup process was mildly tedious too, but alas I clicked through these swiftly in frustration before realising I should probably take screenshots / notes for writeup, so either go endure the process yourself or take my word for it being annoying)

(Second Addendum: To follow this up, as I headed further through the flickr account creation process, I got to “Personalize your profile”, which, as a testament to the flickr / Yahoo! integration, asked for First Name, Last Name, Timezone, Gender, “Singleness” (Relationship status) and “Describe Yourself”. “Singleness” and Gender both had “Rather not say”, with Gender also having “Other”; all of this information was optional, but, “Singleness” aside, seemed sensible and reasonable to ask me to provide.

So, integration issues aside (though to be fair, as this was optional information, there’s a reasonable argument to be made for requiring it for account creation, but starting with a blank public profile) the flickr stage seemed substantially better, though profile creation is perhaps the apple to account creation’s orange.)

Augeas

Let me tell you a story about a call that changed my destiny, er, a tool that I find really useful.

To quote the website, “Augeas is a configuration editing tool. It parses configuration files in their native formats and transforms them into a tree. Configuration changes are made by manipulating this tree and saving it back into native config files”.

So you’re scripting some machine config and want to ensure some bits of config? Replace things like this:

echo "AllowTcpForwarding yes" >> /etc/ssh/sshd_config

or this:

sed -i 's/^AllowTcpForwarding.*no/AllowTcpForwarding yes/' /etc/ssh/sshd_config

with the much nicer:

augtool set /files/etc/ssh/sshd_config/AllowTcpForwarding yes

The first two examples both make some significant and potentially-invalid assumptions about the contents of sshd_config - yes you can write them more intelligently, grepping first etc., but why reinvent the wheel?

The quick tour documentation is rather nice, and also illustrates the matching ability for finding the “paths”. I was interested to note that the schemas (“lenses”) are written in “a very small subset of ML” - a language I haven’t really touched since I was an undergrad, but that appears to have come back to me with remarkable ease! Writing a lens looks mildly nontrivial, especially compared to Augeas’s general ease of use, but I’ve had no cause to actually do so yet, so this is based on limited data.

Bindings exist for Python, Ruby, Java, PHP and more, so there’s no need to hack around with os.system or similar!
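As a taste of the Python bindings (a sketch using the python-augeas package, with the same sshd_config path as above):

import augeas

aug = augeas.Augeas()

# The same edit as the augtool example, done programmatically
aug.set('/files/etc/ssh/sshd_config/AllowTcpForwarding', 'yes')
aug.save()

# The matching ability mentioned above: list everything under sshd_config
for path in aug.match('/files/etc/ssh/sshd_config/*'):
    print path, aug.get(path)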

In summary, a most excellent tool for any kind of config file editing - it’ll save you a lot of pain and is far more readable and maintainable. Enjoy.

(Inspired by some of the responses to a comment of mine on Reddit suggesting that this great tool could do with being a bit better known!)

Four Books I Think Everyone Should Read

A slight divergence from the usual topics, but this has kept coming up recently in conversation.

I read a reasonable amount. Not as much as I’d like to, but still. A fair few of the books I’ve read are ones I’d recommend to a lot of people. These four, however, are the ones I’d recommend to pretty much everyone. They also have the benefit of generally being fairly short - no excuses! I suspect that anyone reading this will have read most, if not all, of them, but I feel the list is worth compiling nonetheless.

In short, I can’t recommend these books enough. If you haven’t read them, do so. If you don’t own them, buy them. If you can’t buy them, rent them. I truly believe they are absolutely excellent.

In no particular order:

Dale Carnegie - How to Win Friends and Influence People

I very nearly never read this, due to the title - it sounded so very “magic fix-your-life self-help”; I suspect this is a slight artefact of its age (first published 1936), though that does little to detract from the quality of the content. Granted, at times I got the occasional feeling that “For Your Own Gain” could quite readily be appended to the title, but that can’t have happened more than once or twice; the primary tone is much more one of facilitating a better and more productive environment for all.

To give the titles of some sections as an example, “Be a Leader: How to Change People Without Giving Offense or Arousing Resentment” and “Twelve Ways to Win People to Your Way of Thinking” - in summary, some excellent advice on the subject of people handling. The anecdotes definitely give the age of the book away somewhat, but purely in a very charming manner, while still remaining relevant.

We’ve all been guilty of mishandling people for whatever reason; this is a most excellent guide to not doing so again.

Francis Cornford - Microcosmographia Academica

The ‘tagline’ (better word suggestions sought!) for this book - “Being a guide for the young academic politician” - sums it up, but doesn’t do it justice.

It claims to be about academic politics. It claims to be for those aged 25 to 30. Do not discount it if any of these does not apply. For anyone dealing with any kind of institutionalised decision-making, this will help you.

Picking suitable quotations is hard, because there’s so much to choose from. In describing a class of person:

The Non-placet differs in not being open to conviction; he is a man of principle. A principle is a rule of inaction, which states a valid general reason for not doing in any particular case what, to unprincipled instinct, would appear to be right. The Non-placet believes that it is always well to be on the Safe Side [...] The Young Man in a Hurry is a narrow-minded and ridiculously youthful prig, who is inexperienced enough to imagine that something might be done before very long, and even to suggest definite things.

These just cut to the quick for me; I’ve most definitely been both at various times. It is a piece of writing so beautifully cynical, witty and cutting, yet incredibly insightful and still resonant today (despite being first published in 1908). It’s hard to resist quoting more, be it regarding that bugbear that is Change:

The reports are referred by the Council to the Non-placets, and by the Non-placets to the wastepaper basket. This is called 'reforming the University from within.'

or methods by which people will hamstring an argument:

The third accepted means of obstruction is the Alternative Proposal. This is a form of Red Herring. As soon as three or more alternatives are in the field, there is pretty sure to be a majority against any one of them, and nothing will be done.

If you have ever attended, or risk attending, anything resembling A Meeting, read this. If you ever feel implored to try to Get Something Done, read this.

Bonus: It’s available online for free at http://www.cs.kent.ac.uk/people/staff/iau/cornford/cornford.html

Steven Levitt and Stephen Dubner - Freakonomics

An economist applying his subject in interesting ways. This book is basically about incentives, and the consequences, from the surprising to the counter-intuitive, that result.

It’s written to sell; the topics covered are most definitely picked for their “shock” value - see “Which is more dangerous: a gun or a swimming pool?” and “Do police actually lower crime rates?” - but that doesn’t detract from the quality. It’s a fabulous challenge to commonly accepted thoughts, beliefs and opinions, and a great encouragement and guide to exploratory and critical thinking.

Ben Goldacre - Bad Science

Far too many otherwise-sensible people I know are readers of the Daily Mail. For anyone reading this not of the English persuasion, the DM is a “newspaper” of what could perhaps be described as “questionable” writing. For a shockingly realistic insight, see the “Daily Mail-o-matic” headline generator or the “Kill or cure?” project, which to use its own words, will “help to make sense of the Daily Mail’s ongoing effort to classify every inanimate object into those that cause cancer and those that prevent it”. That last one in particular is why I like to give this book to people.

In summary, the general quality of science coverage by the media is awful. So much of the population is being regularly misled, through some combination of ignorance and malice. People are bad at noticing this, at asking the right questions, and at spotting the crucial omissions. This book covers all of these, highlighting common misconceptions and misrepresentations, and dispelling a whole slew of untruths, from myths to outright lies.

It is an excellent introduction to critical thinking, covering topics that are so very close to home for so many. For anyone who’s ever taken in or repeated any bit of scientific or medical “information” from the mainstream media, read this.

First play with Neo4j

There are many cool things that have emerged over the last few years that I’ve wanted to play with but have had no real reason to. One of these is Neo4j, a graph database and another tool to emerge from the “NoSQL movement”. Well, when I was younger, I spent a LOT of time playing on MUDs (indeed, it was this that gave me my first opportunity to write Real Code). For those who haven’t experienced them, a common MUD family (ROM) has a movement system consisting of Rooms: discrete locations, each with a number of Exits to other rooms. Hey look ma, a graph!

Getting started with Neo4j was beautifully straight-forward. Just a trip to http://neo4j.org/download/, downloading and extracting the zip (1.4.1 at time of writing), then simply:

./bin/neo4j start

and up it came. From the README.txt I was pleased to note that Neo4j comes with a web admin interface - I’m a sucker for a shiny GUI that gives me pretty stats and graphs, and this certainly satisfies:

[Screenshot: the Neo4j webadmin interface]

You can either embed Neo4j in your application or use its (very) RESTful API. If you're using Python, as I planned to, it makes little difference to the code you write; there is an official set of bindings for the embedded case (neo4j.py), and a GitHub-hosted "Neo4j Python REST Client" which, incredibly usefully, offers a near-identical API. I opted for the REST client as that's my most likely future use case.

Leaving aside the awful awful code I had to write to parse the area format, my initial object structure was fairly simple:

class Area:
    def __init__(self):
        self.rooms = []

class Room:
    def __init__(self, vnum):
        self.vnum = vnum
        self.name = '<>'
        self.exits = []

class Exit:
    def __init__(self, target_vnum):
        self.target_vnum = target_vnum

For those unfamiliar with ROM MUDs (and if you are, I highly suggest popping along to The Mud Connector and giving one a try), each room has an id called a “vnum”; everything else above should be self-explanatory. Rather than resolve an exit’s target_vnum into target_room while loading the area files, I opted to do this while importing into Neo4j.

Actually getting the above into Neo4j was beautifully simple. I tried to do so pseudo-“declaratively”, so that multiple runs of the importer would have no net effect on the database, and adding more features would just mean running it again, without any deletion step or duplication.

First step was to create an index to be able to look up Room nodes by vnum. Helpfully, the documentation states that “If an index is created that already exists, the existing index will not be replaced, and the existing index will be returned”, so I could just create() away without any checking. Basic index use is pretty simple:

index = gdb.nodes.indexes.create('room_vnum_index')
index['vnum']['%d' % room.vnum] = room_node
room_node == index.get('vnum', '%d' % room.vnum)[0]  # get() returns a list of matches
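(For completeness, since its creation isn’t shown above: gdb is the REST client’s connection object, created along these lines, with the default local URL:)

from neo4jrestclient.client import GraphDatabase

gdb = GraphDatabase('http://localhost:7474/db/data/')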

Advanced queries are possible via embedded Lucene, but I don’t currently have a use case for them.

With that done, it was just a simple two-pass procedure for each area: create nodes for each room (if they didn’t already exist) and then for each exit of room, create an ‘exit’ relationship from the node for that room to the node with a vnum of target_vnum. So first up, node creation:

for room in area.rooms:
    node = None
    nodes = index.get('vnum', '%d' % room.vnum)
    if nodes:
        print 'Found node for vnum %d' % room.vnum
        node = nodes[0]
    else:
        print 'Node for vnum %d does not exist - creating' % room.vnum
        node = gdb.node(vnum=room.vnum)
        index['vnum']['%d' % room.vnum] = node
    node['name'] = room.name

Once that’s done, time for the exits:

for room in area.rooms:
    node = index.get('vnum', '%d' % room.vnum)[0]
    # Declarative imports: clear any existing exits before re-creating them
    while node.relationships.outgoing():
        node.relationships.outgoing()[0].delete()
    for exit in room.exits:
        found = index.get('vnum', '%d' % exit.target_vnum)
        if found:
            target_node = found[0]
            rel = node.relationships.create('exit', target_node)

And with the above shockingly small amount of code, I have my rooms and exits in Neo4j as nodes and relationships. Time to start playing with Traversals…!

Lying DNS

$ ping git
PING git (81.200.64.50): 56 data bytes
$ host 81.200.64.50
50.64.200.81.in-addr.arpa domain name pointer advancedsearch.virginmedia.com.

I am so very grateful to my ISP for deliberately sending me incorrect data (I forgot to set up my DNS search domains on my new laptop), just so they can show a crappy web-search-assist page to people whose browsers don’t auto-Google things typed into the address bar that don’t “look like” addresses (thanks, Chrome).

So very very grateful.

UPDATE: It was pointed out to me that one can turn this off via a link in the top right of the webpage but, currently, following that link eventually leads to you to a Tomcat 5.5.27 404 page. Great.

(Yes I know I can / should just use alternate DNS servers / fix my search domain / use DNSSEC etc.; my point that my ISP should not issue malicious lies still stands)

Ubuntu Enterprise Cloud and ElasticFox / HybridFox

To anyone looking to use ElasticFox to manage their Ubuntu Enterprise Cloud instance: Don’t. Use HybridFox instead.

This is perhaps slightly unfair but, after following the Example Usage instructions from the Eucalyptus site for a good little while - and re-following, and tweaking, and checking - I had no joy.

I then googled a bit harder and found HybridFox, which worked first time.

Perhaps I could have gotten ElasticFox working. HybridFox meant I didn’t have to try.

Ubuntu Enterprise Cloud

My first encounter with Ubuntu Enterprise Cloud was while throwing together a quick dev server at work. I booted from an Ubuntu server ISO and saw the words “Enterprise” and “Cloud” together; I promptly dismissed it as some form of bloaty-buzzwordy-junk. It turns out the word “Ubuntu” trumps the “Enterprise Cloud” bit, and it’s quite awesome.

When I want to make someone’s eyes glaze over, I describe it as an Open Source Software Infrastructure-as-a-Service Solution, or OSSIaaSS. When I actually want to talk sensibly about it, I describe it as “Amazon Web Services for your own hardware”. At its core, it’s a software platform called Eucalyptus which gives you your own private IaaS setup. Crucially, it exposes this via the AWS API, so there’s a wealth of tools out there, and it makes later migration to real AWS easier.

(I should clarify; I’m doing AWS a huge disservice here; really “all” I’m talking about is EC2 and S3, but they’re my favourite and most-used subsystems, and I’d guess they’re the two things people think of first when someone mentions AWS)

Generally it seemed remarkably easy to set up. I did manage to make a few silly mistakes along the way that, if I’m honest, took an embarrassing amount of time to identify, despite being relatively obvious:

Don't forget to add a Node Controller

Node controllers run your VM instances, so you’ll want at least one. Obvious, right? Well, I managed to forget (multitasking at the time) and spent a little too long wondering why I couldn’t start any VMs.

$ sudo euca_conf --list-nodes
registered nodes:
   10.250.59.29  llama   i-2B980609  i-38940671  i-3B250746  i-3D2B07A3  i-44F408EA  i-4A7B08DD  i-555C09E0
   10.250.59.30  llama
$ euca-describe-availability-zones verbose
AVAILABILITYZONE	llama	10.250.59.211
AVAILABILITYZONE	|- vm types	free / max   cpu   ram  disk
AVAILABILITYZONE	|- m1.small	0025 / 0032   1    192     2
AVAILABILITYZONE	|- c1.medium	0025 / 0032   1    256     5
AVAILABILITYZONE	|- m1.large	0012 / 0016   2    512    10
AVAILABILITYZONE	|- m1.xlarge	0012 / 0016   2   1024    20
AVAILABILITYZONE	|- c1.xlarge	0006 / 0008   4   2048    20

If you see nothing under --list-nodes and all the "max" numbers for your availability zones are 0, you've probably failed to add a node controller (yes, this was slightly embarrassing).

Don't forget to enable virtualisation in the BIOS

Once I’d added my Node Controllers, I went to start some instances, only to watch them sit in “pending” mode before terminating. Once I found my way to the Node Controller logs, I was presented with:

libvirt: internal error no supported architecture for os type 'hvm' (code=1)

At this point, kvm-ok (for Ubuntu Lucid, in the qemu-kvm package) is your friend. The machines I was using as Node Controllers (Dell R710s) all have a “Virtualization Technology” setting in the BIOS (under “Processor Settings”). On all of our machines (and I gather this is standard) this was set to Disabled. Rebooting and editing the BIOS to enable it was all that was needed:

$ kvm-ok
INFO: Your CPU supports KVM extensions
INFO: /dev/kvm exists
KVM acceleration can be used

As an aside, I fully support this “bug” entry about disassociating kvm-ok from kvm and putting a “You have virtualisation support but it is disabled” pseudo-warning into the motd - bring on the next Ubuntu LTS release!

Potentially important "footnote"

One bit of weirdness I did find immediately after installing (once I’d remembered to create a Node Controller) was that, despite the Node Controller existing and claiming to have detected the Cluster Controller at install time, the Cluster Controller couldn’t find it. Prodding euca_conf gave nothing in --list-nodes, and --discover-nodes found nothing. However, I was sort of able to cajole things manually with --register-nodes; at least, keys were copied to the Node Controller, but there was not a lot of success beyond that.

I then discovered this thread on the Eucalyptus forum of a user having an essentially identical issue - no NC discovery, manual NC registration appearing to work but not actually doing so, et cetera - with a follow-up post saying that a solution reported in another (slightly longer-winded) thread had fixed things.

To repeat for posterity and archiving / Google reasons, the solution, with my own notes, was:

  • Deregister all Node Controllers - euca_conf --deregister-nodes
  • Deregister the Cluster - I used the WebUI for this; I don't believe that using the CLI is necessary
  • Restart the Cluster Controller - I'm afraid I forget whether or not I did this
  • Register the Cluster again - at this point, using the CLI is important
  • Discover the Node Controllers - euca_conf --discover-nodes

For whatever reason, the Cluster created at install time, and subsequent ones via the WebUI / GUI, had some sort of issue. I haven’t yet been able to diagnose much, nor find a canonical bug report, but this seems a potentially rather significant issue that may hamper people!

The above issue aside, my current experiences with it have been great. Now to get boto talking to it!
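(As a sketch of that next step, here’s the standard boto-to-Eucalyptus recipe - the endpoint and credentials are placeholders, and the port and path are the conventional Eucalyptus defaults:)

from boto.ec2.connection import EC2Connection
from boto.ec2.regioninfo import RegionInfo

# Eucalyptus conventionally exposes its EC2-compatible API on port 8773
# under /services/Eucalyptus
region = RegionInfo(name='eucalyptus', endpoint='cluster-controller.example.com')
conn = EC2Connection(
    aws_access_key_id='YOUR_EC2_ACCESS_KEY',
    aws_secret_access_key='YOUR_EC2_SECRET_KEY',
    is_secure=False,
    region=region,
    port=8773,
    path='/services/Eucalyptus',
)

print conn.get_all_images()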