There are many cool things that have emerged over the last few years that I’ve wanted to play with but have not had a real reason to. One of these is Neo4j, a graph database and another tool to emerge from the “NoSQL movement”. Well, when I was younger, I spent a LOT of time playing on MUDs (indeed, it was this that provided me with my first opportunity to write Real Code). For those who haven’t experienced them before, a common MUD family (ROM) has a movement system that consists of Rooms: discrete locations with a number of Exits to other rooms. Hey look ma, a graph!
Getting started with Neo4j was beautifully straightforward. Just a trip to http://neo4j.org/download/, downloading and extracting the zip (1.4.1 at time of writing), then simply:
./bin/neo4j start
and up it came. From the README.txt I was pleased to note that Neo4j comes with a web admin interface - I’m a sucker for a shiny GUI that gives me pretty stats and graphs, and this certainly satisfies.
You can either embed Neo4j in your application or use its (very) RESTful API. If you're using Python, as I planned to, then it makes little difference to the code you write; there is an official set of bindings for the embedded case (neo4j.py), and a GitHub-hosted "Neo4j Python REST Client" which, incredibly usefully, offers a near-identical API. I opted for the REST client as that's my most likely future use case.
Leaving aside the awful awful code I had to write to parse the area format, my initial object structure was fairly simple:
class Area:
    def __init__(self):
        self.rooms = []

class Room:
    def __init__(self, vnum):
        self.vnum = vnum
        self.name = '<>'
        self.exits = []

class Exit:
    def __init__(self, target_vnum):
        self.target_vnum = target_vnum
For those unfamiliar with ROM MUDs (and if you are, I highly suggest popping along to The Mud Connector and giving one a try), each room has an id called a “vnum”; everything else above should be self-explanatory. Rather than resolve an exit’s target_vnum into target_room while loading the area files, I opted to do this while importing into Neo4j.
Actually getting the above into Neo4j was beautifully simple. I tried to do so pseudo-“declaratively”, so that running the importer several times would leave the database in the same state as running it once, and adding more features just meant running it again without any deletion step or duplication.
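To make that deferred lookup concrete, here is a minimal sketch of what resolving target_vnum involves, in plain Python with no Neo4j involved. The resolve_exits helper, the room names and the vnums are all made up for illustration; Room and Exit mirror the classes above.

```python
class Room:
    def __init__(self, vnum):
        self.vnum = vnum
        self.name = '<>'
        self.exits = []


class Exit:
    def __init__(self, target_vnum):
        self.target_vnum = target_vnum


def resolve_exits(rooms):
    """Hypothetical helper: map each room's vnum to the rooms its exits reach.

    Exits whose target_vnum is not among `rooms` (say, a link into an area
    that has not been loaded) are returned separately as (from, to) pairs.
    """
    by_vnum = {room.vnum: room for room in rooms}
    resolved = {}
    dangling = []
    for room in rooms:
        resolved[room.vnum] = []
        for ex in room.exits:
            target = by_vnum.get(ex.target_vnum)
            if target is None:
                dangling.append((room.vnum, ex.target_vnum))
            else:
                resolved[room.vnum].append(target)
    return resolved, dangling


# A made-up two-room area.
temple = Room(3001)
temple.name = 'The Temple'
square = Room(3005)
square.name = 'Temple Square'
temple.exits.append(Exit(3005))
square.exits.append(Exit(3001))
square.exits.append(Exit(9999))  # points at a room that was never loaded

resolved, dangling = resolve_exits([temple, square])
print([r.name for r in resolved[3001]])
print(dangling)
```

The dangling list is the reason deferring is convenient: an exit can point at a room that simply isn't there yet, and the lookup has to cope with that wherever it happens.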
First step was to create an index to be able to look up Room nodes by vnum. Helpfully, the documentation states that “If an index is created that already exists, the existing index will not be replaced, and the existing index will be returned”, so I could just create() away without any checking. Basic index use is pretty simple:
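# gdb is the graph database client object
index = gdb.nodes.indexes.create('room_vnum_index')
# store a node under key 'vnum' and the room's vnum (as a string)
index['vnum']['%d' % room.vnum] = room_node
# lookups return a list of matching nodes, so take the first
room_node == index.get('vnum', '%d' % room.vnum)[0]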
Advanced queries are possible via embedded Lucene, but I don’t currently have a use case for them.
With that done, it was just a simple two-pass procedure for each area: create nodes for each room (if they didn’t already exist) and then, for each exit of a room, create an ‘exit’ relationship from the node for that room to the node with a vnum of target_vnum. So first up, node creation:
for room in area.rooms:
    node = None
    nodes = index.get('vnum', '%d' % room.vnum)
    if nodes:
        print('Found node for vnum %d' % room.vnum)
        node = nodes[0]
    else:
        print('Node for vnum %d does not exist - creating' % room.vnum)
        node = gdb.node(vnum=room.vnum)
        index['vnum']['%d' % room.vnum] = node
    # set the name on every run so that renamed rooms are picked up
    node['name'] = room.name
Once that’s done, time for the exits:
for room in area.rooms:
    node = index.get('vnum', '%d' % room.vnum)[0]
    # remove existing outgoing relationships so re-runs don't duplicate exits
    while node.relationships.outgoing():
        node.relationships.outgoing()[0].delete()
    for exit in room.exits:
        found = index.get('vnum', '%d' % exit.target_vnum)
        # skip exits whose target room has no node (yet)
        if found:
            target_node = found[0]
            rel = node.relationships.create('exit', target_node)
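If you want to convince yourself that the two passes really are safe to re-run, the same logic can be sketched against a plain dict instead of a live server. This is only an illustration: FakeNode, import_rooms and snapshot are hypothetical stand-ins, not part of the Neo4j Python REST Client, and the dict plays the role of the vnum index.

```python
class Room:
    def __init__(self, vnum):
        self.vnum = vnum
        self.name = '<>'
        self.exits = []


class Exit:
    def __init__(self, target_vnum):
        self.target_vnum = target_vnum


class FakeNode:
    """Hypothetical in-memory stand-in for a graph node (not a Neo4j class)."""
    def __init__(self, vnum):
        self.properties = {'vnum': vnum}
        self.exits = []  # outgoing 'exit' relationships, stored as target nodes


def import_rooms(rooms, index):
    """Two-pass import into `index`, a dict of vnum -> FakeNode."""
    # Pass 1: get-or-create a node per room, then refresh its name.
    for room in rooms:
        node = index.get(room.vnum)
        if node is None:
            node = FakeNode(room.vnum)
            index[room.vnum] = node
        node.properties['name'] = room.name

    # Pass 2: drop each room's old exits, then recreate them from the area data.
    for room in rooms:
        node = index[room.vnum]
        del node.exits[:]
        for ex in room.exits:
            target = index.get(ex.target_vnum)
            if target is not None:  # skip exits into rooms that were never loaded
                node.exits.append(target)


def snapshot(index):
    """Return a comparable summary: (vnum, name, sorted exit target vnums)."""
    return sorted(
        (vnum, node.properties['name'],
         sorted(t.properties['vnum'] for t in node.exits))
        for vnum, node in index.items()
    )


# Made-up rooms: the cellar has one exit to a vnum that is never loaded.
hall = Room(1)
hall.name = 'Hall'
hall.exits.append(Exit(2))
cellar = Room(2)
cellar.name = 'Cellar'
cellar.exits.extend([Exit(1), Exit(3)])

index = {}
import_rooms([hall, cellar], index)
first = snapshot(index)
import_rooms([hall, cellar], index)   # second run over the same data
second = snapshot(index)
print(first == second)

hall.name = 'Great Hall'              # change the source data and run again
import_rooms([hall, cellar], index)
print(index[1].properties['name'], len(index))
```

Clearing and recreating the exits on every run is the blunt option, but it means a removed or retargeted exit in the area file is reflected in the graph without any extra bookkeeping.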
And with the above shockingly small amount of code, I have my rooms and exits in Neo4j as nodes and relationships. Time to start playing with Traversals…!