Monthly Archives: December 2013

Trying etcd on Android, Mac, and Raspberry Pi

phone, macbook, and raspberry pi

When you have a huge assortment of machines working together, you want them to be working independently as much as possible for the sake of performance, but you still have to coordinate them.  The smartest folks who have thought long and hard about how to do that wind up using a distributed state machine approach, so that there’s effectively a single virtual machine running on top of the whole cluster.  It would be easy if it weren’t for handling failures.  So you see Google and others using Paxos to implement the state machine.

I spent a couple of years learning all about Paxos in the hopes that I’d figure out how to explain it to people easily.  Along the way, I did improve my own understanding and even my ability to talk about Paxos, but I also noticed that several attempts by others to make Paxos accessible had fallen short.

It turns out that Ongaro and Ousterhout of Stanford University have created a more understandable algorithm for building a highly available distributed state machine.  It’s called “Raft”, and the paper is a good read.  Moreover, there’s a Go-language implementation of Raft.  Fun!
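
To make the “highly available” part concrete: Raft commits an entry once a majority of the servers have it, so the cluster keeps working as long as a majority can still talk to each other. Here is a small shell sketch of that arithmetic. The function names are mine, not anything from the paper or from a Raft library.

```shell
# Majority needed before a cluster of N servers can commit an entry.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority is still reachable.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

# Print the numbers for cluster sizes 1 through 5.
for n in 1 2 3 4 5; do
    echo "servers=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With three machines the quorum is 2, so any one of them can drop out. Going from three machines to four raises the quorum to 3 without tolerating any more failures, which is why odd cluster sizes are the usual choice.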

The distributed key-value store, etcd, is also written in Go and uses Raft.  I had been wanting to run Go software on my Android phone, and etcd seemed like an interesting way to try that out.

The first order of business was learning how to best get the etcd binary for arm onto the cell phone.  The options were:

  • Build it on my Raspberry Pi arm host running Fedora,
  • Build it on the phone (too awkward), or
  • Build it on my x86_64 Mac using the excellent cross-compilation stuff in Go.

I chose the last route and used Dave Cheney’s helper software to make it easier.  After building the arm tools in Go 1.2 on my Mac, I could do,

GOOS=linux GOARCH=arm ./build

… inside my git clone of etcd to get a binary etcd program ready to push to my cell phone with adb.

I had to use root and put it in /data/local to be able to execute it.
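
Pulled together, the build-and-install steps look roughly like the sketch below. The build command and the /data/local destination are the ones described above. The /sdcard staging path, the use of `su -c`, and the function names are my assumptions for illustration, and the details will depend on how the phone is rooted. It also assumes ./build leaves the binary at the top of the clone.

```shell
# Sketch only: cross-compile etcd for ARM Linux and install it on a rooted phone.
ETCD_SRC=~/git/etcd            # git clone of etcd
STAGING=/sdcard/etcd           # assumed spot that adb can write to without root
DEST=/data/local/etcd          # location the binary can be executed from

build_arm() {
    # The target OS and architecture are picked up from the environment.
    (cd "$ETCD_SRC" && GOOS=linux GOARCH=arm ./build)
}

# Print the on-phone commands so they can be inspected before running them.
install_cmd() {
    echo "cat $STAGING > $DEST && chmod 755 $DEST"
}

push_to_phone() {
    adb push "$ETCD_SRC/etcd" "$STAGING" &&
    adb shell "su -c '$(install_cmd)'"   # assumes the phone's su accepts -c
}

# Usage (not run automatically):
#   build_arm && push_to_phone
```

Copying with `cat` and a redirect instead of `cp` is deliberate, since I would not count on every phone’s shell having `cp`.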

After that, I could run etcd on my Mac (atala), Raspberry Pi (raspi), and my phone (doomslayer), like this:

ecashin@Ed-Cashins-MacBook-Pro ~$ rm -rf ~/opt/var/{etcd,atala} && ~/git/etcd/etcd -peer-addr 10.0.1.7:7001 -addr 10.0.1.7:4001 -data-dir ~/opt/var/etcd -name atala
[etcd] Dec 28 22:55:26.608 INFO      | etcd server [name atala, listen on :4001, advertised url http://10.0.1.7:4001]
[etcd] Dec 28 22:55:26.608 INFO      | raft server [name atala, listen on :7001, advertised url http://10.0.1.7:7001]
[etcd] Dec 28 22:56:16.540 INFO      | URLs: atala / atala (http://10.0.1.7:4001,http://10.0.1.80:4001,http://10.0.1.2:4001)

Note that no peers are specified and that I’m deleting the data directory to start the cluster from scratch. This is only safe because I know there’s no state to preserve from a previous instance of the cluster. Before starting etcd on the Raspberry Pi, I had to flush the firewall rules with iptables -F.

[ecashin@raspi ~]$ rm -rf ~/opt/var/etcd && ~/git/etcd/etcd -peer-addr 10.0.1.80:7001 -addr 10.0.1.80:4001 -data-dir ~/opt/var/etcd -name raspi -peers 10.0.1.7:7001
[etcd] Jun 23 04:25:27.115 INFO      | etcd server [name raspi, listen on :4001, advertised url http://10.0.1.80:4001]
[etcd] Jun 23 04:25:27.143 INFO      | raft server [name raspi, listen on :7001, advertised url http://10.0.1.80:7001]

I started raspi up specifying atala as its only peer.

localhost local # ./etcd-start.sh
+ d=/data/local
+ rm -rf /data/local/etcd.d
+ bind= -peer-bind-addr 10.0.1.2:7001 -bind-addr 10.0.1.2:4001
+ bind=
+ /data/local/etcd -peer-addr 10.0.1.2:7001 -addr 10.0.1.2:4001 -data-dir /data/local/etcd.d -name doomslayer -peers 10.0.1.7:7001,10.0.1.80:7001
[etcd] Dec 29 03:56:00.707 INFO      | etcd server [name doomslayer, listen on :4001, advertised url http://10.0.1.2:4001]
[etcd] Dec 29 03:56:00.757 INFO      | raft server [name doomslayer, listen on :7001, advertised url http://10.0.1.2:7001]

On doomslayer, the Samsung Galaxy S 3, I specified both other hosts as peers. I used a script for convenience.
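
All three invocations follow the same pattern: each host advertises its own client address on port 4001 and its peer address on port 7001, and every host after the first lists the peers that are already running. Here is a sketch of that pattern as a shell function. `etcd_args` is a hypothetical helper name of mine, and the flags are only the ones that appear in the transcripts above.

```shell
# etcd_args NAME IP DATA_DIR [PEER_IP...]
# Prints the flags used in this post for one cluster member.
etcd_args() {
    name=$1
    ip=$2
    data_dir=$3
    shift 3
    args="-peer-addr $ip:7001 -addr $ip:4001 -data-dir $data_dir -name $name"
    peers=""
    for peer_ip in "$@"; do
        # Join the peer addresses with commas.
        peers="${peers:+$peers,}$peer_ip:7001"
    done
    # The first host has no peers, so -peers is left off entirely.
    if [ -n "$peers" ]; then
        args="$args -peers $peers"
    fi
    echo "$args"
}

# The three members of this cluster, in the order they were started.
etcd_args atala 10.0.1.7 "$HOME/opt/var/etcd"
etcd_args raspi 10.0.1.80 "$HOME/opt/var/etcd" 10.0.1.7
etcd_args doomslayer 10.0.1.2 /data/local/etcd.d 10.0.1.7 10.0.1.80
```

The output can be handed straight to the binary, for example `/data/local/etcd $(etcd_args doomslayer 10.0.1.2 /data/local/etcd.d 10.0.1.7 10.0.1.80)`.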

Then I verified that the hosts were in the cluster. Next I set and retrieved a key-value pair using different hosts as my server.

 ecashin@Ed-Cashins-MacBook-Pro ~$ echo `curl -L http://10.0.1.7:4001/v2/machines`
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    65  100    65    0     0  30922      0 --:--:-- --:--:-- --:--:-- 65000
http://10.0.1.7:4001, http://10.0.1.80:4001, http://10.0.1.2:4001
ecashin@Ed-Cashins-MacBook-Pro ~$ 
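
The progress meter in that output shows up because curl’s output is being captured rather than written to the terminal; `-s` silences it. Here is a small sketch of checking the cluster size from the same endpoint. `count_machines` and `cluster_size` are names I made up, and the URL is the one used above.

```shell
# Count the entries in a comma-separated machine list read from stdin.
count_machines() {
    tr ',' '\n' | grep -c 'http'
}

# Ask one member how many machines it knows about (needs a running cluster).
cluster_size() {
    curl -sL "http://$1:4001/v2/machines" | count_machines
}

# Usage (not run automatically):
#   cluster_size 10.0.1.7
```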

ecashin@Ed-Cashins-MacBook-Pro ~$ curl -L http://10.0.1.2:4001/v2/keys/message -X PUT -d value="Hello world"
{"action":"set","node":{"key":"/message","value":"Hello world","modifiedIndex":4,"createdIndex":4}}ecashin@Ed-Cashins-MacBook-Pro ~$

ecashin@Ed-Cashins-MacBook-Pro ~$ curl -L http://10.0.1.80:4001/v2/keys/message
{"action":"get","node":{"key":"/message","value":"Hello world","modifiedIndex":4,"createdIndex":4}}ecashin@Ed-Cashins-MacBook-Pro ~$
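
The point of the exercise is that a write made through one member can be read back through any other. Here is a sketch of checking that from the shell. `etcd_value` and `read_everywhere` are hypothetical helper names, and the sed expression is a quick hack rather than a real JSON parser; a tool like jq would be more robust if it’s installed.

```shell
# Pull the "value" field out of an etcd v2 response on stdin.
# Brittle by design: it does not handle escaped quotes inside the value.
etcd_value() {
    sed -n 's/.*"value":"\([^"]*\)".*/\1/p'
}

# Read KEY from each host in turn and print what each one returns.
read_everywhere() {
    key=$1
    shift
    for host in "$@"; do
        value=$(curl -sL "http://$host:4001/v2/keys/$key" | etcd_value)
        echo "$host: $value"
    done
}

# Usage (not run automatically):
#   read_everywhere message 10.0.1.7 10.0.1.80 10.0.1.2
```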

So now, instead of trying to explain Paxos, I think I’ll just tell folks about it and show them etcd!
