Building a Development Environment with Docker

TL;DR

I've written a cheat sheet for Docker, and I have a GitHub project for it. Here's the thinking behind why I chose Docker, and how best to use it.

The problem

You want to build your own development environment from scratch, and you want it to be as close to a production environment as possible.

Solutions

Development environments usually just… evolve. There are repeated attempts at producing a consistent development environment, even between developers. Eventually, through trial and error, a common set of configuration files and install instructions turns into something that resembles a scaled-down, testable version of the production environment, managed through version control and a set of bash scripts.

But even at that point it's not over, because modern environments can involve dozens of different components, each with its own configuration, often communicating with each other over TCP/IP or, even worse, talking to a third-party API like S3. To replicate the production environment, these lines of communication must be drawn, but the components can't all be squashed onto one single machine. Something has to give.

Solution #1: Shared dev environment

The first solution is to set up an environment with exactly the same machines, configured the same way as production, only scaled down for development. Then, everyone uses it.

This works only if there is no conflict between developers, resource use and contention are not a problem, and no team needs to swap out one of the components for its own work.

If you need to access the environment from outside the office, you'll need a VPN. And if you're on a flaky network or on a plane, you're out of luck.

Solution #2: Virtual Machines

The second solution is to put as much of the environment as possible onto the developer's laptop.

Virtualization tools such as VirtualBox will allow you to create an isolated dev environment. You can package VMs into boxes with Vagrant, and create fresh VMs from a template as needed. They each have their own IP address, and you can get them to share filesystems.

However, VMs are not small. You can chew up gigabytes very easily providing the OS and packages for each VM, and those VMs do not share CPU or memory when running together. If you have a complex environment, you will run into a point where you either run out of disk space or memory, or you break down and start packaging multiple components inside a single VM, producing an environment which may not reflect production and is far more fragile and prone to complexities.

Solution #3: Docker

Docker solves the isolation problem. Docker provides (consistent, reproducible, disposable) containers that make components appear to be running on different machines, while sharing CPU and memory underneath, and provides TCP/IP forwarding and filesystems that can be shared between containers.

So, here's how you build a development environment in Docker.

Docker Best Practices

Build from Dockerfile

The only sane way to put together a dev environment in Docker is to use raw Dockerfiles and a private registry. Pull from the central Docker registry only if you must, and keep everything local.

Chef recipes are slow

You might think to yourself, "self, I don't feel like reinventing the wheel. Let's just use chef recipes for everything."

The problem is that creating new containers is something that you'll do a lot. Every time you create a container, seconds will count, and minutes will be totally unacceptable. It turns out that calling apt-get update is a great way to watch nothing happen for a while.

Use raw Dockerfile

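Docker builds an image as a stack of file system layers on a union file system called AUFS, and caches the layer produced by each Dockerfile instruction. When a build reaches an instruction it has already run on top of the same parent layer, it reuses the cached layer instead of running the instruction again. You want to keep the cache happy: put all the mutable stuff at the very end of the Dockerfile, so you can leverage the cache as much as possible. Chef recipes are a black box to Docker, so it cannot cache their individual steps.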

The way this breaks down is:

Cache wins.
Chef, Ansible, etc. do not use the cache.
Raw Dockerfile uses cache.
Raw Dockerfile wins.
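
As a minimal sketch of that ordering, the snippet below writes a Dockerfile with the stable steps first and the mutable step last. The package names, the app directory, and the /srv/app path are placeholders for illustration, and COPY assumes a Docker version that supports that instruction. Nothing is built; the file is only written to ./cache-demo.

```shell
#!/bin/sh
# Sketch: generate a Dockerfile ordered from least to most frequently changing.
# Package names and paths are placeholders, not part of the original setup.
set -eu

mkdir -p ./cache-demo

cat > ./cache-demo/Dockerfile <<'EOF'
# Stable: the starting image rarely changes, so this layer stays cached.
FROM ubuntu

# Stable: a single RUN line gives one cached layer for all package installs.
RUN apt-get update && apt-get install -y curl git

# Mutable: application code changes often, so it goes last.
# Only this instruction and the ones after it are rebuilt when the code changes.
COPY app /srv/app
CMD ["/srv/app/run.sh"]
EOF

echo "wrote ./cache-demo/Dockerfile"
```

If the COPY line sat above the apt-get line, every code change would invalidate the package layer and the slow install would run again on each build.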

There's another way to leverage Docker, and that's to use an image that doesn't start off from ubuntu or basebox. You can use your own base image.

The Basics

Install an internal Docker registry

Install an internal registry (the fast way) and run it as a daemon:

docker run --name internal_registry -d -p 5000:5000 samalba/docker-registry

Alias server to localhost:

echo "127.0.0.1      internal_registry" >> /etc/hosts

Check internal_registry exists and is running on port 5000:

apt-get install -y curl
curl --get --verbose http://internal_registry:5000/v1/_ping
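
The registry can take a moment to come up after docker run returns, so a single ping may fail even when everything is fine. One possible approach is a small retry helper. The retry function below is a hypothetical helper written for this sketch, not part of Docker or curl, and the demonstration uses a harmless command so it can run anywhere.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or the attempts run out.
set -eu

# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Harmless demonstration: `true` succeeds on the first attempt.
retry 3 0 true && echo "ready"
```

Against the registry above, the call might look like retry 10 1 curl --silent --fail http://internal_registry:5000/v1/_ping, which pings once a second for up to ten attempts.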

Install Shipyard

Shipyard is a web application that provides an easy-to-use interface for seeing what Docker is doing.

Open up a port in your Vagrantfile:

config.vm.network :forwarded_port, :host => 8005, :guest => 8005

Install shipyard from the central index:

SHIPYARD=$(docker run \
    --name shipyard \
    -p 8005:8000 \
    -d \
    shipyard/shipyard)

You will also need to replace /etc/init/docker.conf with the following:

description "Docker daemon"

start on filesystem and started lxc-net
stop on runlevel [!2345]

respawn

script
        /usr/bin/docker -d -H tcp://0.0.0.0:4243 -H unix:///var/run/docker.sock
end script

Then reboot the VM.

Once the server has rebooted and you've waited for a bit, you should have shipyard up. The credentials are "shipyard/admin".

Go to http://localhost:8005/hosts/ to see Shipyard's hosts.
In the Vagrant VM, run ifconfig eth0 and look for "inet addr:10.0.2.15", then enter that IP address on the hosts page.

Create base image

Create a Dockerfile with initialization code such as apt-get update, apt-get install, etc. This is your base.
Build your base image, then push it to the internal registry:

docker build -t internal_registry:5000/base .
docker push internal_registry:5000/base

Build from your base image

Have all of your other Dockerfiles pull from "base" instead of ubuntu.
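
A rough sketch of the base-then-derive layout follows. The redis image, its package, and the directory names are made up for illustration; only internal_registry:5000/base comes from the steps above. The docker commands are printed rather than executed, so the snippet is safe to run without a daemon.

```shell
#!/bin/sh
# Sketch: one shared base Dockerfile and one derived Dockerfile.
# The "redis" image and its package are illustrative assumptions.
set -eu

mkdir -p ./images-demo/base ./images-demo/redis

# Base image: the slow, shared setup lives here and is built once.
cat > ./images-demo/base/Dockerfile <<'EOF'
FROM ubuntu
RUN apt-get update && apt-get install -y curl
EOF

# Derived image: starts from the internal base instead of ubuntu.
cat > ./images-demo/redis/Dockerfile <<'EOF'
FROM internal_registry:5000/base
RUN apt-get install -y redis-server
EXPOSE 6379
CMD ["redis-server"]
EOF

# Printed, not run: the base must be built and pushed before the child is built.
echo "docker build -t internal_registry:5000/base ./images-demo/base"
echo "docker push internal_registry:5000/base"
echo "docker build -t internal_registry:5000/redis ./images-demo/redis"
```

Because every derived image shares the base layers, the apt-get update cost is paid once in the base rather than once per image.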

Keep playing around until you have your images working.

Push your images

Push all of your images into the internal registry.
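
With more than a couple of images, a loop saves typing. The sketch below only prints the tag and push commands so they can be reviewed first. The image names and the push_commands helper are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Sketch: print the tag and push commands for a list of local images.
set -eu

registry=internal_registry:5000

# push_commands IMAGE: print the two commands that publish one image.
push_commands() {
    image=$1
    echo "docker tag $image $registry/$image"
    echo "docker push $registry/$image"
}

# Hypothetical image names; replace them with your own.
for image in redis web worker; do
    push_commands "$image"
done
```

Once the printed commands look right, piping the output of the script into sh would run them.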

Save off your registry

If you need to blow away your Vagrant VM or set someone else up, it's much faster to do it with all the images still intact:

docker export internal_registry > internal_registry.tar
gzip internal_registry.tar
mv internal_registry.tar.gz /vagrant

Tips

The ADD instruction blows away the cache, so avoid it (this was a bug, possibly fixed by now).
There's a limit to the number of layers you can have, so pack your apt-get installs onto a single line.
Keep common instructions at the top of the Dockerfile to leverage the cache as long as possible.
Use tags when building (always pass the -t option to docker build).
Never map the public port in a Dockerfile.

Exposing Services

If you are running a bunch of services in Docker and want to expose them through VirtualBox to the host OS, you need to do something like this in your Vagrantfile:

(49000..49900).each do |port|
  config.vm.network :forwarded_port, :host => port, :guest => port
end

Let's start up Redis:

docker pull johncosta/redis
docker run -p 6379 -d johncosta/redis

Then find the port:

docker ps
docker port <redis_container_id> 6379

Then connect to the 49xxx port that VirtualBox exposes.
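
docker port prints the mapping as an address and port joined by a colon, so a script needs to pull out the port part. A minimal sketch, assuming that output shape: host_port is a hypothetical helper, and the sample value is made up.

```shell
#!/bin/sh
# Sketch: extract the host port from an "address:port" mapping.
set -eu

# host_port MAPPING: print everything after the last colon.
host_port() {
    echo "${1##*:}"
}

# Sample value in the shape `docker port` prints; the number is made up.
host_port "0.0.0.0:49153"
```

In practice the argument would come from the first line of docker port <redis_container_id> 6379, and the result could be passed to a client such as redis-cli with its -p option.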

Cleanup

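Remove containers that were created weeks ago:

docker ps -a | grep 'weeks ago' | awk '{print $1}' | xargs docker rm

Or eliminate every container (running ones must be stopped first):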

docker rm `docker ps -a -q`
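
Before deleting anything, it can help to see which IDs the 'weeks ago' filter would select. The sketch below runs the same grep and awk stages over a made-up listing shaped like docker ps -a output and leaves off the final xargs docker rm. The container IDs, images, and the two helper functions are invented for the example.

```shell
#!/bin/sh
# Sketch: preview which container IDs the 'weeks ago' filter picks.
set -eu

# Made-up listing in the shape of `docker ps -a` output.
sample_ps() {
    cat <<'EOF'
CONTAINER ID   IMAGE               COMMAND          CREATED        STATUS
1a2b3c4d5e6f   johncosta/redis     "redis-server"   3 weeks ago    Exited (0) 3 weeks ago
0b1c2d3e4f5a   shipyard/shipyard   "/app/run"       2 days ago     Up 2 days
9f8e7d6c5b4a   ubuntu              "/bin/bash"      2 weeks ago    Exited (0) 2 weeks ago
EOF
}

# Same filter as the cleanup pipeline, minus the final `xargs docker rm`.
old_containers() {
    grep 'weeks ago' | awk '{print $1}'
}

sample_ps | old_containers
```

Note that the filter matches on the CREATED column, so a container that was created weeks ago but is still running would be selected too.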

Running from an existing volume

docker run -i -t --volumes-from 5ad9f1d9d6dc mytag /bin/bash

Sources

Phusion Base Image
Dockerfile Best Practices
How Mailgun uses Docker (video is more complete than the blog post).
Docker Dev Environment in 24 Hours
