Dev Environments with Vagrant

If you work with a number of clients, one issue pops up over and over: setting up a new machine. Sometimes, you're lucky and a client will let you use your own machine. More often than not, though, you're forced to use their hardware. This usually involves reading a bunch of out-of-date wiki documents, asking people around you, and maybe contributing back to the wiki for the next person. If you're lucky, you'll get this done in a day or two. More typically, it can take a week or so.

If you're a manager, this should also worry you. You're making these developers, whom you likely spent a good amount of money recruiting and compensating, spend a week or so of downtime just setting up their computers. Even taking a conservative estimate of $65/hr, a 40-hour week means you're spending $2600 for somebody to get up and running. Now imagine you're paying the prevailing market rate for consultants, and that figure rises dramatically.
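For anyone who wants to plug in their own numbers, the arithmetic can be sketched in a few lines of Ruby. The $65/hr rate and the one-week duration come from the text; the 8-hour day, the setup_cost helper, and the $150/hr consultant rate are assumptions made purely for illustration.

```ruby
# Figures from the text: $65/hr and roughly one week of setup.
# Assumption (not stated in the text): 8-hour days, 5-day weeks.
HOURLY_RATE   = 65
HOURS_PER_DAY = 8

# Cost in dollars of a developer spending `days` on machine setup.
def setup_cost(days, rate: HOURLY_RATE, hours_per_day: HOURS_PER_DAY)
  days * hours_per_day * rate
end

puts setup_cost(5)             # prints 2600, the figure quoted above
puts setup_cost(1)             # prints 520, the cost of a single day
puts setup_cost(5, rate: 150)  # prints 6000 at a hypothetical $150/hr
```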

At Dev9, we like to automate. Typical payback times for automation projects may be in the months or even years, but imagine you could shave 2-3 days off of new machine setup time for each developer you onboard. This kind of tool could pay for itself with your first new developer, with better returns for each additional developer. So, what do we do?

Code

This article is going to involve some code. If you want to play along at home, you can view our repo at https://github.com/dev9com/vagrant-dev-env.

Enter Vagrant

Vagrant is a tool that fits our use case well. It builds and manages virtual machines through a provider (I use Oracle VirtualBox). VMs used to be clunky and kind of slow. But laptops now come with 16GB of RAM, 500+GB SSDs, and 8-core processors. We are living in an age of abundance, and it would be a shame to let it go to waste :).

The Build

What we are going to build is a development machine image. While companies can benefit from creating this and handing it to new hires, it's just as valuable if you have multiple clients. I can transition between provided hardware with ease, because I'm just using them all as a host for my VM. In addition, I can make a change to the provisioning of one VM, and propagate it quickly to the others.

This VM is going to be a headless VM. That means there is no UI. We will interact with it over SSH. This helps keep it fast and portable. I have no problem using IntelliJ IDEA on Windows or Mac or Linux, but what I always want is my terminal and build tools. So, that's the machine we're going to build.

Initial Setup

First, get Vagrant and VirtualBox installed. Maybe clone our git repo if you want to follow along. That should be all for now!

This is something that only comes with research, but our base image is going to be phusion/ubuntu-14.04-amd64. This is the foundation of all of our images. This one was chosen because it plays really nicely with Docker. Full disclosure, we are Docker's PNW partner, so this is actually important to me :).

Step 1: A Basic Box

The first step in anything software related seems to be hello world. So, to create a Vagrant instance, we create a Vagrantfile. Clever, right? And even better, your Vagrantfile is just Ruby code -- like a Rakefile. The simplest possible Vagrantfile for what we're doing:

box      = 'phusion/ubuntu-14.04-amd64'
version  = 2

Vagrant.configure(version) do |config|
    config.vm.box = box
end

Let's go through this. As I mentioned above, our base box is that Ubuntu distro. You can just as easily choose CentOS, SUSE, CoreOS, or any number of other images. People even have entire dev stacks as one image! The version identifier is just signalling to Vagrant which configuration API to use. I've personally never seen anything except 2, but given the concept of versioned APIs in the REST world, it's not difficult to see how they plan to use it in the future.
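Since the Vagrantfile is plain Ruby, swapping the base box does not have to mean editing the file. Here is a minimal sketch that reads the box name from an environment variable. The DEV_BOX variable and the box_name helper are hypothetical names invented for this example, not something Vagrant defines. The defined?(Vagrant) guard is only there so the file can also be run with plain ruby as a quick check.

```ruby
# Default box from the article; DEV_BOX is a hypothetical override variable.
DEFAULT_BOX = 'phusion/ubuntu-14.04-amd64'

# Returns the box to use: the DEV_BOX value if set, otherwise the default.
def box_name(env = ENV)
  env.fetch('DEV_BOX', DEFAULT_BOX)
end

# Inside Vagrant the Vagrant constant is always defined, so this block runs.
if defined?(Vagrant)
  Vagrant.configure(2) do |config|
    config.vm.box = box_name
  end
end
```

With this in place, something like DEV_BOX=some/other-box vagrant up would build from a different image, where some/other-box is a placeholder for whichever box you prefer.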

So, to run this, we just type vagrant up:

[10:50:48 /ws/dev9/vagrant-dev-env/step1]$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'phusion/ubuntu-14.04-amd64'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'phusion/ubuntu-14.04-amd64' is up to date...
==> default: Setting the name of the VM: step1_default_1409766665528_9289
==> default: Clearing any previously set forwarded ports...
==> default: Fixed port collision for 22 => 2222. Now on port 2200.
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 => 2200 (adapter 1)
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2200
    default: SSH username: vagrant
    default: SSH auth method: private key
    default: Warning: Connection timeout. Retrying...
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Mounting shared folders...
    default: /vagrant => /ws/dev9/vagrant-dev-env/step1

[10:51:23 /ws/dev9/vagrant-dev-env/step1]$

Notice that this took all of about 35 seconds. Most of the output is rather self-explanatory. So, this box is "up" -- how do we use it?

[10:51:23 /ws/dev9/vagrant-dev-env/step1]$ vagrant ssh
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
Last login: Tue Apr 22 19:47:09 2014 from 10.0.2.2
vagrant@ubuntu-14:~$

That's it. There's your Ubuntu VM! Let's say we want to take it down, delete it, and bring it back up:

vagrant@ubuntu-14:~$ exit
Connection to 127.0.0.1 closed.

[10:55:23 /ws/dev9/vagrant-dev-env/step1]$ vagrant destroy -f
==> default: Forcing shutdown of VM...
==> default: Destroying VM and associated drives...

[10:55:30 /ws/dev9/vagrant-dev-env/step1]$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'phusion/ubuntu-14.04-amd64'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'phusion/ubuntu-14.04-amd64' is up to date...
==> default: Setting the name of the VM: step1_default_1409766945197_31521
==> default: Clearing any previously set forwarded ports...
==> default: Fixed port collision for 22 => 2222. Now on port 2200.
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 => 2200 (adapter 1)
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2200
    default: SSH username: vagrant
    default: SSH auth method: private key
    default: Warning: Connection timeout. Retrying...
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Mounting shared folders...
    default: /vagrant => /ws/dev9/vagrant-dev-env/step1

[10:56:02 /ws/dev9/vagrant-dev-env/step1]$

So under a minute to destroy a VM and bring up an identical one. Not bad, Future. Not bad. A box like this is fine and dandy, but we probably want to do more with it.

Step 2: Basic Provisioning

Even at a base level, let's say we want Java. So, let's tweak our Vagrantfile a bit:

box      = 'phusion/ubuntu-14.04-amd64'
version  = 2

Vagrant.configure(version) do |config|
    config.vm.box = box

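    # The shell provisioner runs as root by default, so no sudo is needed.
    # Refresh the package index first, then install the JDK.
    config.vm.provision :shell, :inline => "apt-get -qy update"
    config.vm.provision :shell, :inline => "apt-get -qy install openjdk-7-jdk"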
end

If you now run vagrant up, you'll get a machine with Java installed:

[11:27:33 /ws/dev9/vagrant-dev-env/step2](git:master+?)
$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'phusion/ubuntu-14.04-amd64'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'phusion/ubuntu-14.04-amd64' is up to date...
==> default: Setting the name of the VM: step2_default_1409768866354_7342
==> default: Clearing any previously set forwarded ports...
==> default: Fixed port collision for 22 => 2222. Now on port 2201.
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 => 2201 (adapter 1)
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2201
    default: SSH username: vagrant
    default: SSH auth method: private key
    default: Warning: Connection timeout. Retrying...
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Mounting shared folders...
    default: /vagrant => /ws/dev9/vagrant-dev-env/step2
==> default: Running provisioner: shell...
    default: Running: inline script

[ clipping a bunch of useless stuff -- you know how it is. ]

==> default: 1 upgraded, 182 newly installed, 0 to remove and 109 not upgraded.
==> default: Need to get 99.4 MB of archives.
==> default: After this operation, 281 MB of additional disk space will be used.
[ ... ]
==> default: done.
==> default: done.

[11:30:15 /ws/dev9/vagrant-dev-env/step2]$ vagrant ssh
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
Last login: Tue Apr 22 19:47:09 2014 from 10.0.2.2

vagrant@ubuntu-14:~$ java -version
java version "1.7.0_65"
OpenJDK Runtime Environment (IcedTea 2.5.1) (7u65-2.5.1-4ubuntu1~0.14.04.2)
OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)

vagrant@ubuntu-14:~$

And there we go. A scripted buildout of a base Ubuntu box with Java. Of course, shell scripts can and do go wrong. They get progressively more complex, especially as you start having components that mix and match. Additionally, since all developers should be getting familiar with Continuous Delivery concepts, let's take this opportunity to explore a little tool called Puppet.
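To make the "mix and match" problem concrete, here is a sketch of the Step 2 Vagrantfile with the package list pulled into a Ruby array. The PACKAGES constant, the install_script helper, and the extra git and maven entries are illustrative assumptions, not something taken from the repo. The defined?(Vagrant) guard just lets the file run under plain ruby for a quick check.

```ruby
# Packages to install; git and maven are added here only as examples.
PACKAGES = %w[openjdk-7-jdk git maven].freeze

# Builds one shell command: refresh the index, then install every package.
# The && means the install is skipped if the update fails.
def install_script(packages)
  "apt-get -qy update && apt-get -qy install #{packages.join(' ')}"
end

if defined?(Vagrant)
  Vagrant.configure(2) do |config|
    config.vm.box = 'phusion/ubuntu-14.04-amd64'
    config.vm.provision :shell, :inline => install_script(PACKAGES)
  end
end
```

This is tidier, but it is still a list of commands. Nothing in it says what the machine should look like when the script is finished, and that is the gap a tool like Puppet fills.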

Step 3: Buildout with Puppet

Puppet is pretty awesome -- and so are Chef and Ansible. I chose Puppet initially because I could get it working quicker. I'm not making a value judgement on which one works best.

The idea with Puppet is that you use the puppet files to describe the state you want the machine to be in, and Puppet manages getting it there. Vagrant also has first-class support for Puppet. Remember above, how we're provisioning with inline shell scripts? Well, Vagrant also has a Puppet provisioner. If you've never used Puppet before, that's OK, the examples should give you a basic overview of its usage.

To set up a basic Puppet provisioner, let's do something like this in our Vagrantfile:

box      = 'phusion/ubuntu-14.04-amd64'

Vagrant.configure(2) do |config|
    config.vm.box = box

    # Now let puppet do its thing.
    config.vm.provision :puppet do |puppet|
      puppet.manifests_path = 'puppet/manifests'
      puppet.manifest_file = 'devenv.pp'
      puppet.module_path = 'puppet/modules'
      puppet.options = "--verbose"
    end
end

This also seems pretty straightforward. Again, don't worry too much if you don't know Puppet. Those paths are relative to the Vagrantfile, so your directory structure (initially) will look like this:

[12:43:47 /ws/dev9/vagrant-dev-env/step3]$ tree
.
├── Vagrantfile
└── puppet
    ├── manifests
    │   └── devenv.pp
    └── modules
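Because those paths are resolved relative to the Vagrantfile, a mistyped or missing directory is an easy mistake to make. As a rough sketch, a few lines of Ruby at the top of the Vagrantfile could warn about it early. The EXPECTED_PATHS constant and the missing_puppet_paths helper are hypothetical names for this example, and the check itself is an optional extra, not something Vagrant requires.

```ruby
# Paths the Puppet provisioner above expects, relative to the Vagrantfile.
EXPECTED_PATHS = [
  'puppet/manifests',
  'puppet/manifests/devenv.pp',
  'puppet/modules'
].freeze

# Returns the expected paths that do not exist under base_dir.
def missing_puppet_paths(base_dir)
  EXPECTED_PATHS.reject { |path| File.exist?(File.join(base_dir, path)) }
end

# Only runs inside Vagrant, where __FILE__ is the Vagrantfile itself.
if defined?(Vagrant)
  missing = missing_puppet_paths(File.dirname(__FILE__))
  warn "Missing Puppet paths: #{missing.join(', ')}" unless missing.empty?
end
```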

In the provisioner, we're giving it two directories and the name of a manifest file. The manifests directory is where Puppet will look for manifest files, and the modules directory (empty for now) is where it will look for modules. A manifest is a basic unit of execution in Puppet. A manifest is made up of one or more resource declarations -- the desired state of a resource. These resource declarations are the basic building blocks. So, to start, let's just get our previous example working in Puppet. Modify your devenv.pp to look like this:

group { 'puppet': ensure => 'present' }

exec { "apt-get update":
  command => "apt-get -yq update",
  path    => ["/bin","/sbin","/usr/bin","/usr/sbin"]
}

exec { "install java":
  command => "apt-get install -yq openjdk-7-jdk",
  require => Exec["apt-get update"],
  path    => ["/bin","/sbin","/usr/bin","/usr/sbin"]
}

This is pretty self-explanatory, with one caveat: the order of the declarations in the file doesn't matter. Puppet works out its own order from the dependencies between resources, so the steps will not necessarily be executed in the order you wrote them. This is why the require => attribute exists on the install java exec. We are telling Puppet to execute the apt-get update before this step. Notice also that it's a capital E in the require. A lowercase exec declares a resource, while a capitalized Exec["..."] is a reference to a resource that is declared elsewhere, so that's the required convention.
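If the idea of ordering by dependency instead of by position is new, a toy model may help. The sketch below uses Ruby's standard TSort module to order two resources by their require relationships. It is only an illustration of the concept, not how Puppet is implemented, and the ToyCatalog class and apply_order helper are names made up for this example.

```ruby
require 'tsort'

# Toy model of a catalog: each resource maps to the resources it requires.
class ToyCatalog
  include TSort

  def initialize(requires)
    @requires = requires
  end

  # TSort hook: visit every resource in the catalog.
  def tsort_each_node(&block)
    @requires.each_key(&block)
  end

  # TSort hook: visit the resources that `resource` requires.
  def tsort_each_child(resource, &block)
    @requires.fetch(resource, []).each(&block)
  end
end

# Returns the resources ordered so that every requirement comes first.
def apply_order(requires)
  ToyCatalog.new(requires).tsort
end

# Declared "backwards" on purpose: install java is listed first.
puts apply_order({
  'Exec[install java]'   => ['Exec[apt-get update]'],
  'Exec[apt-get update]' => []
})
# Prints:
#   Exec[apt-get update]
#   Exec[install java]
```

Even though install java is written first, the update comes out ahead of it, because the require relationship decides the order and the position in the file does not.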

So, let's bring this box up:

[12:56:35 /ws/dev9/vagrant-dev-env/step3]$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'phusion/ubuntu-14.04-amd64'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'phusion/ubuntu-14.04-amd64' is up to date...
==> default: Setting the name of the VM: step3_default_1409774249245_48069
==> default: Clearing any previously set forwarded ports...
==> default: Fixed port collision for 22 => 2222. Now on port 2202.
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 => 2202 (adapter 1)
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2202
    default: SSH username: vagrant
    default: SSH auth method: private key
    default: Warning: Connection timeout. Retrying...
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Mounting shared folders...
    default: /vagrant => /ws/dev9/vagrant-dev-env/step3
    default: /tmp/vagrant-puppet-3/manifests => /ws/dev9/vagrant-dev-env/step3/puppet/manifests
    default: /tmp/vagrant-puppet-3/modules-0 => /ws/dev9/vagrant-dev-env/step3/puppet/modules
==> default: Running provisioner: puppet...
==> default: Running Puppet with devenv.pp...
==> default: stdin: is not a tty
==> default: Notice: Compiled catalog for ubuntu-14.04-amd64-vbox in environment production in 0.07 seconds
==> default: Info: Applying configuration version '1409774267'
==> default: Notice: /Stage[main]/Main/Exec[apt-get update]/returns: executed successfully
==> default: Notice: /Stage[main]/Main/Exec[install java]/returns: executed successfully
==> default: Info: Creating state file /var/lib/puppet/state/state.yaml
==> default: Notice: Finished catalog run in 117.84 seconds

[12:59:48 /ws/dev9/vagrant-dev-env/step3](git:master+?)
$ vagrant ssh
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

Last login: Tue Apr 22 19:47:09 2014 from 10.0.2.2

vagrant@ubuntu-14:~$ java -version
java version "1.7.0_65"
OpenJDK Runtime Environment (IcedTea 2.5.1) (7u65-2.5.1-4ubuntu1~0.14.04.2)
OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
vagrant@ubuntu-14:~$

And now we have Puppet provisioning our system! The output is also much nicer, and you can get some hint of how Puppet works -- there are stages, it gives us return values, it saves a state file, and there is a concept of environments. Any wonder why Puppet is so popular in the DevOps world? When you hear DevOps folks talking about a VM as a unit of deployment, they're not kidding. It's just a file.

Of course, this is basically cheating. The Puppet way is to describe the state of a system, and this is not describing the state of the system, it's describing commands to run. While some of you may like that, there are different frameworks for that. This is a declarative, stateful framework, so let's not try to turn it into glorified shell scripting. So, we can change that up a bit...

Step 4: Actually Using Puppet

For this step, the Vagrantfile doesn't change. We're just changing the Puppet files. Check this out:

group { 'puppet': ensure => 'present' }

exec { "apt-get update":
  command => "apt-get -yq update",
  path    => ["/bin","/sbin","/usr/bin","/usr/sbin"]
}

package { "openjdk-7-jdk":
  ensure  => installed,
  require => Exec["apt-get update"],
}

Now we're declaring state. We're just telling Puppet to make sure openjdk-7-jdk is installed, and to run an apt-get update beforehand. The package resource only acts when the package is missing, and apt-get update is safe to repeat, so this whole definition is now idempotent. That means we can run it multiple times without issue!
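To see what idempotent means in practice, here is a toy model in Ruby. It treats the machine's state as nothing more than a set of installed package names. The ensure_installed helper is a made-up name, and this is a sketch of the idea, not of Puppet's actual package handling.

```ruby
require 'set'

# Toy model: the machine's state is just the set of installed package names.
# Returns the new state plus whether anything had to change.
def ensure_installed(installed, package)
  return [installed, :unchanged] if installed.include?(package)

  [installed | [package], :changed]
end

state = Set.new
state, first_run  = ensure_installed(state, 'openjdk-7-jdk')
state, second_run = ensure_installed(state, 'openjdk-7-jdk')

puts first_run   # prints changed
puts second_run  # prints unchanged
```

The second run finds the package already present and does nothing. A bare apt-get install command, by contrast, just runs again whether it is needed or not.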

Let's bring the box up:

[13:36:30 /ws/dev9/vagrant-dev-env/step4](git:master+!?)
$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'phusion/ubuntu-14.04-amd64'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'phusion/ubuntu-14.04-amd64' is up to date...
==> default: Setting the name of the VM: step4_default_1409776604916_69804
==> default: Clearing any previously set forwarded ports...
==> default: Fixed port collision for 22 => 2222. Now on port 2202.
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 => 2202 (adapter 1)
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2202
    default: SSH username: vagrant
    default: SSH auth method: private key
    default: Warning: Connection timeout. Retrying...
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Mounting shared folders...
    default: /vagrant => /ws/dev9/vagrant-dev-env/step4
    default: /tmp/vagrant-puppet-3/manifests => /ws/dev9/vagrant-dev-env/step4/puppet/manifests
    default: /tmp/vagrant-puppet-3/modules-0 => /ws/dev9/vagrant-dev-env/step4/puppet/modules
==> default: Running provisioner: puppet...
==> default: Running Puppet with devenv.pp...
==> default: stdin: is not a tty
==> default: Notice: Compiled catalog for ubuntu-14.04-amd64-vbox in environment production in 0.17 seconds
==> default: Info: Applying configuration version '1409776705'
==> default: Notice: /Stage[main]/Main/Exec[apt-get update]/returns: executed successfully
==> default: Notice: /Stage[main]/Main/Package[openjdk-7-jdk]/ensure: ensure changed 'purged' to 'present'
==> default: Info: Creating state file /var/lib/puppet/state/state.yaml
==> default: Notice: Finished catalog run in 134.04 seconds

There we go! We've declared the state of our machine, and Puppet does its magic. Of course, Puppet can do a whole lot more -- file templating, adding and removing users, setting up configuration, making sure some packages are NOT present, etc. This is YOUR machine -- install git, maven, oh-my-zsh, etc.

Also, keep in mind that Puppet is a really in-demand skill. You might find yourself with a valuable new tool.