Creating a Reproducible and Portable Development Environment

Friday Mar 27th 2015

Discover a solution to overcome issues like nested virtualization and the need for bare metal machines. Learn to create Vagrant .box files for AWS.

By Michael Sverdlik at GigaSpaces

The Pain Point

Developers often need easily reproducible development environments for anything from testing to troubleshooting, and even for continued development across teams. Many technologies have arisen to answer this need, from Vagrant and VirtualBox to Docker in certain contexts. However, with the rise of the cloud, where many companies choose to do dev, test, and QA work on on-demand resources, this type of virtual development environment comes with downsides too, notably nested virtualization.

In the context of our R&D and Ops work on Cloudify, an open source cloud orchestration tool written in Python with a TOSCA-based YAML DSL, we often need to create reproducible and portable development environments on the cloud, and had to find a way to overcome these issues. We wanted the most seamless process possible, one that would also be easy to replicate per environment. So what better way than to start with the most popular cloud: AWS? This article dives into one such scenario, creating Vagrant .box files for AWS, and demonstrates how to overcome issues like nested virtualization and the need for bare metal machines, which are costly and time-consuming to provision. You'll get a step-by-step tutorial on performing a v2v (virtual to virtual) conversion and creating a VMDK disk image that can then be uploaded to any AWS environment.

The Demo

For the quick trial of Cloudify, we provide a Vagrantfile and Vagrant box with Cloudify's Manager pre-installed on a VirtualBox image. By utilizing Vagrant and VirtualBox, we are able to provide our customers with a reproducible demo environment to evaluate Cloudify locally, on their personal computers.

We could have just provided OVF & VMDK files and had users import them into VirtualBox; however, the point was to make the evaluation as simple as possible, and Vagrant strips away potential issues one might encounter when dealing directly with VirtualBox. So, instead of providing a detailed explanation of how to correctly set up a VirtualBox VM, we can summarize our quick start guide in two steps:

  1. Download the Vagrantfile.
  2. Run 'vagrant up'.

Utilizing Packer to Create Vagrant Boxes

Creating a Vagrant box is a very straightforward matter. You can create one by using Vagrant itself or one of the many utilities available for performing this task.

Packer, one such option, is written by the same guys who wrote Vagrant and is a natural choice for this task. Packer works somewhat similarly to Vagrant; however, its focus is on producing images at the end of the process, and not running an environment, as Vagrant does. For VirtualBox images, Packer needs to create a virtual machine on VirtualBox, provision it, and export it at the end of the process as a box file.

Our objective seemed very simple to achieve, until the moment we realized that we couldn't run VirtualBox on our build machines.

The Nested Virtualization Problem

Here at GigaSpaces, most of our infrastructure is virtualized. Although this is great and allows us to better utilize our hardware, it does have some unavoidable "side effects". One of those side effects is the inability to run VirtualBox inside another VM because our hypervisor of choice doesn't allow nested virtualization.

Nested virtualization is a feature in virtualization solutions that allows you to run hypervisors inside a hypervisor. Essentially, it allows you to run a VM inside another VM. After poking around the web, it looks like none of the popular IaaS providers supports nested virtualization either.
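
One rough way to see the problem from inside a Linux guest: on x86, the CPU flags 'vmx' (Intel VT-x) and 'svm' (AMD-V) show up in /proc/cpuinfo only when hardware virtualization is exposed to the machine. The sketch below is an illustration only; the helper name 'hardware_virtualization_exposed' is hypothetical and is not part of our build script.

```python
def hardware_virtualization_exposed(cpuinfo_text):
    """Return True if any 'flags' line lists vmx (Intel) or svm (AMD)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(':')
        if key.strip() == 'flags':
            # Each flag is a whitespace-separated token on the line.
            if set(value.split()) & {'vmx', 'svm'}:
                return True
    return False
```

On a build machine, the text would typically come from reading /proc/cpuinfo; a False result suggests that a hypervisor such as VirtualBox will not get hardware virtualization there.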

Ideas for Solutions and New Problems

Possible ideas for a solution we had in mind:

    1. Use a bare metal box and avoid the problem:

      Even though this is the most obvious solution, it is also the least desirable from our point of view. We don't have any infrastructure in place to support bare metal provisioning, and a dedicated bare metal server for a once-a-day image build is a phenomenal waste. Sharing a build machine with other builds is something we're not even going to consider, because eventually things break due to collisions between builds and less-than-perfect cleanup between one build and the next.

    2. Use specialized solutions that allow nested virtualization:
      • Using a hypervisor that supports nested virtualization (VMware Workstation, for example)
      • Using solutions from service providers such as Ravello (which piggybacks on AWS)

      These solutions incur additional cost (licensing or usage fees) and require us to adapt to unfamiliar APIs that could potentially break our current tool chain (Packer/Vagrant plugins for Ravello, anyone?).

    3. Create a disk image without starting a VM in VirtualBox:

      We can provision a virtual machine, take its disk image, and convert it to VMDK. Sounds possible in theory; in practice, we had never done this before.

We decided to spend some time on researching alternative number three.

There are several ways to convert physical to virtual (p2v) or virtual to physical (v2p). In our case, what about doing v2v?

AWS allows you to export machines only if you previously imported them yourself. As an alternative, we can provision a machine, make an image out of its hard drive, and convert it to a VMDK image. Then, all that is left is to add an OVF descriptor, bundle everything with Vagrant's metadata, and tar it into a .box file.

Tools of the Craft

  • Python as the scripting language, a natural choice because we are a Python shop when it comes to Cloudify, with:
    • Fabric as the task executor over SSH
    • Boto as the API for AWS
  • Packer for Cloudify Manager provisioning: Packer is not a must in our case, because we could script what it does ourselves with Boto and Fabric. We'd rather use Packer because it can replicate the image creation process on other providers.
  • AWS as the IaaS provider: we are comfortable with its API, our tools support it, and we can use it from virtually anywhere without needing access to the office.

The Plan

Here is our step by step plan:

  1. Create a source image (AMI) with Cloudify pre-installed by using Packer.
  2. Launch a worker instance in AWS with the snapshot or source image as one of its disks.
  3. On the worker instance: Create a raw image volume as a file and create an ext4 partition on it.
  4. Copy over the data from the source image disk to the previously created ext4 partition.
  5. Install the bootloader (extlinux) on the ext4 partition.
  6. Convert the raw image into a VMDK.
  7. Bundle the VMDK using an OVF descriptor and Vagrant metadata and create a tar file with the content and .box extension.
  8. Upload to S3.
  9. Clean up.

Here's a brief sequence diagram of the planned flow.

Figure 1: The planned flow

Step 1: Creating the Source Image

Following is a small Packer config snippet we're going to use for this task:

{
   "variables": {
      "aws_access_key": "{{env `AWS_ACCESS_KEY_ID`}}",
      "aws_secret_key": "{{env `AWS_ACCESS_KEY`}}",
      "aws_source_ami": "",
      "instance_type": "m3.large",
      "insecure_private_key": "./keys/insecure_private_key"
   },
   "builders": [
      {
         "name": "nightly_virtualbox_build",
         "type": "amazon-ebs",
         "access_key": "{{user `aws_access_key`}}",
         "secret_key": "{{user `aws_secret_key`}}",
            "{{user `insecure_private_key`}}",
         "region": "eu-west-1",
         "source_ami": "{{user `aws_source_ami`}}",
         "instance_type": "{{user `instance_type`}}",
         "ssh_username": "vagrant",
         "user_data_file": "userdata/",
         "ami_name": "cloudify nightly {{timestamp}}"
      }
   ],
   "provisioners": [
      {
         "type": "shell",
         "script": "provision/"
      },
      {
         "type": "shell",
         "script": "provision/",
         "only": ["nightly_virtualbox_build"]
      },
      {
         "type": "shell",
         "script": "provision/",
         "only": ["nightly_virtualbox_build"]
      }
   ]
}

Because this article is not about Packer, I won't walk through the template in detail.

The general idea is that Packer launches an instance in AWS using the provided 'access_key' and 'secret_key'. It launches the instance with a user data file that creates the 'vagrant' user that Vagrant needs. Once the instance is up and running, Packer runs three provisioning scripts: one installs Cloudify, one makes the adjustments needed to run this image on VirtualBox later instead of EC2, and the last one cleans up to save some disk space.

Packer will be launched from our wrapper script and its output will be parsed until an AMI ID is found.
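
As a minimal sketch of that parsing step, assume the wrapper reads Packer's output line by line and that an AMI ID looks like 'ami-' followed by hexadecimal characters. The helper name 'find_ami_id' and the pattern are assumptions made for illustration, not code taken from our script.

```python
import re

# Assumption: an AMI ID is "ami-" followed by at least 8 hex characters.
AMI_ID_PATTERN = re.compile(r'\bami-[0-9a-f]{8,}\b')


def find_ami_id(lines):
    """Return the first AMI ID found in an iterable of lines, or None."""
    for line in lines:
        match = AMI_ID_PATTERN.search(line)
        if match:
            # Stop at the first hit, mirroring "parsed until an AMI ID is found".
            return match.group(0)
    return None
```

Stopping at the first match only works if nothing earlier in the output prints a different AMI ID (the source AMI, for example); if it does, the pattern or the stopping rule would need tightening.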

Step 2: Launch Worker Instance

Launching the worker instance is a straightforward task with Python and Boto:

import os
import boto.ec2
from boto.ec2 import blockdevicemapping as bdm

# Open connection
access_key = os.environ.get('AWS_ACCESS_KEY_ID')
secret_key = os.environ.get('AWS_ACCESS_KEY')
conn = boto.ec2.connect_to_region(settings['region'],

# Run Packer and get source AMI ID
baked_ami_id = run_packer()
baked_ami = conn.get_image(baked_ami_id)

# Get snapshot id
baked_snap =

# Create mapping for factory machine
mapping = bdm.BlockDeviceMapping()
mapping['/dev/sda1'] = bdm.BlockDeviceType(size=10,
mapping['/dev/sdf'] =

# Create temp key pair
kp_name = random_generator()
kp = conn.create_key_pair(kp_name)

# Create temp security group
sg_name = random_generator()
sg = conn.create_security_group(sg_name,
   'vagrant nightly')

# Run worker instance
reserv = conn.run_instances(image_id=settings['factory_ami'],

factory_instance = reserv.instances[0]
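
The snippet above calls 'random_generator()' to name the temporary key pair and security group. A plausible sketch of such a helper follows, assuming it only needs to return a short random string; the default length and alphabet are assumptions.

```python
import random
import string


def random_generator(size=8, chars=string.ascii_lowercase + string.digits):
    """Return a random string of `size` characters drawn from `chars`."""
    return ''.join(random.choice(chars) for _ in range(size))
```

Random names keep concurrent or repeated builds from colliding on the same key pair or security group name.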

Steps 3-6: Creating the VMDK Image

We'll use Fabric to execute commands over SSH on the worker instance.

First, we'll set up the environment (private key, timeouts, and connection attempts) and wait for the worker instance to enter a 'running' state:

env.key_filename = os.path.join(gettempdir(),
env.timeout = 10
env.connection_attempts = 12

while factory_instance.state != 'running':
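
A polling loop of this kind can be sketched as follows, assuming an object with a 'state' attribute and an 'update()' method that refreshes it, as Boto's instance objects provide. The helper name 'wait_for_state', the polling interval, and the cap on the number of checks are assumptions.

```python
import time


def wait_for_state(instance, wanted='running', interval=5, max_checks=60):
    """Poll instance.update() until instance.state equals `wanted`.

    Returns True once the state matches, or False after max_checks polls.
    """
    for _ in range(max_checks):
        if instance.state == wanted:
            return True
        time.sleep(interval)
        instance.update()  # refresh the cached state from the API
    return instance.state == wanted
```

Capping the number of checks keeps a build from hanging forever when an instance never reaches the expected state.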

Next, we'll use Fabric's 'execute' to launch remote commands on the worker instance:

execute(do_work, host='{}@{}'.format(settings['username'],

'do_work' is the heart of Steps 3 to 7. It's essentially a shell script that's being executed with Fabric's 'run()' and 'sudo()':

# Install needed utilities
sudo('apt-get update')
sudo('apt-get install -y virtualbox kpartx '
   'extlinux qemu-utils python-pip')
sudo('pip install awscli')

# Create mount point and mount source image
sudo('mkdir -p /mnt/image')
sudo('mount /dev/xvdf1 /mnt/image')

# Create file image, mount it and create FS
run('dd if=/dev/zero of=image.raw bs=1M count=5120')
sudo('losetup --find --show image.raw')
sudo('parted -s -a optimal /dev/loop0 mklabel msdos'
   ' -- mkpart primary ext4 1 -1')
sudo('parted -s /dev/loop0 set 1 boot on')
sudo('kpartx -av /dev/loop0')
sudo('mkfs.ext4 /dev/mapper/loop0p1')
sudo('mkdir -p /mnt/raw')
sudo('mount /dev/mapper/loop0p1 /mnt/raw')

# Copy over data from source image to new volume
sudo('cp -a /mnt/image/* /mnt/raw')

# Install bootloader (extlinux)
sudo('extlinux --install /mnt/raw/boot')
sudo('dd if=/usr/lib/syslinux/mbr.bin
      conv=notrunc bs=440 count=1 '
sudo('echo -e "DEFAULT cloudify\n'
   'LABEL cloudify\n'
   'LINUX /vmlinuz\n'
   'APPEND root=/dev/disk/by-uuid/'
   '`sudo blkid -s UUID -o value '
   '/dev/mapper/loop0p1` ro\n'
   'INITRD  /initrd.img" | sudo -s tee

# Unmount
sudo('umount /mnt/raw')
sudo('kpartx -d /dev/loop0')
sudo('losetup --detach /dev/loop0')
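
The bootloader configuration echoed in the extlinux step is easier to read when rendered in Python first. In this sketch, 'render_extlinux_conf' is a hypothetical helper, and the UUID would come from running 'blkid' on the new partition on the worker.

```python
def render_extlinux_conf(root_uuid, label='cloudify'):
    """Build the extlinux configuration text for a root filesystem UUID."""
    lines = [
        'DEFAULT {0}'.format(label),
        'LABEL {0}'.format(label),
        'LINUX /vmlinuz',
        # Referring to the root by UUID avoids depending on device names.
        'APPEND root=/dev/disk/by-uuid/{0} ro'.format(root_uuid),
        'INITRD /initrd.img',
    ]
    return '\n'.join(lines) + '\n'
```

Referring to the root filesystem by UUID matters here because the device name seen on the worker's loop device will not match the one VirtualBox presents at boot.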

Finally, we want to convert the raw image to VMDK format. This is possible by using the qemu-img utility:

# Convert to VMDK
run('qemu-img convert -f raw -O vmdk '
   'image.raw image.vmdk')

Step 7: Creating an OVF Descriptor and Bundling It into a .box File

Perhaps you noticed that in the beginning of 'do_work()' we installed 'virtualbox'. This is done so we could easily create an OVF descriptor instead of manually building the XML.

We're going to create a VM (without starting it), attach our VMDK file to it, set a few machine settings (such as CPU and memory size), and export it:

run('mkdir output')
# Create VM
run('VBoxManage createvm --name cloudify '
   '--ostype Ubuntu_64 --register')
# Create storage controller
run('VBoxManage storagectl cloudify '
   '--name SATA '
   '--add sata '
   '--sataportcount 1 '
   '--hostiocache on '
   '--bootable on')
# Attach volume to storage controller
run('VBoxManage storageattach cloudify '
   '--storagectl SATA '
   '--port 0 '
   '--type hdd '
   '--medium image.vmdk')
# Modify VM parameters
run('VBoxManage modifyvm cloudify '
   '--memory 2048 '
   '--cpus 2 '
   '--vram 12 '
   '--ioapic on '
   '--rtcuseutc on '
   '--pae off '
   '--boot1 disk '
   '--boot2 none '
   '--boot3 none '
   '--boot4 none ')
# Export VM
run('VBoxManage export cloudify '
   '--output output/box.ovf')

Now that we have OVF & VMDK files, all that's left is to create the initial Vagrantfile and Vagrant metadata file that will be packaged in the archive and tar everything:

run('echo " do |config|" > output/Vagrantfile')
run('echo " config.vm.base_mac = `VBoxManage showvminfo cloudify '
   '--machinereadable | grep macaddress1 | cut -d"=" -f2`"'
   ' >> output/Vagrantfile')
run('echo -e "end\n\n" >> output/Vagrantfile')
run('echo \'include_vagrantfile = File.expand_path'
   '("../include/_Vagrantfile", __FILE__)\' >> output/Vagrantfile')
run('echo "load include_vagrantfile if File.exist?'
   '(include_vagrantfile)" >> output/Vagrantfile')
run('echo \'{ "provider": "virtualbox" }\' > output/metadata.json')
run('tar -cvf -C output/ .')
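
A .box file is simply a tar archive holding the OVF, the VMDK, the Vagrantfile, and metadata.json at its top level. The sketch below does the same bundling locally with Python's standard library; the function name 'build_box' and its arguments are hypothetical, and our script does this step with shell commands on the worker instead.

```python
import json
import os
import tarfile


def build_box(source_dir, box_path, provider='virtualbox'):
    """Write metadata.json into source_dir, then tar its files into box_path."""
    with open(os.path.join(source_dir, 'metadata.json'), 'w') as handle:
        json.dump({'provider': provider}, handle)
    with tarfile.open(box_path, 'w') as archive:
        for name in sorted(os.listdir(source_dir)):
            # arcname keeps each file at the top level of the archive
            # instead of nesting it under source_dir.
            archive.add(os.path.join(source_dir, name), arcname=name)
    return box_path
```

Note that 'box_path' should point outside 'source_dir'; otherwise the archive would try to include itself.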

Step 8: Uploading to S3

Uploading to S3 is done with the AWS CLI:

run('aws s3 cp '
   ' s3://{}/{}.box'.format(settings['aws_s3_bucket'],

Because we used an IAM role to initiate the worker instance, we do not provide any security credentials here. The IAM role provides these for us.

Step 9: Cleanup

Throughout the process we created different AWS resources. To clean them up when we're done (whether the run was successful or not), we append each resource to the 'RESOURCES' list as it is created. When we finish, we run the 'cleanup()' function, which tries to destroy each and every resource.

def main():
   # ...
   conn = boto.ec2.connect_to_region(settings['region'],
   # ...

def cleanup():
   for item in RESOURCES:
      if type(item) == boto.ec2.image.Image:
      elif type(item) == boto.ec2.instance.Instance:
         while item.state != 'terminated':
      elif type(item) == boto.ec2.connection.EC2Connection:
      elif (type(item) == boto.ec2.securitygroup.SecurityGroup or
            type(item) == boto.ec2.keypair.KeyPair):
         print('{} not cleared'.format(item))

Note: This is a very naive approach; it completes successfully only if the items in the list are in the right order.
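
One less order-sensitive alternative, sketched here as an illustration and not as what our script does, is to register a teardown callback at the moment each resource is created and run the callbacks in reverse order. That way an instance is terminated before the security group and key pair it depends on are deleted. The class name 'CleanupStack' and its methods are hypothetical.

```python
class CleanupStack(object):
    """Run registered cleanup callbacks in reverse order of registration."""

    def __init__(self):
        self._callbacks = []

    def push(self, description, callback):
        """Register a zero-argument callback that destroys one resource."""
        self._callbacks.append((description, callback))

    def run(self):
        """Call every callback, newest first; return descriptions that failed."""
        failed = []
        while self._callbacks:
            description, callback = self._callbacks.pop()
            try:
                callback()
            except Exception:
                # Keep going so one failure does not leave other resources behind.
                failed.append(description)
        return failed
```

Each creation step would then be followed by a 'push()' with the matching delete or terminate call, and 'run()' would be invoked in a 'finally' block so it executes whether or not the build succeeded.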

Final Result

We'll let the script speak for itself. You can find the full script in our GitHub repo and give it a test drive.

About the Author

Michael Sverdlik is a Senior Human Swiss Army Knife at GigaSpaces working on Cloudify. When he is not geeking around on his laptop or PS4, he likes to travel and see the world. More of a cat person. Hi, mom!
