Dmitry Leskov
 

Massively Cloning A VirtualBox VM – The Smart Way

This is the second post of a two-part series. The first part covered the creation of a VirtualBox VM with a baseline Ubuntu Server installation for the purpose of quickly setting up a series of identical VMs, e.g. to compare the performance and memory footprint characteristics of different Web stacks. Contributing to the speed of VM setup is smart cloning, which is the subject of this second post.

The Smart Way

Note that this process is optimized for the creation of multiple, nearly identical VM clones, two or more of which may then run side-by-side on a single host. If you need to clone a VM just once, or perhaps create a series of restore points without relying on VirtualBox snapshots, the official way may be the way to go, though you may still wish to clone the virtual hard disk image separately.

File names/locations and commands are valid as of Ubuntu 10.04, but may be different on other guest operating systems.

To prepare a VM for cloning, do the following (the non-obvious steps are explained later in this post):

  1. Remove persistent network interface mappings
  2. (optional) Clean up free disk space
  3. Shut down the VM
  4. Detach the hard disk image from the VM
  5. Compress the detached image
  6. (optional) Remove any private shared folders that get mounted automatically
  7. (optional) Place the cloning instructions in the VM description for future reference
  8. Export the VM
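
The host-side part of this preparation (steps 4, 5 and 8) could be scripted roughly as follows. This is only a sketch under several assumptions: a POSIX shell on the host, the 7z command-line tool, VBoxManage syntax as of VirtualBox 4.0 or later, a VM that is already powered off, and a disk attached to port 0, device 0 of a controller whose name you pass in. The function name and every VM, controller and file name are hypothetical.

```shell
# Commands are kept in variables so they can be replaced with stubs
# (e.g. VBOXMANAGE='echo VBoxManage') for a dry run. Relies on sh/bash word splitting.
VBOXMANAGE=${VBOXMANAGE:-VBoxManage}
SEVENZIP=${SEVENZIP:-7z}

# prepare_master VM CONTROLLER IMAGE ARCHIVE OVF
# Assumes the VM is powered off and its disk sits on port 0, device 0.
prepare_master() {
    vm=$1
    ctl=$2
    image=$3
    archive=$4
    ovf=$5

    # Step 4: detach the hard disk image from the VM.
    $VBOXMANAGE storageattach "$vm" --storagectl "$ctl" \
        --port 0 --device 0 --medium none || return 1
    # Step 5: compress the detached image.
    $SEVENZIP a "$archive" "$image" || return 1
    # Step 8: export the (now diskless) VM to an OVF file.
    $VBOXMANAGE export "$vm" --output "$ovf"
}
```

A call might look like prepare_master Baseline "SATA Controller" Baseline.vdi Baseline.7z Baseline.ovf, where the controller name must match whatever the VM settings show.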

You will end up with an OVF file defining the VM and an archive containing its virtual hard disk image. To create a clone of such a pair, do the following:

  1. Initiate extraction of the virtual hard disk image from the archive
  2. In the meantime, import the VM from the respective OVF file, renaming it as needed
  3. Generate a new MAC address for each network adapter
  4. (optional) Define the automatically mounted private shared folders, if any
  5. Wait until virtual hard disk image extraction completes, change its UUID, and attach to the new VM
  6. Start the new VM
  7. (optional) Change any static IP addresses that the original VM had
  8. Change the host name
  9. Verify that the network is working; if not, edit the network interface mappings
  10. PROFIT!!!
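
Steps 1 through 6 of the cloning list lend themselves to a small host-side script. The sketch below assumes a POSIX shell, the 7z command-line tool and VBoxManage syntax as of VirtualBox 4.0 or later; it handles only the first network adapter and a disk on port 0, device 0. The function name, the per-clone directory layout and all names passed to it are hypothetical, not part of the original procedure.

```shell
# Commands are kept in variables so they can be replaced with stubs
# (e.g. VBOXMANAGE='echo VBoxManage') for a dry run. Relies on sh/bash word splitting.
VBOXMANAGE=${VBOXMANAGE:-VBoxManage}
SEVENZIP=${SEVENZIP:-7z}

# clone_vm OVF ARCHIVE IMAGE NEWNAME CONTROLLER
# The image is extracted into a directory named after the new VM.
clone_vm() {
    ovf=$1
    archive=$2
    image=$3
    name=$4
    ctl=$5

    # Step 1: start extracting the disk image in the background.
    $SEVENZIP x -y "$archive" "-o$name" &
    extract_pid=$!
    # Step 2: meanwhile, import the VM under its new name.
    $VBOXMANAGE import "$ovf" --vsys 0 --vmname "$name" || return 1
    # Step 3: new MAC address (first adapter only in this sketch).
    $VBOXMANAGE modifyvm "$name" --macaddress1 auto || return 1
    # Step 5: wait for the extraction, change the UUID, attach the image.
    wait "$extract_pid" || return 1
    $VBOXMANAGE internalcommands sethduuid "$name/$image" || return 1
    $VBOXMANAGE storageattach "$name" --storagectl "$ctl" \
        --port 0 --device 0 --type hdd --medium "$name/$image" || return 1
    # Step 6: start the new VM.
    $VBOXMANAGE startvm "$name"
}
```

Running the extraction in the background is what lets the import and the MAC address change happen "in the meantime"; the remaining steps (IP addresses, host name, network check) take place inside the guest.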

You may now skip to the explanations, or read about

The Official Way

VirtualBox has supported VM cloning via appliance export/import since version 2.2.

To create a "master" appliance containing one or more VMs that you want to clone, select File/Export Appliance (or press Ctrl-E), choose the VM(s) you want to export, optionally add/edit their meta-information, and select the desired name and location of the OVF (Open Virtualization Format) file that will contain VM descriptors. Replicas of all virtual hard disks attached to the exported VMs will be placed alongside that file, so make sure there is enough free space at the target location.

To create a clone of a previously exported VM, simply import the master appliance on the same host: select File/Import Appliance (or press Ctrl-I), choose the master OVF file containing that VM, and, optionally, choose a new name for each VM.

The Smart Way Explained

Virtual Hard Disk Image Cloning

One problem with the official method above is that a cloned disk image always has type VMDK and is set to dynamically expand, even if the original disk image was a fixed-size one. You also cannot rename it during import, so if the original disk is named "Baseline.vdi", the imported disk will be named "Baseline.vmdk", regardless of whether you rename the VM during import.

To circumvent this limitation, you may temporarily detach virtual hard disk images before exporting the respective VMs and clone them separately using the VirtualBox command-line interface, VBoxManage:

VBoxManage clonehd original-image clone-image [options]

Notes:

  1. Important: original-image may be either a UUID of a registered disk image or a name of an actual disk image file, whether registered or not. In the latter case, you must either specify the full pathname of original-image, or unregister it before cloning, otherwise you will get a misleading error message.
  2. If you do not provide any pathname to clone-image, it will be created in the default location, not the current directory.
  3. By default, clone-image will be set to expand dynamically. The option --variant Fixed forces fixed-size.
  4. If you want the clone image to be in a different format, use the --format option.
  5. The --remember option registers the clone with VirtualBox, so it will appear in the Virtual Media Manager.

For example, my virtual hard disk images are stored in V:\HardDisks, so a clone command might look like this:

cd /d V:\HardDisks
VBoxManage clonehd V:\HardDisks\BaseLine.VDI Clone.VDI --variant Fixed --remember

If the original and clone disk images are of the same format and variant, they should be identical except for their UUIDs. However, it takes VBoxManage over seven minutes to clone a 5GB image on my system, which is a bit too slow even for "green" hard drives. This led me to discover an undocumented VBoxManage feature: you may simply copy original-image to clone-image and run

VBoxManage internalcommands sethduuid clone-image

to assign a new random UUID to the clone-image.
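
Put together, the copy-then-change-UUID trick might be wrapped in a small helper like the one below. It is a sketch for a POSIX shell on the host (on a Windows host, copy would take the place of cp); the function name and the overwrite check are illustrative additions, while the sethduuid command is the one shown above.

```shell
# VBOXMANAGE can be replaced with a stub (e.g. 'echo VBoxManage') for a dry run.
# Relies on sh/bash word splitting.
VBOXMANAGE=${VBOXMANAGE:-VBoxManage}

# fast_clone ORIGINAL-IMAGE CLONE-IMAGE
fast_clone() {
    src=$1
    dst=$2

    # Refuse to overwrite an existing image.
    if [ -e "$dst" ]; then
        echo "fast_clone: $dst already exists" >&2
        return 1
    fi
    # A plain file copy yields a byte-identical image with the same UUID...
    cp "$src" "$dst" || return 1
    # ...so assign a new random UUID to the copy.
    $VBOXMANAGE internalcommands sethduuid "$dst"
}
```

For instance, fast_clone BaseLine.VDI Clone.VDI would leave Clone.VDI ready to be attached to a new VM.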

Copying the same 5GB image takes three minutes on my system, and the UUID then changes in an instant, so this is already an improvement over VBoxManage clonehd. Taking it one step further, I compress the original image. That 5GB virtual disk contained just the baseline Ubuntu Server installation and thus was only 24% full, so 7-Zip managed to compress it to just 318MB. (For reference, on the same image bzip2 --best yielded 378MB and gzip -c9 yielded 415MB, so I stuck with 7-Zip.)

Extraction of the image then took just 80 seconds, a 5x improvement over VBoxManage clonehd!

Disk Image Cleanup

Updated 07-Sep-2012:

You may wish to clean up the free space on your virtual hard disk prior to cloning, so as to facilitate better compression. It does not make much difference for a disk image that only contains a freshly installed system, but if it has been in use for a while, or you have installed security updates and wiped out old kernels and headers, use the zerofree utility that was designed specifically for this purpose:

sudo telinit 1          # Bring the system down into single-user mode
   .  .  .
stop rsyslog            # rsyslog keeps the root filesystem busy
mount -o remount,ro /   # Remount the root filesystem read-only
zerofree /dev/sda1      # or whatever is mounted on /
reboot

If you don't want to mess with zerofree, you may simply fill the free space with a file of zeroes:

dd if=/dev/zero of=bigemptyfile || rm bigemptyfile  # dd shall run out of disk space, thus the ||

Note, however, that this will grow a dynamically allocated disk image to its maximum size.
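
To see why zeroed free space matters for compression, here is a small, self-contained experiment that can be run in any scratch directory. It uses gzip merely as a widely available stand-in for 7-Zip, and the 1 MiB sample size is arbitrary; random data plays the role of leftover file contents in "free" blocks.

```shell
# Print the gzip-compressed size, in bytes, of whatever arrives on stdin.
compressed_size() {
    gzip -c9 | wc -c | tr -d ' '
}

# 1 MiB of zeroes, like free space after zerofree or the dd trick.
zero_size=$(dd if=/dev/zero bs=1024 count=1024 2>/dev/null | compressed_size)
# 1 MiB of random bytes, like free space still holding old data.
random_size=$(dd if=/dev/urandom bs=1024 count=1024 2>/dev/null | compressed_size)

echo "zeroes: $zero_size bytes compressed, random: $random_size bytes compressed"
```

The zero-filled megabyte should shrink to a tiny fraction of its size, while the random one should barely compress at all.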

Avoiding MAC Address Conflict

Unless you will never ever have two clones, or a clone and the original, running side-by-side on the same host, you need to change the MAC addresses of the virtual network adapters in each clone to avoid conflicts. You can do that by opening the VM settings dialog, selecting Network, expanding the Advanced section of each network adapter tab, and clicking the button next to the MAC Address field. Or from the command line as follows:

VBoxManage modifyvm vm-name | vm-uuid --macaddress1 auto --macaddress2 auto ...
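
If a VM has several adapters, the option list can be generated instead of typed out. The helper below is a sketch for a POSIX shell on the host; the function name is hypothetical, and the adapter count is something you supply, since the script does not query the VM for it.

```shell
# VBOXMANAGE can be replaced with a stub (e.g. 'echo VBoxManage') for a dry run.
# Relies on sh/bash word splitting.
VBOXMANAGE=${VBOXMANAGE:-VBoxManage}

# regen_macs VM ADAPTER-COUNT
# Builds "--macaddress1 auto --macaddress2 auto ..." and passes it to modifyvm.
regen_macs() {
    vm=$1
    count=$2
    i=1
    set --                                   # reuse "$@" as the option list
    while [ "$i" -le "$count" ]; do
        set -- "$@" "--macaddress$i" auto
        i=$((i + 1))
    done
    $VBOXMANAGE modifyvm "$vm" "$@"
}
```

With a stubbed VBOXMANAGE, regen_macs web1 2 prints the command line it would run, which makes it easy to check before touching a real VM.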

Now, the problem is that most Linux flavors maintain a persistent mapping of ethX device names to MAC addresses. Responsible for persistence is the udev daemon; unfortunately, the location of its configuration files is inconsistent across Linux distros. Specifically, Ubuntu and possibly other Debian derivatives store ethX device mappings in /etc/udev/rules.d/70-persistent-net.rules:

# PCI device 0x8086:0x100e (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:2c:8c:b4",
ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

For instance, if you had two virtual network adapters in the original VM, named eth0 and eth1, and changed their MAC addresses in the clone, udev will add them as new adapters eth2 and eth3 during boot and networking won't work. So the trick is to remove these mappings before shutting down the original VM, and change the MAC address(es) in each clone before booting it for the first time.
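
Inside the original guest, removing the mappings can be as simple as emptying the rules file just before shutdown. The function below is a sketch, assuming the Ubuntu 10.04 file location given above and a root shell; the function name and the backup copy are illustrative additions. On Ubuntu 10.04, udev should then write fresh mappings for whatever adapters it finds at the next boot.

```shell
# clear_net_rules [RULES-FILE]
# Empties the persistent network rules file, keeping a backup next to it.
# Run as root in the guest, right before shutting it down for export.
clear_net_rules() {
    rules=${1:-/etc/udev/rules.d/70-persistent-net.rules}

    if [ -f "$rules" ]; then
        # udev only reads *.rules files, so the .bak copy is ignored.
        cp "$rules" "$rules.bak" || return 1
        : > "$rules"
    fi
}
```

Called without arguments, it operates on the default Ubuntu location; passing a path lets you try it on a copy first.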

One final note: I have yet to see this in Ubuntu guests on VirtualBox, but rumor has it that Linux may enumerate network adapters in a different order after changes. If you had two or more adapters in the original VM and udev messed them up in the clone, simply edit their NAME attributes as appropriate and restart udev (or reboot the system):

sudo nano /etc/udev/rules.d/70-persistent-net.rules   # edit adapter names
sudo restart udev

Changing The Host Name

You will likely want each clone to have its own host name, especially if you are using DHCP. One problem is that the host name is stored not only at the "master" location (/etc/hostname on Ubuntu/Debian) but in a few other places, which must be kept in sync to ensure proper operation. Even in the baseline Ubuntu installation, there is a trap: if you change the hostname in /etc/hostname so that it no longer resolves to an IP address (normally via /etc/hosts), sudo will complain that it cannot resolve it:

user@baseline:~$ sudo sh -c "echo clone >/etc/hostname"
user@baseline:~$ sudo start hostname
sudo: unable to resolve host clone

As of Ubuntu 10.04, sudo still works in this situation, so you may then edit the /etc/hosts file and the problem will be gone, but on other systems you may have to boot in recovery mode.

To avoid this problem, you may wish to make all edits from a root shell:

sudo -s
echo newname >/etc/hostname
nano /etc/hosts   # change oldname to newname
start hostname

Now, if you log out or switch to another tty and press Enter at the login prompt, you should see something like:

Ubuntu 10.04.1 LTS newname tty2

newname login:
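
When making many clones, the two file edits can be wrapped in a function. This is a sketch rather than a complete solution: it assumes the Ubuntu/Debian file locations mentioned above, must run as root, and only replaces entries in the hosts file that exactly equal the old name (a fully qualified name such as oldname.example.com would need separate handling). The function name and its optional path arguments are hypothetical.

```shell
# rename_host OLDNAME NEWNAME [HOSTNAME-FILE] [HOSTS-FILE]
rename_host() {
    old=$1
    new=$2
    hostname_file=${3:-/etc/hostname}
    hosts_file=${4:-/etc/hosts}

    # Update the hosts file first, so the new name resolves before it is set.
    # Only whole fields equal to the old name are replaced.
    awk -v old="$old" -v new="$new" '
        { for (i = 1; i <= NF; i++) if ($i == old) $i = new; print }
    ' "$hosts_file" > "$hosts_file.new" || return 1
    # Write back in place so the file keeps its owner and permissions.
    cat "$hosts_file.new" > "$hosts_file" || return 1
    rm "$hosts_file.new"

    echo "$new" > "$hostname_file"
}
```

After rename_host oldname newname in a root shell, start hostname (or a reboot) applies the change as described above.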

The bad news is that some packages read the hostname only upon installation and store it in their configuration files, so you may need to propagate the change manually to ensure proper operation.

References:

Linux Hostname Configuration by Jason Blevins


Talkback

  1. Sasha Egorov
    13-Nov-2014
    4:08 am
    1

    Thanks! Nice write up. Nowadays we have Vagrant for rapid VM cloning/managing. It was interesting to read.
