New user tutorial with QEMU

In this tutorial, we will launch an Ubuntu cloud image in a virtual machine that uses cloud-init to pre-configure the system during boot.

The goal of this tutorial is to provide a minimal demonstration of cloud-init, which you can then use as a development environment to test your cloud-init configuration locally before launching it to the cloud.

Why QEMU?

QEMU is a cross-platform emulator capable of running performant virtual machines. QEMU is used at the core of a range of production operating system deployments and open source software projects (including libvirt, LXD, and Vagrant). It is capable of running Windows, Linux, and Unix guest operating systems. While QEMU is flexible and feature-rich, we are using it because it is widely supported and able to run on *nix-derived operating systems.

If you do not already have QEMU installed, you can install it by running the following command in Ubuntu:

$ sudo apt install qemu-system-x86

If you are not using Ubuntu, you can visit QEMU’s install instructions to see details for your system.
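Before continuing, it can help to confirm that the QEMU binary used later in this tutorial is actually on your PATH. The following is a minimal sketch; the helper name check_tool is our own and is not part of QEMU.

```shell
# Report whether a given command is available on PATH.
# check_tool is a hypothetical helper name used only for this example.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

check_tool qemu-system-x86_64
```

If the command is reported as not found, revisit the install step above before moving on.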

Download a cloud image

First, we’ll set up a temporary directory that will store both our cloud image and the configuration files we’ll create in the next section. Let’s also make it our current working directory:

$ mkdir temp
$ cd temp

We will run all the commands from this temporary directory, because both the webserver and the QEMU command we use later look for files relative to it. If we run the commands from anywhere else, our virtual machine will not be configured.

Cloud images typically come with cloud-init pre-installed and configured to run on first boot. We don’t need to worry about installing cloud-init for now, since we are not manually creating our own image in this tutorial.

In our case, we want to select the latest Ubuntu LTS. Let’s download the server image using wget:

$ wget https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img
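Optionally, we can check that the download is intact. The sketch below assumes a SHA256SUMS file is published alongside the image (the URL in the usage comment is an assumption) and that GNU sha256sum is available. The helper name verify_checksums is hypothetical.

```shell
# Verify files in the current directory against a SHA256SUMS-style list.
# --ignore-missing skips list entries for files we did not download.
# verify_checksums is a hypothetical helper name used only for this example.
verify_checksums() {
    sha256sum --check --ignore-missing "$1"
}

# Usage (assumes the checksum list is published next to the image):
#   wget https://cloud-images.ubuntu.com/jammy/current/SHA256SUMS
#   verify_checksums SHA256SUMS
```

A line ending in OK for the image file name indicates that the checksum matched.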

Note

On non-x86 hosts, this example emulates x86 CPU instructions, so it may be slow. To make it faster, you can change the image type and the qemu-system-<arch> command name to match the architecture of your host machine.
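As a rough illustration of matching the image to the host, the sketch below maps the output of uname -m to the architecture suffix used in the image file name. The helper name image_arch is hypothetical and only two common cases are covered. Note also that a non-x86 guest may need additional machine and firmware options in the QEMU command, which this tutorial does not cover.

```shell
# Map a `uname -m` machine name to the architecture suffix used in
# Ubuntu cloud image file names. Only two common cases are handled.
# image_arch is a hypothetical helper name used only for this example.
image_arch() {
    case "$1" in
        x86_64)        echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *)             echo "unsupported: $1" >&2; return 1 ;;
    esac
}

# Print the image file name that would match this host.
if arch=$(image_arch "$(uname -m)"); then
    echo "jammy-server-cloudimg-${arch}.img"
fi
```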

Define the configuration data files

When we launch an instance using cloud-init, we pass different types of configuration data files to it. Cloud-init uses these as a blueprint for how to configure the virtual machine instance. There are three major types:

  • User data is provided by the user, and cloud-init recognises many different formats.

  • Vendor data is provided by the cloud provider.

  • Metadata contains information about the instance itself, including things like machine ID, hostname, etc.

There is a specific user data format called “cloud-config” that is probably the most commonly used, so we will create an example of this (and examples of both vendor data and metadata files), then pass them all to cloud-init.

Let’s create our user-data file first. The user data cloud-config is a YAML-formatted file, and in this example it sets the password of the default user, and sets that password to never expire. For more details you can refer to the Set Passwords module page.

Run the following command to create the user data file (named user-data) containing our configuration data.

$ cat << EOF > user-data
#cloud-config
password: password
chpasswd:
  expire: False

EOF

Before moving forward, let’s first inspect our user-data file.

$ cat user-data

You should see the following contents:

#cloud-config
password: password
chpasswd:
  expire: False

  • The first line starts with #cloud-config, which tells cloud-init what type of user data is in the config file.

  • The second line, password: password, sets the default user’s password to password, as per the Users and Groups module documentation.

  • The third and fourth lines tell cloud-init not to require a password reset on first login.
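Because the #cloud-config header on the first line is what identifies the file type, a quick check of that line can catch a common mistake early. This is a minimal sketch; the helper name is_cloud_config is our own.

```shell
# Succeed only if the first line of the file is exactly "#cloud-config".
# is_cloud_config is a hypothetical helper name used only for this example.
is_cloud_config() {
    [ "$(head -n 1 "$1" 2>/dev/null)" = "#cloud-config" ]
}

if is_cloud_config user-data; then
    echo "user-data starts with #cloud-config"
else
    echo "user-data is missing or does not start with #cloud-config"
fi
```

This only checks the header. If cloud-init happens to be installed on your host, its schema subcommand (cloud-init schema --config-file user-data) can validate the contents more thoroughly.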

Now let’s run the following command, which creates a file named meta-data containing the instance ID we want to associate with the virtual machine instance.

$ cat << EOF > meta-data
instance-id: someid/somehostname

EOF

Next, let’s create an empty file called vendor-data in our temporary directory. Serving an empty file, rather than having no file at all, will speed up the retry wait time during boot.

$ touch vendor-data
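At this point the temporary directory should contain user-data, meta-data and vendor-data. The sketch below reports any that are missing; the helper name missing_seed_files is hypothetical.

```shell
# List which of the three configuration files are missing from a directory.
# Prints nothing and returns 0 when all of them are present.
# missing_seed_files is a hypothetical helper name used only for this example.
missing_seed_files() {
    dir=${1:-.}
    rc=0
    for name in user-data meta-data vendor-data; do
        if [ ! -f "$dir/$name" ]; then
            echo "missing: $name"
            rc=1
        fi
    done
    return "$rc"
}

if missing_seed_files .; then
    echo "all configuration files present"
fi
```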

Start an ad hoc IMDS webserver

Instance Metadata Service (IMDS) is a service used by most cloud providers as a way to expose information to virtual machine instances. This service is the primary mechanism for some clouds to expose cloud-init configuration data to the instance.

The IMDS uses a private HTTP webserver to provide metadata to each running instance. During early boot, cloud-init sets up network access and queries this webserver to gather configuration data. This allows cloud-init to configure the operating system while it boots.

In this tutorial we are emulating this workflow using QEMU and a simple Python webserver. This workflow is suitable for developing and testing cloud-init configurations before deploying to a cloud.

Open up a second terminal window, and in that window, run the following commands to change to the temporary directory and then start the built-in Python webserver:

$ cd temp
$ python3 -m http.server --directory .
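To confirm that the server is handing out our files before we boot the virtual machine, we can request them from the host. The sketch below only builds the three URLs; the helper name seed_urls is hypothetical, and the curl loop in the usage comment assumes curl is installed and that the server is listening on port 8000 on the host.

```shell
# Print the URLs of the three configuration files under a base URL.
# A trailing slash on the base URL is optional.
# seed_urls is a hypothetical helper name used only for this example.
seed_urls() {
    base=${1%/}
    for name in user-data meta-data vendor-data; do
        echo "$base/$name"
    done
}

# Usage, from another terminal on the host while the server is running:
#   for url in $(seed_urls http://localhost:8000/); do
#       curl --fail --silent --show-error --output /dev/null "$url" && echo "ok: $url"
#   done
```

Each successful request should also appear as a log line in the terminal where the Python webserver is running.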

Launch a VM with our user data

Switch back to your original terminal, and run the following command to launch our virtual machine. By default, QEMU will print the kernel logs and systemd logs to the terminal while the operating system boots. This may take a few moments to complete.

$ qemu-system-x86_64                                            \
    -net nic                                                    \
    -net user                                                   \
    -machine accel=kvm:tcg                                      \
    -m 512                                                      \
    -nographic                                                  \
    -hda jammy-server-cloudimg-amd64.img                        \
    -smbios type=1,serial=ds='nocloud;s=http://10.0.2.2:8000/'

Note

If the output stopped scrolling but you don’t see a prompt yet, press Enter to get to the login prompt.

When launching QEMU, our machine configuration is specified on the command line. Many things may be configured: memory size, graphical output, networking information, hard drives and more.

Let us examine the final two lines of our previous command. The first of them, -hda jammy-server-cloudimg-amd64.img, tells QEMU to use the cloud image as a virtual hard drive. This will cause the virtual machine to boot Ubuntu, which already has cloud-init installed.

The second line tells cloud-init where it can find user data, using the NoCloud datasource. During boot, cloud-init checks the SMBIOS serial number for ds=nocloud. If found, cloud-init will use the specified URL to source its user data config files.

In this case, we use the default gateway of the virtual machine (10.0.2.2) and the default port number of the Python webserver (8000), so that cloud-init will, inside the virtual machine, query the server running on the host.
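To make the structure of the serial string easier to see, the sketch below assembles it from a URL and prints the resulting QEMU option. The helper name nocloud_serial is hypothetical. When the value is typed directly on a command line, as in the QEMU command above, the quotes are needed so that the shell does not treat the semicolon as a command separator.

```shell
# Build the SMBIOS serial value that points cloud-init at a NoCloud URL.
# nocloud_serial is a hypothetical helper name used only for this example.
nocloud_serial() {
    printf 'ds=nocloud;s=%s\n' "$1"
}

# Print the option as QEMU would receive it for this tutorial's URL.
echo "-smbios type=1,serial=$(nocloud_serial http://10.0.2.2:8000/)"
```

If you serve the files from a different port, only the URL passed to the helper needs to change.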

Verify that cloud-init ran successfully

After launching the virtual machine, we should be able to connect to our instance using the default distro username.

In this case the default username is ubuntu and the password we configured is password.

If you can log in using the configured password, it worked!

If you couldn’t log in, see this page for debug information.

Let’s now check cloud-init’s status. Run the following command, which will allow us to check if cloud-init has finished running:

$ cloud-init status --wait

If you see status: done in the output, it succeeded!

If you see a failed status, you’ll want to check /var/log/cloud-init.log for warning/error messages.
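When scanning the log, a rough filter is often enough to find the relevant lines. The sketch below is meant to be run inside the virtual machine; the helper name scan_log is hypothetical, and the pattern is deliberately broad, so it may also match harmless lines that merely mention the word "error".

```shell
# Print log lines that mention a warning or an error (case-insensitive).
# scan_log is a hypothetical helper name used only for this example.
scan_log() {
    grep -i -E 'warn|error' "$1" || echo "no warnings or errors found in $1"
}

# Usage inside the virtual machine (sudo may be needed to read the log):
#   cloud-init status --long
#   scan_log /var/log/cloud-init.log
```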

Completion and next steps

In our main terminal, let’s exit the QEMU shell using Ctrl-A X (that’s Ctrl and A simultaneously, followed by X).

In the second terminal, where the Python webserver is running, we can stop the server using Ctrl-C.

In this tutorial, we configured the default user’s password and ran cloud-init inside our QEMU virtual machine.

The full list of modules available can be found in our modules documentation. The documentation for each module contains examples of how to use it.

You can also head over to the examples page for examples of more common use cases.