User data formats

User data is configuration data provided by a user of a cloud platform to an instance at launch. User data can be passed to cloud-init in any of the formats documented here. It is merged with the other configuration sources to produce the combined configuration that is applied to the instance.

Configuration types

User data formats can be categorized into those that directly configure the instance, and those that serve as a container, template, or means to obtain or modify another configuration.

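Formats that directly configure the instance:

  • Cloud config data

  • User data script

  • Cloud boothook

Formats that deal with other user data formats:

  • Include file

  • Jinja template

  • MIME multi-part archive

  • Cloud config archive

  • Part handler

  • Gzip compressed content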

Cloud config data

Example

#cloud-config
password: password
chpasswd:
  expire: False

Explanation

Cloud-config defines, in a human-friendly format, how an instance should be configured. The cloud config format is YAML whose keys describe the desired instance state.

The desired state may include:

  • performing package upgrades on first boot

  • configuration of different package mirrors or sources

  • initial user or group setup

  • importing certain SSH keys or host keys

  • and many more…

Many modules are available to process cloud-config data. These modules may run once per instance, every boot, or once ever. See the associated module to determine the run frequency.

For more information, see the cloud config example configurations or the cloud config modules reference.
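Since YAML is, for practical purposes, a superset of JSON, cloud-config user data can be generated with a plain JSON serializer. The following is a minimal sketch under that assumption. The helper name `render_cloud_config` is hypothetical and not part of cloud-init.

```python
import json


def render_cloud_config(config: dict) -> str:
    """Return user data text: the #cloud-config header plus the config body.

    Assumption of this sketch: the JSON emitted here is also valid YAML,
    so it can serve as the body of a cloud-config document.
    """
    if not isinstance(config, dict):
        raise TypeError("cloud-config must be a mapping of keys to values")
    body = json.dumps(config, indent=2, sort_keys=True)
    # The header must be the very first line so the format is recognized.
    return "#cloud-config\n" + body + "\n"


# Same settings as the example above, expressed as a Python dict.
user_data = render_cloud_config(
    {"password": "password", "chpasswd": {"expire": False}}
)
print(user_data)
```

Generating the document from a data structure avoids indentation mistakes that are easy to make when YAML is written by hand.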

User data script

Example

#!/bin/sh
echo "Hello World" > /var/tmp/output.txt

Explanation

A user data script is a single script to be executed once per instance. User data scripts are run relatively late in the boot process, during cloud-init’s final stage as part of the cc_scripts_user module.

Warning

Use of INSTANCE_ID variable within user data scripts is deprecated. Use jinja templates with v1.instance_id instead.

Cloud boothook

Simple Example

#cloud-boothook
#!/bin/sh
echo 192.168.1.130 us.archive.ubuntu.com > /etc/hosts

Example of once-per-instance script

#cloud-boothook
#!/bin/sh

# Early exit 0 when script has already run for this instance-id,
# continue if new instance boot.
cloud-init-per instance do-hosts /bin/false && exit 0
echo 192.168.1.130 us.archive.ubuntu.com >> /etc/hosts

Explanation

A cloud boothook is similar to a user data script in that it is a script run on boot. When run, the environment variable INSTANCE_ID is set to the current instance ID for use within the script.

The boothook is different in that:

  • It is run very early in boot, during the network stage, before any cloud-init modules are run.

  • It is run on every boot.

Warning

Use of INSTANCE_ID variable within boothooks is deprecated. Use jinja templates with v1.instance_id instead.

Include file

Example

#include
https://raw.githubusercontent.com/canonical/cloud-init/403f70b930e3ce0f05b9b6f0e1a38d383d058b53/doc/examples/cloud-config-run-cmds.txt
https://raw.githubusercontent.com/canonical/cloud-init/403f70b930e3ce0f05b9b6f0e1a38d383d058b53/doc/examples/cloud-config-boot-cmds.txt

Explanation

An include file contains a list of URLs, one per line. Each URL is read, and its content can be any kind of user data format, both base config (formats that directly configure the instance) and meta config (formats that deal with other user data formats). If an error occurs while reading a URL, the remaining URLs are not read.
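The stop-on-first-error behavior described above can be sketched as follows. This is an illustration only, not cloud-init's implementation. The names `read_includes` and `fake_fetch` are hypothetical, and `fake_fetch` stands in for a real HTTP request so that the example runs offline.

```python
from typing import Callable, List


def read_includes(include_text: str, fetch: Callable[[str], str]) -> List[str]:
    """Fetch each URL in an #include document, stopping at the first error."""
    lines = include_text.splitlines()
    if not lines or lines[0].strip() != "#include":
        raise ValueError("include file must start with the #include header")

    contents = []
    for line in lines[1:]:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        try:
            contents.append(fetch(url))
        except OSError:
            # After a read error, the remaining URLs are not read.
            break
    return contents


def fake_fetch(url: str) -> str:
    """Hypothetical stand-in for an HTTP GET; fails for one URL on purpose."""
    if "broken" in url:
        raise OSError("could not read " + url)
    return "#cloud-config\n# fetched from " + url + "\n"


doc = (
    "#include\n"
    "https://example.com/a.txt\n"
    "https://example.com/broken.txt\n"
    "https://example.com/c.txt\n"
)
results = read_includes(doc, fake_fetch)
print(len(results))  # only a.txt was read; c.txt is skipped after the failure
```

Because later URLs are skipped after a failure, it can be worth listing the most important URL first.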

Jinja template

Example cloud-config

## template: jinja
#cloud-config
runcmd:
  - echo 'Running on {{ v1.cloud_name }}' > /var/tmp/cloud_name

Example user data script

## template: jinja
#!/bin/sh
echo 'Current instance id: {{ v1.instance_id }}' > /var/tmp/instance_id

Explanation

Jinja templating may be used for cloud-config, user data scripts, and cloud-boothooks. Any instance-data variable may be used as a Jinja template variable. Any Jinja-templated configuration must keep its original header, with the Jinja header placed on the line directly above it.

Note

Use of Jinja templates is supported for cloud-config, user data scripts, and cloud-boothooks. Jinja templates are not supported for meta configs.
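The header requirement can be checked before user data is submitted. The sketch below is a hypothetical pre-flight check, not a cloud-init API. It assumes that only the three formats named in the note above may follow the Jinja header.

```python
JINJA_HEADER = "## template: jinja"
# Exact headers of the templatable formats; scripts are matched by "#!" prefix.
EXACT_HEADERS = ("#cloud-config", "#cloud-boothook")


def has_valid_jinja_headers(user_data: str) -> bool:
    """Return True if the Jinja header sits directly above a supported header."""
    lines = user_data.splitlines()
    if len(lines) < 2 or lines[0].strip() != JINJA_HEADER:
        return False
    original = lines[1].strip()
    return original.startswith("#!") or original in EXACT_HEADERS


good = "## template: jinja\n#cloud-config\nruncmd:\n  - echo '{{ v1.cloud_name }}'\n"
missing_original = "## template: jinja\nruncmd:\n  - echo '{{ v1.cloud_name }}'\n"
meta_config = "## template: jinja\n#include\nhttps://example.com/a.txt\n"

print(has_valid_jinja_headers(good))
print(has_valid_jinja_headers(missing_original))
print(has_valid_jinja_headers(meta_config))
```

The exact-match comparison is deliberate: a prefix test on `#cloud-config` would also accept `#cloud-config-archive`, which is a meta config.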

MIME multi-part archive

Example

Content-Type: multipart/mixed; boundary="===============2389165605550749110=="
MIME-Version: 1.0
Number-Attachments: 2

--===============2389165605550749110==
Content-Type: text/cloud-boothook; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="part-001"

#!/bin/sh
echo "this is from a boothook." > /var/tmp/boothook.txt

--===============2389165605550749110==
Content-Type: text/cloud-config; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="part-002"

bootcmd:
- echo "this is from a cloud-config." > /var/tmp/bootcmd.txt
--===============2389165605550749110==--

Explanation

Using a MIME multi-part file, the user can specify more than one type of data.

For example, both a user data script and a cloud-config type could be specified.

Each part must specify a valid content type. Supported content types can be listed with the cloud-init make-mime subcommand:

$ cloud-init devel make-mime --list-types

Helper subcommand to generate MIME messages

The cloud-init make-mime subcommand can also generate MIME multi-part files.

The make-mime subcommand takes pairs of (filename, “text/” mime subtype) separated by a colon (e.g., config.yaml:cloud-config) and emits a MIME multipart message to stdout.

MIME subcommand Examples

Create user data containing both a cloud-config (config.yaml) and a shell script (script.sh)

$ cloud-init devel make-mime -a config.yaml:cloud-config -a script.sh:x-shellscript > userdata

Create user data containing 3 shell scripts:

  • always.sh - run every boot

  • instance.sh - run once per instance

  • once.sh - run once

$ cloud-init devel make-mime -a always.sh:x-shellscript-per-boot -a instance.sh:x-shellscript-per-instance -a once.sh:x-shellscript-per-once
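Where the make-mime helper is not available, a comparable message can be assembled with Python's standard `email` package. This is a minimal sketch rather than a reimplementation of make-mime; the function name `build_mime` and the file names are illustrative assumptions.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mime(parts):
    """Combine (content, text subtype, filename) tuples into one message."""
    message = MIMEMultipart()  # the default subtype is "mixed"
    for content, subtype, filename in parts:
        part = MIMEText(content, subtype, "us-ascii")
        part.add_header("Content-Disposition", "attachment", filename=filename)
        message.attach(part)
    return message.as_string()


user_data = build_mime(
    [
        (
            "#cloud-config\npassword: password\nchpasswd:\n  expire: False\n",
            "cloud-config",
            "config.yaml",
        ),
        (
            '#!/bin/sh\necho "Hello World" > /var/tmp/output.txt\n',
            "x-shellscript",
            "script.sh",
        ),
    ]
)

# Parse the result back to confirm the content type of each part.
parsed = message_from_string(user_data)
part_types = [part.get_content_type() for part in parsed.get_payload()]
print(parsed.get_content_type())
print(part_types)
```

Writing `user_data` to a file gives input comparable to the first make-mime example above.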

Cloud config archive

Example

#cloud-config-archive
- type: "text/cloud-boothook"
  content: |
    #!/bin/sh
    echo "this is from a boothook." > /var/tmp/boothook.txt
- type: "text/cloud-config"
  content: |
    bootcmd:
    - echo "this is from a cloud-config." > /var/tmp/bootcmd.txt

Explanation

A cloud-config-archive is a way to specify more than one type of data using YAML. Because a MIME multi-part archive is unwieldy to build by hand and otherwise requires a cloud-init helper utility, the cloud-config-archive provides a simpler alternative for those who prefer YAML.

The format is a list of dictionaries.

Required fields:

  • type: The Content-Type identifier for the type of user data in content

  • content: The user data configuration

Optional fields:

  • launch-index: The EC2 Launch-Index (if applicable)

  • filename: This field is only used if using a user data format that requires a filename in a MIME part. This is unrelated to any local system file.

All other fields will be interpreted as a MIME part header.
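The field rules above can be illustrated with a small validation sketch. It is not cloud-init code: `split_entry` is a hypothetical helper, and `X-Example-Header` is an invented header name used only to show how unrecognized fields are treated.

```python
REQUIRED_FIELDS = ("type", "content")
OPTIONAL_FIELDS = ("launch-index", "filename")


def split_entry(entry: dict):
    """Split one archive entry into (known fields, extra MIME part headers)."""
    missing = [name for name in REQUIRED_FIELDS if name not in entry]
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    recognized = REQUIRED_FIELDS + OPTIONAL_FIELDS
    known = {key: value for key, value in entry.items() if key in recognized}
    # Every other field is treated as a MIME part header.
    headers = {key: value for key, value in entry.items() if key not in recognized}
    return known, headers


entry = {
    "type": "text/cloud-boothook",
    "content": '#!/bin/sh\necho "this is from a boothook." > /var/tmp/boothook.txt\n',
    "filename": "part-001",
    "X-Example-Header": "demo",  # hypothetical extra field
}
known, headers = split_entry(entry)
print(sorted(known))
print(headers)
```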

Part handler

Example

#part-handler

"""This is a trivial example part-handler that creates a file with the path
specified in the payload. It performs no input checking or error handling.

To use it, first save the file you are currently viewing into your current
working directory. Then run the following:
```
$ echo '/var/tmp/my_path' > part
$ cloud-init devel make-mime -a part-handler.py:part-handler -a part:x-my-path --force > user-data
```

This will create a mime file with the contents of 'part' and the
part-handler. You can now pass 'user-data' to your cloud of choice.

When run, cloud-init will have created an empty file at /var/tmp/my_path.
"""

import pathlib
from typing import Any

from cloudinit.cloud import Cloud


def list_types():
    """Return a list of mime-types that are handled by this module."""
    return ["text/x-my-path"]


def handle_part(data: Cloud, ctype: str, filename: str, payload: Any):
    """Handle a part with the given mime-type.

    This function will get called multiple times. The first time is
    to allow any initial setup needed to handle parts. It will then get
    called once for each part matching the mime-type returned by `list_types`.
    Finally, it will get called one last time to allow for any final
    teardown.

    :data: A `Cloud` instance. This will be the same instance for each call
        to handle_part.
    :ctype: '__begin__', '__end__', or the mime-type
        (for this example 'text/x-my-path') of the part
    :filename: The filename for the part as defined in the MIME archive,
        or dynamically generated part if no filename is given
    :payload: The content of the part. This will be
        `None` when `ctype` is '__begin__' or '__end__'.
    """
    if ctype == "__begin__":
        # Any custom setup needed before handling payloads
        return

    if ctype == "__end__":
        # Any custom teardown needed after handling payloads can happen here
        return

    # If we've made it here, we're dealing with a real payload, so handle
    # it appropriately
    pathlib.Path(payload.strip()).touch()

Explanation

A part handler contains custom code for either supporting new mime-types in multi-part user data or for overriding the existing handlers for supported mime-types.

See the custom part handler reference documentation for details on writing custom handlers along with an annotated example.

This blog post offers another example for more advanced usage.
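The calling sequence described in the example's docstring (one begin call, one call per matching part, one end call) can be simulated without cloud-init installed. The sketch below is a simplified illustration, not cloud-init's dispatch code. `TouchHandler` and `drive` are hypothetical names, and `None` is passed where cloud-init would supply a `Cloud` instance.

```python
import pathlib
import tempfile


class TouchHandler:
    """Stand-in exposing the same two functions as the example part handler."""

    def __init__(self):
        self.calls = []  # records the ctype of every call, in order

    def list_types(self):
        return ["text/x-my-path"]

    def handle_part(self, data, ctype, filename, payload):
        self.calls.append(ctype)
        if ctype in ("__begin__", "__end__"):
            return  # setup and teardown calls carry no payload
        pathlib.Path(payload.strip()).touch()


def drive(handler, parts):
    """Hypothetical driver: begin, each matching part, then end."""
    handler.handle_part(None, "__begin__", None, None)
    for ctype, filename, payload in parts:
        if ctype in handler.list_types():
            handler.handle_part(None, ctype, filename, payload)
    handler.handle_part(None, "__end__", None, None)


with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "my_path"
    handler = TouchHandler()
    drive(
        handler,
        [
            ("text/x-my-path", "part-001", str(target) + "\n"),
            ("text/cloud-config", "part-002", "#cloud-config\n"),
        ],
    )
    created = target.exists()

print(handler.calls)
print(created)
```

The cloud-config part is ignored by this handler because its content type is not in the list returned by `list_types`.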

Gzip compressed content

Content that is found to be gzip compressed is decompressed, and the decompressed data is then processed as if it had never been compressed. This is useful because cloud platforms may limit the size of user data.
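Compressing user data before upload takes only the Python standard library. The sketch below also shows one plausible way to recognize compressed content, by its two leading magic bytes. That detection method is an assumption made for illustration; the text above does not say how cloud-init detects gzip data.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def maybe_decompress(raw: bytes) -> bytes:
    """Return decompressed bytes for gzip input, or the input unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


plain = b"#cloud-config\npassword: password\nchpasswd:\n  expire: False\n"
compressed = gzip.compress(plain)

# Both forms yield the same user data after processing.
print(maybe_decompress(compressed) == plain)
print(maybe_decompress(plain) == plain)
```

Very small payloads can grow when compressed because of the gzip header, so compression pays off mainly for larger user data.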

Headers and content types

In order for cloud-init to recognize which user data format is being used, the user data must contain a header. Additionally, if the user data is being passed as a multi-part message, such as MIME, cloud-config-archive, or part-handler, the content-type for each part must also be set appropriately.

The table below lists the headers and content types for each user data format. Note that gzip compressed content is not represented here as it gets passed as binary data and so may be processed automatically.

User data format     | Header                        | Content-Type
---------------------|-------------------------------|--------------------------
Cloud config data    | #cloud-config                 | text/cloud-config
User data script     | #!                            | text/x-shellscript
Cloud boothook       | #cloud-boothook               | text/cloud-boothook
MIME multi-part      | Content-Type: multipart/mixed | multipart/mixed
Cloud config archive | #cloud-config-archive         | text/cloud-config-archive
Jinja template       | ## template: jinja            | text/jinja
Include file         | #include                      | text/x-include-url
Part handler         | #part-handler                 | text/part-handler
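The table above maps directly to a lookup that guesses the content type from the first line of user data. This is a simplified sketch, not cloud-init's detection logic. The function name `detect_content_type` is hypothetical, and gzip content is left out, as it is in the table.

```python
from typing import Optional

# Header prefix -> content type, copied from the table above.
HEADER_TO_CONTENT_TYPE = {
    "#cloud-config": "text/cloud-config",
    "#!": "text/x-shellscript",
    "#cloud-boothook": "text/cloud-boothook",
    "Content-Type: multipart/mixed": "multipart/mixed",
    "#cloud-config-archive": "text/cloud-config-archive",
    "## template: jinja": "text/jinja",
    "#include": "text/x-include-url",
    "#part-handler": "text/part-handler",
}


def detect_content_type(user_data: str) -> Optional[str]:
    """Return the content type matching the first line, or None if unknown."""
    first_line = user_data.split("\n", 1)[0].strip()
    # Try longer headers first so that "#cloud-config-archive" is not
    # mistaken for "#cloud-config".
    for header in sorted(HEADER_TO_CONTENT_TYPE, key=len, reverse=True):
        if first_line.startswith(header):
            return HEADER_TO_CONTENT_TYPE[header]
    return None


print(detect_content_type("#!/bin/sh\necho hi\n"))
print(detect_content_type("#cloud-config-archive\n- type: text/cloud-config\n"))
print(detect_content_type("no header here\n"))
```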

Continued reading

See the configuration sources documentation for information about other sources of configuration for cloud-init.