Datasources#
Datasources are sources of configuration data for cloud-init that typically
come from the user (i.e., user data) or from the cloud that created the
configuration drive (i.e., metadata). Typical user data includes files,
YAML, and shell scripts, whereas typical metadata includes the server name,
instance ID, display name, and other cloud-specific details.
Since there are multiple ways to provide this data (each cloud solution seems
to prefer its own way), cloud-init internally defines an abstract datasource
class. It offers a single way to access the data regardless of how each cloud
system provides it, and each cloud is supported through a subclass.
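The pattern described above can be sketched in a few lines of Python. This is a minimal illustration and not cloud-init's actual class hierarchy: the names `BaseDataSource`, `ConfigDriveSource`, `HttpMetadataSource`, `fetch`, and `first_available` are hypothetical, and the metadata values are made up.

```python
from abc import ABC, abstractmethod


class BaseDataSource(ABC):
    """Hypothetical base class; not cloud-init's real implementation."""

    def __init__(self):
        self.metadata = {}
        self.userdata_raw = None

    @abstractmethod
    def fetch(self) -> bool:
        """Populate metadata and user data; return True on success."""

    def get_instance_id(self) -> str:
        # Callers use the same accessor no matter which cloud supplied the data.
        return str(self.metadata.get("instance-id", "unknown"))


class ConfigDriveSource(BaseDataSource):
    """Illustrative subclass standing in for a configuration-drive reader."""

    def fetch(self) -> bool:
        # A real implementation would mount and parse the drive here.
        self.metadata = {"instance-id": "i-0001", "local-hostname": "demo"}
        self.userdata_raw = "#!/bin/sh\necho hello\n"
        return True


class HttpMetadataSource(BaseDataSource):
    """Illustrative subclass standing in for an HTTP metadata service client."""

    def fetch(self) -> bool:
        # A real implementation would query the metadata service here.
        self.metadata = {"instance-id": "i-0002", "local-hostname": "web"}
        self.userdata_raw = "#cloud-config\n{}\n"
        return True


def first_available(candidates):
    """Return the first datasource whose fetch() succeeds, else None."""
    for cls in candidates:
        source = cls()
        if source.fetch():
            return source
    return None
```

Code that consumes the result only depends on the base class interface, which is the point of the abstraction: `first_available([ConfigDriveSource, HttpMetadataSource]).get_instance_id()` works the same way whichever subclass answered.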
Any metadata processed by cloud-init’s datasources is persisted as
/run/cloud-init/instance-data.json. Cloud-init provides tooling to
quickly introspect some of that data. See Instance metadata for more
information.
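Because the persisted file is plain JSON, it can also be inspected with a short script. The sketch below assumes the file is readable by the current user; the helper names `load_instance_data` and `lookup` are hypothetical, and the dotted key used in the usage note is an assumption about the file's layout that should be checked against the Instance metadata documentation.

```python
import json
from pathlib import Path

# Path taken from the text above.
INSTANCE_DATA = Path("/run/cloud-init/instance-data.json")


def load_instance_data(path=INSTANCE_DATA):
    """Parse the persisted instance data into a Python dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def lookup(data, dotted_key, default=None):
    """Walk nested dictionaries using a dotted key such as 'a.b.c'."""
    current = data
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current
```

On a booted instance, something like `lookup(load_instance_data(), "v1.instance_id")` would return the instance ID if that key is present, and `None` otherwise.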
Known sources#
The following is a list of documents for each supported datasource:
Datasource creation#
The datasource objects have a few touch points with cloud-init. If you
are interested in adding a new datasource for your cloud platform, you will
need to take care of the following items:
Identify a mechanism for positive identification of the platform#
It is good practice for a cloud platform to positively identify itself to
the guest. This allows the guest to make educated decisions based on the
platform on which it is running. On the x86 and arm64 architectures, many
clouds identify themselves through DMI data. For example, Oracle’s public
cloud provides the string 'OracleCloud.com' in the DMI chassis-asset
field.
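A DMI-based check of this kind can be sketched as follows. This is an illustration only and not how ds-identify itself is implemented. It assumes a Linux guest that exposes DMI data under `/sys/class/dmi/id` and that the chassis asset field is available there as the file `chassis_asset_tag`; the function names are hypothetical.

```python
from pathlib import Path

# Assumed Linux sysfs location for DMI fields.
DMI_DIR = Path("/sys/class/dmi/id")


def read_dmi_field(name, dmi_dir=DMI_DIR):
    """Return the stripped contents of a DMI field, or None if unreadable."""
    try:
        return (Path(dmi_dir) / name).read_text(encoding="utf-8").strip()
    except OSError:
        # Missing file, unsupported architecture, or insufficient permissions.
        return None


def is_oracle_platform(dmi_dir=DMI_DIR):
    """Positive identification: compare the field to the documented string."""
    return read_dmi_field("chassis_asset_tag", dmi_dir) == "OracleCloud.com"
```

Passing `dmi_dir` explicitly makes the check testable against a temporary directory instead of the real sysfs tree.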
Cloud-init-enabled images produce a log file with details about the
platform. Reading through this log in /run/cloud-init/ds-identify.log
may provide the information needed to uniquely identify the platform.
If the log is not present, you can generate it by running
./tools/ds-identify from source or /usr/lib/cloud-init/ds-identify from
the installed location.
The mechanism used to identify the platform will be required for the
ds-identify and datasource module sections below.
Add datasource module cloudinit/sources/DataSource<CloudPlatform>.py#
It is suggested that you start by copying one of the simpler datasources
such as DataSourceHetzner.
Add tests for datasource module#
Add a new file with some tests for the module to
cloudinit/sources/tests/test_<yourplatform>.py. For example, see
cloudinit/sources/tests/test_oracle.py.
Update ds-identify#
On systemd systems, ds-identify is used to detect which datasource
should be enabled, or whether cloud-init should run at all. You’ll need to
make changes to tools/ds-identify.
Add tests for ds-identify#
Add relevant tests in a new class to
tests/unittests/test_ds_identify.py. You can use TestOracle as
an example.
Add your datasource name to the built-in list of datasources#
Add your datasource module name to the end of the datasource_list
entry in cloudinit/settings.py.
Add your cloud platform to apport collection prompts#
Update the list of cloud platforms in cloudinit/apport.py. This list
will be provided to the user who invokes ubuntu-bug cloud-init.
Enable datasource by default in Ubuntu packaging branches#
Ubuntu packaging branches contain a template file,
debian/cloud-init.templates, which ultimately sets the default
datasource_list when cloud-init is installed via package. This file needs
updating when the commit gets into a package.
Add documentation for your datasource#
You should add a new file in doc/datasources/<cloudplatform>.rst.