Packer node builder
Building a custom VM image for deploying K8S nodes on bare metal
Technical

Introduction

In this article I'll walk through my k8s-node-packer folder, which contains the environment I use to build fresh VM images for deploying Kubernetes node VMs onto various bare-metal hosts running libvirtd.

Getting started

You will need a recent version of Packer. To summarise the installation for Ubuntu…

curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt-get update && sudo apt-get install packer
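Once installed, a quick sanity check confirms Packer is on your PATH (the version output will vary by release):

```shell
# Confirm the packer binary is reachable before going any further.
if command -v packer >/dev/null 2>&1; then
    packer version
else
    echo "packer not found on PATH"
fi
```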

Ensure some tools that will be needed are installed:

sudo apt install make genisoimage

Add yourself to the kvm group.

sudo usermod -a -G kvm $USER

Log out and back in to make use of this new group membership.
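A quick way to confirm the new group membership has taken effect (an illustrative check, not part of the build itself):

```shell
# Packer's QEMU builder needs read/write access to /dev/kvm for
# hardware acceleration; this should report success once the kvm
# group membership is active in your session.
if [ -w /dev/kvm ]; then
    echo "KVM acceleration available"
else
    echo "No write access to /dev/kvm - log out and back in, or check 'groups'"
fi
```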

Configuration

Here’s the beginnings of the Makefile:

PACKER_LOG=1

all: build

deps:
	packer plugins install github.com/hashicorp/qemu

cidata.iso: cloud-init/user-data cloud-init/meta-data
	@echo "Creating CI ISO..."
	@genisoimage -output cidata.iso -input-charset utf-8 -volid cidata -joliet -r cloud-init/user-data cloud-init/meta-data

build: cidata.iso
	@echo "Building image..."
	PACKER_LOG=$(PACKER_LOG) BUILD_ID=$(shell date +'%Y%m%d%H%M%S') packer build ubuntu.pkr.hcl

You can see this refers to a few files:

  • cloud-init/user-data - Commands and settings for cloud-init to use when pre-provisioning the VM image.
  • cloud-init/meta-data - Further cloud-init configuration.
  • ubuntu.pkr.hcl - The Packer definition of how to create the VM image.

The first two files in the cloud-init folder are combined into a tiny cidata.iso ISO image, which is then used to bootstrap the Packer build. As Packer runs up the QEMU VM, cloud-init takes responsibility for pre-provisioning it with all the necessary packages and settings.
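If you want to confirm both files made it into the seed ISO, isoinfo (shipped alongside genisoimage) can list its contents; this step is entirely optional:

```shell
# List the contents of the seed ISO, if it has already been generated.
# -J reads the Joliet directory records the Makefile's genisoimage
# invocation creates; -l lists the files on the image.
if [ -f cidata.iso ] && command -v isoinfo >/dev/null 2>&1; then
    isoinfo -J -i cidata.iso -l
fi
```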

The ubuntu.pkr.hcl file describes what Packer should do when it is run. It contains the following:

packer {
  required_plugins {
    qemu = {
      source  = "github.com/hashicorp/qemu"
      version = "~> 1"
    }
  }
}

variable "ubuntu_version" {
  type        = string
  default     = "noble"
  description = "Ubuntu codename version (e.g. 20.04 is focal and 24.04 is noble)"
}

variable "build_id" {
  type        = string
  default     = env("BUILD_ID")
  description = "Build ID"
}

source "qemu" "ubuntu" {
  accelerator           = "kvm"
  cd_files              = ["./cloud-init/*"]
  cd_label              = "cidata"
  disk_compression      = true
  disk_image            = true
  disk_size             = "10G"
  headless              = true
  iso_checksum          = "file:https://cloud-images.ubuntu.com/${var.ubuntu_version}/current/SHA256SUMS"
  iso_url               = "https://cloud-images.ubuntu.com/${var.ubuntu_version}/current/${var.ubuntu_version}-server-cloudimg-amd64.img"
  output_directory      = "output-${var.ubuntu_version}-${var.build_id}"
  packer_build_name     = "ubuntu-${var.ubuntu_version}-${var.build_id}"
  packer_builder_type   = "qemu"
  packer_on_error       = "cleanup"
  packer_user_variables = {
    "BUILD_ID" = "${var.build_id}"
  }
  shutdown_command      = "echo 'packer' | sudo -S shutdown -P now"
  #ssh_password         = "chooseapassword" # private key preferred
  ssh_username          = "packer"
  ssh_private_key_file  = "./id_ed25519"
  ssh_timeout           = "5m"
  vm_name               = "ubuntu-${var.ubuntu_version}.img"

  qemuargs = [
    ["-m", "8192M"],
    ["-smp", "2"],
    ["-cdrom", "cidata.iso"],
    ["-serial", "mon:stdio"],
  ]
}

build {
  sources = ["source.qemu.ubuntu"]

  provisioner "shell" {
    // run scripts with sudo, as the default cloud image user is unprivileged
    execute_command = "echo 'packer' | sudo -S sh -c '{{ .Vars }} {{ .Path }}'"
    // NOTE: cleanup.sh should always be run last, as this performs post-install cleanup tasks
    scripts = [
      "scripts/install.sh",
      "scripts/cleanup.sh"
    ]
  }
}

You’ll see this refers to a couple of scripts in a scripts folder. I’ll come to those later.
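One thing to note: the source block references a private key at ./id_ed25519, and the matching public key must appear under ssh_authorized_keys in cloud-init/user-data. If you don't already have a dedicated keypair, generating one is straightforward:

```shell
# Generate a passphrase-less ed25519 keypair for the packer build user,
# unless one already exists alongside the Packer template.
if [ ! -f ./id_ed25519 ]; then
    ssh-keygen -t ed25519 -f ./id_ed25519 -N "" -C "packer"
fi
```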

When the VM is run up, cloud-init runs first, so let's look at the configuration files for that.

The meta-data file is simple.

package_update: true
package_upgrade: true
package_reboot_if_required: true

The user-data file contains the real substance of the configuration.

#cloud-config

disable_root: true

users:
  - name: packer
    gecos: Provisioning user
    ssh_authorized_keys:
      - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AABBIFHHzTQL5gLbfcWNvW+sqtPh+ob8hRxNuhoIP4grHND6 rossg@x1c
      - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPQOWclractZtu1gohYRpGXBYdC3GO3yfd2Iy1LkNCdE packer
    sudo: ["ALL=(ALL) NOPASSWD:ALL"]
    groups:
      - admin
      - sudo
    shell: /bin/bash

apt:
  sources:
    falco.list:
      source: "deb [arch=amd64] https://download.falco.org/packages/deb stable main"
    fluent-bit:
      source: "deb [arch=amd64] https://packages.fluentbit.io/ubuntu/noble noble main"
    docker:
      source: "deb [arch=amd64] https://download.docker.com/linux/ubuntu noble stable"
    kubernetes:
      source: "deb [arch=amd64] https://pkgs.k8s.io/core:/stable:/v1.31/deb/ /"
    hashicorp.list:
      source: "deb [arch=amd64] https://apt.releases.hashicorp.com noble main"
    ansible.list:
      source: "deb [arch=amd64] https://ppa.launchpadcontent.net/ansible/ansible/ubuntu/ noble main"

bootcmd:
  - curl -fsSL https://falco.org/repo/falcosecurity-packages.asc | gpg --dearmor -o - > /etc/apt/trusted.gpg.d/falco.gpg
  - curl -fsSL https://packages.fluentbit.io/fluentbit.key | gpg --dearmor -o - > /etc/apt/trusted.gpg.d/fluent-bit.gpg
  - curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o - > /etc/apt/trusted.gpg.d/docker.gpg
  - curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.31/deb/Release.key | gpg --dearmor -o - > /etc/apt/trusted.gpg.d/kubernetes.gpg
  - curl -fsSL https://apt.releases.hashicorp.com/gpg | gpg --dearmor -o - > /etc/apt/trusted.gpg.d/hashicorp.gpg
  - curl -fsSL "https://keyserver.ubuntu.com/pks/lookup?op=get&search=0x6125e2a8c77f2818fb7bd15b93c4a3fd7bb9c367" | gpg --dearmor -o - > /etc/apt/trusted.gpg.d/ansible.gpg
  - apt-get update

runcmd:
  - mkdir -p /etc/containerd && containerd config default > /etc/containerd/config.toml
  - sed -i 's;registry.k8s.io/pause:3.8;registry.k8s.io/pause:3.9;g' /etc/containerd/config.toml
  - sed -i 's;SystemdCgroup = false;SystemdCgroup = true;g' /etc/containerd/config.toml

packages:
  - qemu-guest-agent
  - cloud-initramfs-growroot
  - openssh-server
  - ca-certificates
  - curl
  - gnupg
  - lsb-release
  - vim
  - net-tools
  - sudo
  - tcpdump
  - rsync
  - open-iscsi
  - htop
  - git
  - iputils-ping
  - traceroute
  - bind9-dnsutils
  - iotop
  - ansible-galaxy
  - ansible-core
  - containerd
  - kubeadm
  - kubelet
  - vault
  - falco
  - lvm2
  - keepalived
  - haproxy

write_files:
  - path: /root/.bash_aliases
    permissions: '0644'
    owner: root:root
    content: |
      export CONTAINER_RUNTIME_ENDPOINT=unix:///run/containerd/containerd.sock      

  - path: /etc/modules-load.d/k8s.conf
    content: |
      overlay
      br_netfilter      

  - path: /etc/apt/apt.conf.d/50unattended-upgrades
    content: |
      Unattended-Upgrade::Allowed-Origins {
        "*:*";
      };      

  - path: /var/lib/kubelet/kubeadm-flags.env
    permissions: '0644'
    owner: root:root
    content: |
      KUBELET_KUBEADM_ARGS="--container-runtime-endpoint=unix:///run/containerd/containerd.sock"      

  - path: /etc/sysctl.d/k8s.conf
    content: |
      fs.inotify.max_user_instances=8192
      fs.inotify.max_user_watches=524288
      kernel.panic=10
      kernel.panic_on_oops=1
      net.bridge.bridge-nf-call-iptables=1
      net.bridge.bridge-nf-call-ip6tables=1
      net.ipv4.conf.all.log_martians=0
      net.ipv4.conf.default.log_martians=0
      net.ipv4.ip_forward=1
      vm.overcommit_memory=1      
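Before building, it can be worth linting the user-data file. Recent cloud-init releases ship a schema checker (older releases expose it as cloud-init devel schema, so adjust for your version):

```shell
# Validate cloud-init/user-data against the cloud-config schema,
# if cloud-init happens to be installed on the build machine.
if [ -f cloud-init/user-data ] && command -v cloud-init >/dev/null 2>&1; then
    cloud-init schema --config-file cloud-init/user-data
fi
```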

The install.sh script runs after cloud-init has finished. In our case it is used to pre-pull the latest K8S images and install the Cilium CLI binary, to improve bootstrap time on each deployed VM.

#!/bin/bash -eux

echo "==> Waiting for cloud-init to finish..."
while [ ! -f /var/lib/cloud/instance/boot-finished ]; do
    echo 'Waiting for Cloud-Init...'
    sleep 3
done

echo "==> Preloading K8S images..."
kubeadm config images pull

echo "==> Installing Cilium binary..."
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
curl -sSL --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum || { echo "Cilium checksum failure!" >&2; exit 1; }
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}

# Other prep here.

echo "==> Done."

The cleanup.sh script ensures the VM doesn’t contain stale left-overs from the install process, and clears down other unused or unwanted stuff to optimise the image size.

#!/bin/bash -eux

echo "==> remove unwanted packages"
dpkg --purge --force-depends \
      ansible \
      ansible-galaxy \
      bolt \
      byobu \
      command-not-found \
      ftp \
      info \
      landscape-common \
      lxd \
      lxd-agent-loader \
      pastebinit \
      modemmanager \
      mdadm \
      nano \
      ntfs-3g \
      plymouth \
      plymouth-theme-ubuntu-text \
      sosreport \
      snapd \
      tnftp \
      tmux \
      telnet \
      ubuntu-release-upgrader-core \
      ubuntu-pro-client-l10n \
      ubuntu-server \
      ubuntu-standard \
      update-notifier-common \
      update-manager-core \
      ufw \
      wpasupplicant
apt-get autoremove -y

echo "==> remove SSH keys used for building"
rm -f /home/ubuntu/.ssh/authorized_keys
rm -f /home/packer/.ssh/authorized_keys
rm -f /root/.ssh/authorized_keys

echo "==> Clear out machine id"
truncate -s 0 /etc/machine-id

echo "==> Remove the contents of /tmp and /var/tmp"
rm -rf /tmp/* /var/tmp/*

echo "==> Truncate any logs that have built up during the install"
find /var/log -type f -exec truncate --size=0 {} \;

echo "==> Cleanup bash history"
rm -f ~/.bash_history

echo "==> remove /usr/share/doc/"
rm -rf /usr/share/doc/*

echo "==> remove /var/cache"
find /var/cache -type f -exec rm -rf {} \;

echo "==> Cleanup apt"
apt-get -y autoremove
sudo apt-get clean
sudo rm -rf /var/lib/apt/lists/*

echo "==> force a new random seed to be generated"
rm -f /var/lib/systemd/random-seed

echo "==> Clear the wget HSTS history so our install isn't there"
rm -f /root/.wget-hsts

echo "==> Remove systemd-resolved which does not play well with K8S nodes"
systemctl disable systemd-resolved
systemctl stop systemd-resolved
rm -vf /etc/resolv.conf

export HISTSIZE=0

You should end up with the following directory structure:

├── Makefile
├── cloud-init
│   ├── meta-data
│   └── user-data
├── scripts
│   ├── cleanup.sh
│   └── install.sh
└── ubuntu.pkr.hcl

Building an image

You need to run make deps first to ensure the QEMU plugin is installed. Then just make or make build will initiate the build process, including generating the cidata.iso image if it hasn’t already been built.

The finished image lands in an output folder named after the Ubuntu codename and the build timestamp.

├── output-noble-20241128074340
│   └── ubuntu-noble.img

You can now push that to your hosts and deploy it.
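Before copying it anywhere, you can sanity-check the artifact with qemu-img (from the qemu-utils package); the output directory name will differ on your machine, as it carries your own timestamp:

```shell
# Report the format, virtual size and on-disk size of the newest
# build output, if one exists and qemu-img is available.
img=$(ls -d output-*/ubuntu-*.img 2>/dev/null | head -n 1)
if [ -n "$img" ] && command -v qemu-img >/dev/null 2>&1; then
    qemu-img info "$img"
fi
```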

Summary

So, we’ve:

  • Set up an environment to build a VM image.
  • Built an image.

I’ll cover deployment another time.