Initializing a k3s cluster with Ansible
In a previous post, I described how I used Terraform and Ansible to provision Debian VMs on Proxmox. In that example, the Terraform project provisioned three master nodes and three worker nodes. In this post, I will share the playbook that installs and initializes a k3s cluster, using a floating IP managed by keepalived so that the Kubernetes API remains reachable even when a master node goes down.
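The playbooks below assume an Ansible inventory in which the masters and workers are grouped into master and node groups, combined into a k3s_cluster group. A minimal sketch of such an inventory (hostnames and addresses are placeholders, not my actual values):

[master]
k3s-master-1 ansible_host=10.0.0.11
k3s-master-2 ansible_host=10.0.0.12
k3s-master-3 ansible_host=10.0.0.13

[node]
k3s-node-1 ansible_host=10.0.0.21
k3s-node-2 ansible_host=10.0.0.22
k3s-node-3 ansible_host=10.0.0.23

[k3s_cluster:children]
master
node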
Directory Structure
In the Ansible repository, I have a k3s_cluster role that is self-contained for everything related to the k3s cluster: the cluster itself as well as all of the applications, organized as sub-roles.
- roles/k3s_cluster/k3s/master initializes the master nodes
- roles/k3s_cluster/k3s/node initializes the worker nodes
- roles/k3s_cluster/download downloads the k3s installer
- roles/k3s_cluster/prereq takes care of any pre-install configuration tasks
- inventory/group_vars/k3s_cluster contains variables which apply to the entire cluster
There are many other sub-roles, one for each application deployed onto the cluster. I will cover each one in future posts.
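Laid out on disk, the pieces covered in this post look roughly like this (application sub-roles omitted):

roles/k3s_cluster/
├── prereq/tasks/          # main.yml, Debian.yml, storage.yml
├── download/tasks/        # main.yml
└── k3s/
    ├── master/            # tasks, templates, handlers
    └── node/              # tasks, templates
inventory/
└── group_vars/k3s_cluster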
Pre-requisites Role
The prereq role can be expanded to include tasks specific to whatever Linux distribution is used. I've only included the Debian tasks because that's what I used.
The cluster uses Longhorn for storage and I chose to create a separate virtual disk, mounted as its own filesystem, for the Longhorn data. The role initializes and mounts this filesystem.
roles/k3s_cluster/prereq/tasks/main.yml:
---
- name: execute OS related tasks
include_tasks: "{{ item }}"
with_first_found:
- "{{ detected_distribution }}-{{ detected_distribution_major_version }}.yml"
- "{{ detected_distribution }}.yml"
- "{{ ansible_distribution }}-{{ ansible_distribution_major_version }}.yml"
- "{{ ansible_distribution }}.yml"
- "default.yml"
- name: Configure storage
include_tasks: "storage.yml"
- name: Enable IPv4 forwarding
sysctl:
name: net.ipv4.ip_forward
value: "1"
state: present
reload: yes
- name: Enable IPv6 forwarding
sysctl:
name: net.ipv6.conf.all.forwarding
value: "1"
state: present
reload: yes
- name: Add /usr/local/bin to sudo secure_path
lineinfile:
line: "Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin"
regexp: "Defaults(\\s)*secure_path(\\s)*="
state: present
insertafter: EOF
path: /etc/sudoers
validate: "visudo -cf %s"
  when: ansible_distribution in ['CentOS', 'RedHat', 'Debian']
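Note that detected_distribution and detected_distribution_major_version are not defined anywhere in this post, so the lookup falls through to the standard ansible_distribution facts. A hypothetical sketch of how those variables could be set in an earlier task, for example to treat Raspbian differently from plain Debian:

- name: Detect Raspbian
  set_fact:
    detected_distribution: Raspbian
  when: ansible_facts.lsb.description | default("") is match("[Rr]aspbian.*")
- name: Detect distribution major version
  set_fact:
    detected_distribution_major_version: "{{ ansible_facts.lsb.major_release }}"
  when: ansible_facts.lsb is defined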
roles/k3s_cluster/prereq/tasks/Debian.yml:
---
- name: Timezone Data
apt:
name: tzdata
state: latest
- name: Set Timezone
community.general.timezone:
    name: "{{ timezone }}"
- name: Flush iptables
iptables:
flush: true
changed_when: false # iptables flush always returns changed
- name: Changing to iptables-legacy # nftables not currently supported
alternatives:
path: /usr/sbin/iptables-legacy
name: iptables
register: ip4_iptables
- name: Changing to ip6tables-legacy # nftables not currently supported
alternatives:
path: /usr/sbin/ip6tables-legacy
name: ip6tables
register: ip6_iptables
# Install aptitude with apt-get
- name: Install aptitude
apt: name='aptitude' update_cache=yes state=latest force_apt_get=yes
# Update apt cache and upgrade all packages to the latest version
- name: Update apt-cache
apt: update_cache=yes
- name: Python3
apt:
name: python3
state: latest
- name: Pip3
apt:
name: python3-pip
state: latest
# Dependencies required for Longhorn
- name: parted
apt:
name: parted
state: latest
- name: iscsi Tools
apt:
name: open-iscsi
state: latest
- name: NFS client
apt:
name: nfs-common
state: latest
- name: jq
apt:
name: jq
state: latest
- name: kubectl
apt:
name: kubectl
state: latest
# Dependencies for helm plugin
- name: git
apt:
name: git
state: latest
# Upgrade everything to latest (use full upgrade)
- name: Upgrade packages to the latest version
apt: upgrade=full update_cache=yes
- name: Free up disk space
command: apt clean
become: yes
The storage tasks configure the data disk that was provisioned with Terraform, mount it, and prepare it to be used by Longhorn.
roles/k3s_cluster/prereq/tasks/storage.yml:
- name: "Get disk alignment for disks"
shell: |
if
[[ -e /sys/block/{{ storage_disk }}/queue/optimal_io_size && -e /sys/block/{{ storage_disk }}/alignment_offset && -e /sys/block/{{ storage_disk }}/queue/physical_block_size ]];
then
echo $[$(( ($(cat /sys/block/{{ storage_disk }}/queue/optimal_io_size) + $(cat /sys/block/{{ storage_disk }}/alignment_offset)) / $(cat /sys/block/{{ storage_disk }}/queue/physical_block_size) )) | 2048];
else
echo 2048;
fi
args:
executable: "/bin/bash"
register: disk_offset
- name: "Partition storage disk"
shell: |
if
[ -b /dev/{{ storage_disk }} ]
then
[ -b /dev/{{ storage_disk_part | default(storage_disk + "1") }} ] || parted -a optimal --script "/dev/{{ storage_disk }}" mklabel gpt mkpart primary {{ disk_offset.stdout|default("2048") }}s 100% && sleep 5 && partprobe /dev/{{ storage_disk }}; sleep 5
fi
args:
executable: "/bin/bash"
- name: "Create filesystem on the volume"
filesystem:
dev: '/dev/{{ storage_disk_part | default(storage_disk + "1") }}'
force: "no"
fstype: "{{ storage_filesystem }}"
- name: "Ensure the mount directory exists"
file:
path: "{{ storage_mountpoint }}"
owner: "root"
group: "root"
state: directory
- name: "Get UUID for volume"
command: 'blkid -s UUID -o value /dev/{{ storage_disk_part | default(storage_disk + "1") }}'
register: disk_blkid
changed_when: False
check_mode: no
- name: "Mount additional disk"
mount:
path: "{{ storage_mountpoint }}"
fstype: "{{ storage_filesystem }}"
passno: "0"
src: "UUID={{ disk_blkid.stdout }}"
state: "mounted"
- name: "Ensure the permissions are set correctly"
file:
path: "{{ storage_mountpoint }}"
owner: "root"
group: "root"
mode: "0755"
state: directory
roles/k3s_cluster/download/tasks/main.yml:
- name: Download k3s binary x64
get_url:
url: https://github.com/k3s-io/k3s/releases/download/{{ k3s_version }}/k3s
checksum: sha256:https://github.com/k3s-io/k3s/releases/download/{{ k3s_version }}/sha256sum-amd64.txt
dest: /usr/local/bin/k3s
owner: root
group: root
mode: 0755
when: ansible_facts.architecture == "x86_64"
- name: Download k3s binary arm64
get_url:
url: https://github.com/k3s-io/k3s/releases/download/{{ k3s_version }}/k3s-arm64
checksum: sha256:https://github.com/k3s-io/k3s/releases/download/{{ k3s_version }}/sha256sum-arm64.txt
dest: /usr/local/bin/k3s
owner: root
group: root
mode: 0755
when:
- ( ansible_facts.architecture is search("arm") and
ansible_facts.userspace_bits == "64" ) or
ansible_facts.architecture is search("aarch64")
- name: Download k3s binary armhf
get_url:
url: https://github.com/k3s-io/k3s/releases/download/{{ k3s_version }}/k3s-armhf
checksum: sha256:https://github.com/k3s-io/k3s/releases/download/{{ k3s_version }}/sha256sum-arm.txt
dest: /usr/local/bin/k3s
owner: root
group: root
mode: 0755
when:
- ansible_facts.architecture is search("arm")
- ansible_facts.userspace_bits == "32"
inventory/group_vars/k3s_cluster:
timezone: "America/Chicago"
systemd_dir: /etc/systemd/system
storage_disk: "sdb"
storage_filesystem: "ext4"
storage_mountpoint: "/var/lib/longhorn"
k3s_version: v1.24.4+k3s1
master_ip: "{{ hostvars[groups['master'][0]]['ansible_host'] | default(groups['master'][0]) }}"
master_vip: xxx.xxx.xxx.xxx
k3s_token: long_password
k3s Master Role
The k3s master role differs from the worker node role in that it configures keepalived and a floating IP, which provide high availability for the Kubernetes API backed by the embedded etcd. The first start of the k3s service on the first node must initialize the cluster; each additional master node must, on its first start, join the first node. After this bootstrap, the service is reconfigured to start without the --cluster-init parameter. All master nodes receive updates to the cluster configuration, and whichever node keepalived decides should hold the floating IP handles the Kubernetes API calls.
In order to differentiate between a first-time run of the role and subsequent runs, the k3s-init-cluster.yml playbook uses set_fact to set k3s_bootstrap_cluster to true.
roles/k3s_cluster/k3s/master/tasks/main.yml:
---
- name: Configure Floating IP
ansible.builtin.template:
src: templates/k3s-vip.j2
dest: /etc/network/interfaces.d/60-k3s-vip-ip
owner: root
group: root
mode: "0644"
- name: network_restart
service:
name: "networking"
state: restarted
ignore_errors: yes
- name: Install keepalived
ansible.builtin.apt:
name: keepalived
state: latest
- name: Configure Keepalived
ansible.builtin.template:
src: templates/keepalived-conf.j2
dest: /etc/keepalived/keepalived.conf
owner: root
group: root
mode: "0644"
notify:
- keepalived_restart
- name: Bootstrap cluster
block:
- name: Copy K3s cluster bootstrap first service file
register: k3s_service
ansible.builtin.template:
src: "templates/k3s-bootstrap-first.service.j2"
dest: "{{ systemd_dir }}/k3s-bootstrap.service"
owner: root
group: root
mode: 0644
when: inventory_hostname == groups.master[0]
- name: Wait for service to start
ansible.builtin.pause:
seconds: 30
run_once: yes
- name: Copy K3s cluster bootstrap followers service file
register: k3s_service
ansible.builtin.template:
src: "templates/k3s-bootstrap-followers.service.j2"
dest: "{{ systemd_dir }}/k3s-bootstrap.service"
owner: root
group: root
mode: 0644
when: inventory_hostname != groups.master[0]
    - name: Start K3s service bootstrap /1
      ansible.builtin.systemd:
        name: k3s-bootstrap
        daemon_reload: yes
        enabled: no
        state: started
      register: result
      retries: 3
      delay: 3
      until: result is not failed
      when: inventory_hostname == groups.master[0]
- name: Wait for service to start
ansible.builtin.pause:
seconds: 30
run_once: yes
    - name: Start K3s service bootstrap /2
      ansible.builtin.systemd:
        name: k3s-bootstrap
        daemon_reload: yes
        enabled: no
        state: started
      register: result
      retries: 3
      delay: 3
      until: result is not failed
      when: inventory_hostname != groups.master[0]
- name: Wait for service to start
ansible.builtin.pause:
seconds: 30
run_once: yes
- name: Stop K3s service bootstrap
systemd:
name: k3s-bootstrap
daemon_reload: no
enabled: no
state: stopped
- name: Remove K3s service bootstrap
file:
path: /etc/systemd/system/k3s-bootstrap.service
state: absent
  when: k3s_bootstrap_cluster
# The same unit file is deployed on every master, so there is no need to split this by host
- name: Copy K3s cluster service file
  register: k3s_service
  ansible.builtin.template:
    src: "k3s.service.j2"
    dest: "{{ systemd_dir }}/k3s.service"
    owner: root
    group: root
    mode: 0644
- name: Copy K3s service environment file
register: k3s_service_env
template:
src: "k3s.service.env.j2"
dest: "{{ systemd_dir }}/k3s.service.env"
owner: root
group: root
mode: 0644
- name: Enable and check K3s service
systemd:
name: k3s
daemon_reload: yes
state: restarted
enabled: yes
- name: Wait for node-token
wait_for:
path: /var/lib/rancher/k3s/server/node-token
run_once: yes
- name: Register node-token file access mode
stat:
path: /var/lib/rancher/k3s/server
register: p
run_once: yes
- name: Change file access node-token
file:
path: /var/lib/rancher/k3s/server
mode: "g+rx,o+rx"
run_once: yes
- name: Read node-token from master
slurp:
src: /var/lib/rancher/k3s/server/node-token
register: node_token
run_once: yes
- name: Store Master node-token
set_fact:
token: "{{ node_token.content | b64decode | regex_replace('\n', '') }}"
run_once: yes
- name: Restore node-token file access
file:
path: /var/lib/rancher/k3s/server
mode: "{{ p.stat.mode }}"
run_once: yes
- name: Create directory .kube
file:
path: ~{{ ansible_user }}/.kube
state: directory
owner: "{{ ansible_user }}"
mode: "u=rwx,g=rx,o="
- name: Change k3s.yaml permissions to 644
  file:
    path: /etc/rancher/k3s/k3s.yaml
    owner: "{{ ansible_user }}"
    mode: "644"
# The next task rewrites ~/.kube/config, so the generated kubeconfig must be copied there first
- name: Copy k3s.yaml to user kubeconfig
  copy:
    src: /etc/rancher/k3s/k3s.yaml
    dest: ~{{ ansible_user }}/.kube/config
    remote_src: yes
    owner: "{{ ansible_user }}"
    mode: "0600"
- name: Replace https://localhost:6443 by https://master-vip:6443
command: >-
k3s kubectl config set-cluster default
--server=https://{{ master_vip }}:6443
--kubeconfig ~{{ ansible_user }}/.kube/config
changed_when: true
- name: Create kubectl symlink
file:
src: /usr/local/bin/k3s
dest: /usr/local/bin/kubectl
state: link
- name: Create crictl symlink
file:
src: /usr/local/bin/k3s
dest: /usr/local/bin/crictl
state: link
- name: Check if helm is installed at /usr/local/bin/helm
stat:
path: /usr/local/bin/helm
register: helm_check
- name: Download get-helm-3
get_url:
url: https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
dest: ~/get-helm-3.sh
mode: "700"
when: not helm_check.stat.exists
- name: Install helm if not present
command: >-
~/get-helm-3.sh
when: not helm_check.stat.exists
changed_when: true
- name: Install Helm diff plugin
kubernetes.core.helm_plugin:
plugin_path: https://github.com/databus23/helm-diff
state: present
roles/k3s_cluster/k3s/master/templates/k3s-bootstrap-first.service.j2:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s server --cluster-init --token {{ k3s_token }} {{ extra_server_args | default("") }}
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
roles/k3s_cluster/k3s/master/templates/k3s-bootstrap-followers.service.j2:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s server --server https://{{ master_ip }}:6443 --token {{ k3s_token }} {{ extra_server_args | default("") }}
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
roles/k3s_cluster/k3s/master/templates/k3s-vip.j2:
auto eth0:1
iface eth0:1 inet static
address {{ master_vip }}/32
roles/k3s_cluster/k3s/master/templates/k3s.service.env.j2 is currently empty, but it is deployed anyway in case I later want to set additional environment variables for the service.
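As a hypothetical example of what could go in it, k3s picks up K3S_* variables and proxy settings from its environment, so a populated k3s.service.env might look like this (all values are placeholders):

K3S_KUBECONFIG_MODE=644
HTTP_PROXY=http://proxy.example.com:3128
HTTPS_PROXY=http://proxy.example.com:3128
NO_PROXY=10.0.0.0/8,127.0.0.1,localhost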
roles/k3s_cluster/k3s/master/templates/k3s.service.j2:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s server {{ extra_server_args | default("") }}
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
roles/k3s_cluster/k3s/master/templates/keepalived-conf.j2:
global_defs {
script_user root
default_interface eth0
enable_script_security
}
vrrp_script apiserver {
script "/usr/bin/curl -s -k https://localhost:6443/healthz -o /dev/null"
interval 20
timeout 5
rise 1
fall 1
user root
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 50
priority 50
advert_int 1
authentication {
auth_type PASS
auth_pass 9998
}
track_script {
apiserver
}
virtual_ipaddress {
{{ master_vip }} label eth0:VIP
}
}
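Since keepalived adds the virtual IP with the eth0:VIP label, a quick manual check of which master currently holds it is:

ip -4 addr show dev eth0 label eth0:VIP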
roles/k3s_cluster/k3s/master/handlers/main.yml:
- name: network_restart
service:
name: "networking"
state: restarted
ignore_errors: yes
- name: keepalived_restart
service:
name: "keepalived"
state: restarted
ignore_errors: yes
k3s Worker Node Role
The worker node role is similar to the master role, but without all of the extra initialization steps. It simply starts the k3s agent using the floating IP of the cluster masters.
roles/k3s_cluster/k3s/node/tasks/main.yml:
---
- name: Copy K3s service file
template:
src: "k3s.service.j2"
dest: "{{ systemd_dir }}/k3s-node.service"
owner: root
group: root
mode: 0644
- name: Copy K3s service environment file
  register: k3s_service_env
template:
src: "k3s.service.env.j2"
dest: "{{ systemd_dir }}/k3s.service.env"
owner: root
group: root
mode: 0644
- name: Enable and check K3s service
systemd:
name: k3s-node
daemon_reload: yes
state: restarted
enabled: yes
roles/k3s_cluster/k3s/node/templates/k3s-node.service.j2:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s agent --server https://{{ master_vip }}:6443 --token {{ k3s_token }} {{ extra_agent_args | default("") }}
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
Just like the master role, roles/k3s_cluster/k3s/node/templates/k3s.service.env.j2 is currently empty.
Playbooks
There are two playbooks used with this role. The k3s-init-cluster.yml playbook is run once, right after the VMs are provisioned, to set up the initial cluster. The k3s-cluster.yml playbook is identical except that k3s_bootstrap_cluster is set to false, and it can be run for any subsequent changes to the cluster nodes. Both playbooks create a .kube directory in the Ansible repository to allow command-line access (kubectl and helm) from the Ansible host.
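k3s-cluster.yml itself is not shown here; per the description above, the only difference is the value of the fact set in pre_tasks:

- name: Set initialize cluster fact
  set_fact:
    k3s_bootstrap_cluster: false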
k3s-init-cluster.yml:
---
- hosts: k3s_cluster
gather_facts: yes
become: yes
roles:
- role: debian-cloud
- role: debian
- role: k3s_cluster/prereq
- role: k3s_cluster/download
- hosts: master
remote_user: ansible
become: yes
vars:
ansible_python_interpreter: /usr/bin/python3
pre_tasks:
- name: Install Kubernetes Python module
pip:
name: kubernetes
- name: Install Kubernetes-validate Python module
pip:
name: kubernetes-validate
    - name: Set initialize cluster fact
      set_fact:
        k3s_bootstrap_cluster: true
roles:
- role: k3s_cluster/k3s/master
- hosts: node
remote_user: ansible
become: yes
roles:
- role: k3s_cluster/k3s/node
- hosts: master
remote_user: ansible
become: yes
tasks:
- name: Fetch ca cert
ansible.builtin.fetch:
src: /var/lib/rancher/k3s/server/tls/server-ca.crt
dest: .kube/k3s/server-ca.crt
flat: yes
run_once: yes
- name: Fetch admin cert
ansible.builtin.fetch:
src: /var/lib/rancher/k3s/server/tls/client-admin.crt
dest: .kube/k3s/client-admin.crt
flat: yes
run_once: yes
- name: Fetch admin key
ansible.builtin.fetch:
src: /var/lib/rancher/k3s/server/tls/client-admin.key
dest: .kube/k3s/client-admin.key
flat: yes
run_once: yes
- name: Configure k3s cluster
command: >-
kubectl config --kubeconfig=.kube/config-k3s set-cluster k3s
--server=https://{{ master_vip }}:6443
--certificate-authority=.kube/k3s/server-ca.crt
run_once: yes
become: no
delegate_to: localhost
- name: Configure k3s user
command: >-
kubectl config --kubeconfig=.kube/config-k3s set-credentials k3s-admin
--client-certificate=.kube/k3s/client-admin.crt
--client-key=.kube/k3s/client-admin.key
run_once: yes
become: no
delegate_to: localhost
- name: Configure k3s context
command: >-
kubectl config --kubeconfig=.kube/config-k3s set-context k3s
--cluster=k3s
--namespace=default
--user=k3s-admin
run_once: yes
become: no
delegate_to: localhost
- hosts: master
become: yes
remote_user: ansible
vars:
ansible_python_interpreter: /usr/bin/python3
pre_tasks:
- name: Install Kubernetes Python module
pip:
name: kubernetes
- name: Install Kubernetes-validate Python module
pip:
name: kubernetes-validate
roles:
- role: k3s_cluster/metallb
- role: k3s_cluster/longhorn
- role: k3s_cluster/traefik
run_once: yes
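With the inventory in place, bootstrapping the cluster and then verifying access from the Ansible host looks roughly like this (the inventory file path is an assumption):

ansible-playbook -i inventory/hosts.ini k3s-init-cluster.yml
kubectl --kubeconfig .kube/config-k3s --context k3s get nodes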
Next steps
The last play in the playbook applies the metallb, longhorn, and traefik roles. These roles deploy and/or configure the cluster load balancer, cluster storage, and cluster ingress controller. I will cover each of them in detail in my next post.