Towards High Availability
To make my new two-node Proxmox cluster highly available, I need shared storage for the VMs and a quorum in the cluster.
Shared storage is currently available as an NFS mount from the QNAP, but my goal is to retire the QNAP and move its two TB disks into the first Proxmox node.
There are a number of ways to do this, but I chose to use GlusterFS volumes backed by ZFS. ZFS provides the datasets on which the Gluster bricks reside, and the Gluster volumes can be replicated across both nodes. PVE can use Gluster volumes directly as storage.
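The playbook below assumes a ZFS pool already exists on each node; it is referenced through the gluster_zpool variable. As an illustration only (the pool name and disk IDs are placeholders, not my actual layout), such a pool could be created with:
# zpool create tank mirror /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2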
The Ansible playbook I created for this performs the following tasks:
- Install the GlusterFS packages
- Build a node list from the hosts in the playbook run
- Start the glusterd service
- Configure the Gluster peers
- Create the parent ZFS dataset
- For each volume, create a brick dataset
- Configure the volume
- name: Proxmox
  hosts:
    - node1
    - node2
  tasks:
    - name: Install GlusterFS tools
      apt:
        name: glusterfs-server
        state: latest

    - name: Determine Gluster nodes
      set_fact:
        nodelist: "{{ (nodelist | default([])) + [hostvars[item].ansible_host] }}"
      loop: "{{ ansible_play_hosts }}"

    - debug:
        msg: "{{ nodelist | join(',') }}"

    - name: Enable Glusterd service
      ansible.builtin.systemd:
        name: glusterd
        state: started
        enabled: yes

    - name: Create a Gluster storage pool
      gluster.gluster.gluster_peer:
        state: present
        nodes: "{{ nodelist }}"
      run_once: true

    - name: Create bricks parent ZFS dataset
      community.general.zfs:
        name: "{{ gluster_zpool }}/bricks"
        extra_zfs_properties:
          mountpoint: /mnt/bricks
        state: present

    - name: Create brick - b1
      community.general.zfs:
        name: "{{ gluster_zpool }}/bricks/b1"
        state: present

    - name: Create Gluster test volume
      gluster.gluster.gluster_volume:
        state: present
        name: test
        bricks: /mnt/bricks/b1/test
        replicas: 2
        cluster: "{{ nodelist }}"
      run_once: true

    - name: Create ISO volume
      gluster.gluster.gluster_volume:
        state: present
        name: iso
        bricks: /mnt/bricks/b1/iso12
        replicas: 2
        cluster: "{{ nodelist }}"
      run_once: true

    - name: Create brick - b2
      community.general.zfs:
        name: "{{ gluster_zpool }}/bricks/b2"
        state: present

    - name: Create replicated VM data volume
      gluster.gluster.gluster_volume:
        state: present
        name: vmdata-replicated
        bricks: /mnt/bricks/b2/vmdata-replicated
        replicas: 2
        cluster: "{{ nodelist }}"
      run_once: true
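With both nodes in the inventory, the playbook is run once against the pair, passing the pool name in as a variable (the inventory and playbook file names here are examples):
ansible-playbook -i hosts.yml proxmox-gluster.yml -e gluster_zpool=tank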
Once the Gluster volumes are created, they can be added to the Proxmox Storage Manager. In my case, I created a test volume, a volume for storing ISO images, and a volume for storing VM disks and container templates. Both nodes in the cluster can access these replicated volumes, so VMs can run on either node.
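The volumes can be added through the GUI, or from the command line with pvesm. A sketch for the VM data volume (the storage ID and content type here are examples, not necessarily what I used):
# pvesm add glusterfs vmdata --server node1 --server2 node2 --volume vmdata-replicated --content images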
I’m in the process of moving everything stored in local storage to one of these shared volumes.
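Moving an individual VM disk from local storage to a shared volume can be done online with qm; the VM ID, disk name, and target storage below are examples only:
# qm move_disk 100 scsi0 vmdata --delete 1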
QDevice
To gain the third vote for the quorum, I decided to use a Raspberry Pi I had sitting around doing nothing as the external QDevice.
Once the Pi was reachable via SSH and root could log in with a password, I needed to install the corosync-qdevice package on all Proxmox nodes and the corosync-qnetd package on the Pi.
- name: Install Corosync Qdevice tools
  apt:
    name: corosync-qdevice
    state: latest
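On the Pi itself, the package to install is the QNet daemon rather than the qdevice client:
# apt install corosync-qnetd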
Then from one of the nodes, execute:
# pvecm qdevice setup 192.168.1.x
The command prompts for the root password of the QDevice and installs the cluster certificates. Now check the cluster status:
# pvecm status
Cluster information
-------------------
Name: home
Config Version: 3
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Sat Jul 17 10:14:07 2021
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 0x00000001
Ring ID: 1.30
Quorate: Yes
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate Qdevice
Membership information
----------------------
Nodeid Votes Qdevice Name
0x00000001 1 A,V,NMW node1 (local)
0x00000002 1 A,V,NMW node2
0x00000000 1 Qdevice
Should one of the nodes go down, the QDevice's vote keeps the surviving node at two of the three expected votes, so the cluster remains quorate.