Longhorn provides block storage for a Kubernetes cluster, provisioned and managed with containers and microservices. It manages the disk devices on the nodes and pools them for Kubernetes persistent volumes (PVs), which are replicated and distributed across the nodes in the cluster. A single node or disk failure therefore doesn't take down a volume; it remains accessible from the other nodes. Longhorn also supports snapshots and backups to external storage such as NFS shares and S3 buckets.
Since I'm already using Rancher to manage my cluster, it was a straightforward process to install Longhorn as a catalog app. I created a new Rancher project called "Storage" with a namespace called "longhorn-system" in which to install it.
This process sets up the system itself, including the storage provisioner, so that persistent volume claims can be fulfilled by the longhorn storage class. To configure the storage itself and create volumes manually, I needed to create an ingress to the Longhorn web interface. It uses basic authentication, so following the instructions I created an auth file using:
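Something along these lines, following the Longhorn documentation (the username and password here are placeholders; substitute your own):

```shell
# Generate an htpasswd-style auth file using openssl's apr1 hash
# (USER and PASSWORD are placeholder values)
USER=admin
PASSWORD=changeme
echo "${USER}:$(openssl passwd -apr1 "${PASSWORD}")" > auth
```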
Then create the secret from the file:
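Roughly like so, with the secret landing in the longhorn-system namespace (the secret name basic-auth is an assumption matching the ingress annotation):

```shell
# Create the basic-auth secret from the auth file for the ingress to reference
kubectl -n longhorn-system create secret generic basic-auth --from-file=auth
```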
Following my previous pattern, I created an Ansible role which deploys the ingress manifest.
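The manifest itself would look roughly like this; the hostname is a placeholder and the annotations follow the Longhorn documentation for basic auth with the NGINX ingress controller:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: longhorn-ingress
  namespace: longhorn-system
  annotations:
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
spec:
  rules:
    - host: longhorn.example.com   # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: longhorn-frontend
                port:
                  number: 80
```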
I was able to log in to the Longhorn UI, set up the storage pool, and manually create a volume. I wanted to test things out and investigate how I could migrate existing persistent volumes from my current provisioner (nfs-provisioner-nfs-subdir-external-provisioner) to Longhorn volumes.
The answer I found was a batch job which would mount the old and new volumes and perform a copy. I used my Valheim server persistent volume as a test.
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  namespace: games
  name: volume-migration-valheim
spec:
  completions: 1
  parallelism: 1
  backoffLimit: 3
  template:
    metadata:
      name: volume-migration
      labels:
        name: volume-migration
    spec:
      restartPolicy: Never
      containers:
        - name: volume-migration
          image: ubuntu:xenial
          tty: true
          command: [ "/bin/sh" ]
          args: [ "-c", "cp -r -v /mnt/old /mnt/new" ]
          volumeMounts:
            - name: old-vol
              mountPath: /mnt/old
            - name: new-vol
              mountPath: /mnt/new
      volumes:
        - name: old-vol
          persistentVolumeClaim:
            claimName: valheim-server-data
        - name: new-vol
          persistentVolumeClaim:
            claimName: valheim-test-migration
```
Unfortunately, this did not work as expected. The container the job ran in was not able to mount the new volume. I had noted that Longhorn said it did not support RancherOS since it's end-of-life, but I saw some things that made me think it might work if you a) switch to the Ubuntu console, b) install the open-iscsi package, and c) install the nfs-common package.
It does work in the sense that it's able to provision the volume, but it doesn't seem to be able to mount volumes across nodes via NFS in the container as it's missing some underlying support in the OS. For those not familiar, RancherOS runs entirely as Docker containers and there are actually two Docker engines running (system-docker and userland docker) so the "OS" is actually a container as well.
Unfortunately, this effort is on hold for now. My plan is to add additional nodes to the cluster as standard Debian VMs running Docker, provisioned with Terraform and Proxmox. The idea is to add a node, migrate workloads, then retire the old RancherOS nodes without needing to create an entirely new cluster. I'm only assuming this can be done as I described; I haven't found anyone else who has done this before and documented it for me to DDG.
When I was first setting up my cluster, I limited the Kubernetes Ingress NGINX controller (not to be confused with the NGINX Ingress Controller!) to run on a single node and simply port forwarded to that node's IP address in my router. The Ingress then routed the traffic to the correct node/pod inside the cluster to respond to the request. The problem is that this introduced a single point of failure. It was largely mitigated by running the nodes as VMs on a Proxmox cluster, but still.
Normally, a cloud provider would have a Load Balancer service which would provide the external IP address and route traffic to the correct port and node. Kubernetes doesn't provide this for a bare metal cluster, but MetalLB offers an elegant solution.
In my case, I have a dedicated DMZ VLAN on my home network and I was able to exclude a range of IP addresses from being given out by DHCP. This is the pool of addresses that I configured for layer 2 load balancing.
Again, following my pattern of using Ansible roles to deploy:
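A minimal sketch of such a role's tasks, assuming the manifests live in the role's files/ directory and the kubernetes.core collection is available (file names here are assumptions):

```yaml
# roles/metallb/tasks/main.yml (sketch)
- name: Apply MetalLB namespace manifest
  kubernetes.core.k8s:
    state: present
    src: "{{ role_path }}/files/namespace.yaml"

- name: Apply MetalLB deployment manifest
  kubernetes.core.k8s:
    state: present
    src: "{{ role_path }}/files/metallb.yaml"

- name: Apply layer 2 address pool ConfigMap
  kubernetes.core.k8s:
    state: present
    src: "{{ role_path }}/files/configmap.yaml"
```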
The namespace.yaml and metallb.yaml manifests come directly from the MetalLB Github repo so I only needed to create a configmap specific to my environment:
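It follows the layer 2 example from the MetalLB docs; the address range below is a placeholder standing in for the DHCP-excluded range on my DMZ VLAN:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.x.2-192.168.x.30   # placeholder range excluded from DHCP
```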
The last step was to create a Load Balancer for the Ingress controller (converting from NodePort configuration):
```yaml
apiVersion: v1
kind: Service
spec:
  clusterIP: 10.43.x.x
  clusterIPs:
    - 10.43.x.x
  externalTrafficPolicy: Local
  healthCheckNodePort: 31431
  loadBalancerIP: 192.168.x.5
  ports:
    - name: "80"
      nodePort: 30176
      port: 80
      protocol: TCP
      targetPort: 80
    - name: "443"
      nodePort: 32182
      port: 443
      protocol: TCP
      targetPort: 443
  selector:
    workloadID_ingress-nginx: "true"
  sessionAffinity: None
  type: LoadBalancer
```
One important bit is that externalTrafficPolicy is set to Local, which preserves the source IP of the request. Otherwise, requests would appear to come from the IP of the node which originally received them. Another important bit is that I'm requesting a specific IP address from the pool (loadBalancerIP) so that I can port forward to it in my router. Otherwise, MetalLB would just assign the next available address in the pool.
So, what is going on here?
One of the nodes in the cluster advertises itself as the load balancer IP address and distributes the requests to the ingress controller on the same node. If that node goes down, another node picks up the slack and starts advertising itself instead. So layer 2 mode is not really load balancing, but merely failover, and there can be a delay before the new node starts sending out ARP announcements. In any case, I can get traffic into the cluster and it's nice to have another layer of redundancy.