Toy highly-available Kubernetes cluster on NixOS
MIT License
A recipe for a cluster of virtual machines managed by Terraform, running a highly-available Kubernetes cluster, deployed on NixOS using Colmena.
NixOS provides a Kubernetes module, which is capable of running a master or worker node. The module even provides basic PKI, making running simple clusters easy.

However, HA support is limited (see, for example, this comment and an empty section for "N masters" in the NixOS wiki).
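For reference, a minimal single-node setup with the module and its built-in PKI could look something like this (a sketch, not this project's configuration; see the NixOS manual for the full option set):

```nix
{
  services.kubernetes = {
    # Run both the control plane and a kubelet on one machine.
    roles = [ "master" "node" ];
    masterAddress = "localhost";
    # The module's basic PKI: auto-generates and distributes certificates.
    easyCerts = true;
  };
}
```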
This project serves as an example of using the NixOS Kubernetes module in an advanced way, setting up a cluster that is highly-available on all levels.
External etcd topology, as described by the Kubernetes docs, is implemented. The cluster consists of:

- `etcd` nodes,
- `controlplane` nodes, running `kube-apiserver`, `kube-controller-manager`, and `kube-scheduler`,
- `worker` nodes, running `kubelet`, `kube-proxy`, `coredns`, and a CNI network (currently `flannel`),
- `loadbalancer` nodes, running `keepalived` and `haproxy`.
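On the loadbalancer layer, keepalived typically floats a virtual IP between the loadbalancer nodes while haproxy forwards apiserver traffic as plain TCP. A sketch of what the haproxy side might look like (names and addresses are illustrative assumptions, not this repository's actual configuration):

```haproxy
# Terminate nothing: pass TLS through to the apiservers as raw TCP.
frontend kube-apiserver
    bind *:6443
    mode tcp
    default_backend controlplane

backend controlplane
    mode tcp
    balance roundrobin
    server controlplane0 10.240.0.10:6443 check
    server controlplane1 10.240.0.11:6443 check
    server controlplane2 10.240.0.12:6443 check
```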
There are no `kubectl apply -f foo.yaml` invocations required to get a functional cluster: `k get pods -A` after the cluster is spun up lists zero pods.

Running the virtual machines requires a NixOS host with libvirtd enabled and your user in the `libvirtd` group:

```nix
{
  virtualisation.libvirtd.enable = true;
  users.users."yourname".extraGroups = [ "libvirtd" ];
}
```
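Group changes take effect on your next login. As a quick sanity check (not part of the repo), you can list your groups and look for `libvirtd`:

```shell
# Print the current user's group memberships; "libvirtd" should appear
# after the configuration above is applied and you log in again.
id -nG
```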
You will also need the `10.240.0.0/24` IPv4 subnet available (as in, not used for your home network or similar).

```console
$ nix-shell
$ make-boot-image  # Build the base NixOS image to boot VMs from
$ ter init         # Initialize Terraform modules
$ ter apply        # Create the virtual machines
$ make-certs       # Generate TLS certificates for Kubernetes, etcd, and other daemons
$ colmena apply    # Deploy to your cluster
```
Most of the steps can take several minutes each when running for the first time.
```console
$ ./check.sh                 # Print diagnostic information about the cluster and try to run a simple pod
$ k run --image nginx nginx  # Run a simple pod. `k` is an alias of `kubectl` that uses the generated admin credentials.
```
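Defined by hand, the `k` wrapper could look like the following (a sketch: the kubeconfig path is an assumption, not this repository's actual layout):

```shell
# Hypothetical wrapper: kubectl pinned to the generated admin credentials.
k() {
    kubectl --kubeconfig certs/admin.kubeconfig "$@"
}
```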
The number of servers of each role can be changed by editing `terraform.tfvars` and issuing the following commands afterwards:

```console
$ ter apply      # Spin up or spin down machines
$ make-certs     # Regenerate the certs, as they are tied to machine IPs/hostnames
$ colmena apply  # Redeploy
```
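A `terraform.tfvars` controlling the node counts might look roughly like this (variable names are hypothetical; check the repository's `variables.tf` for the real ones):

```hcl
# Number of VMs per role (illustrative names and values).
etcd_count         = 3
controlplane_count = 3
worker_count       = 2
loadbalancer_count = 2
```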
To tear everything down:

```console
$ ter destroy   # Destroy the virtual machines
$ rm boot/image # Destroy the base image
```
Recreating the machines will leave stale host keys in your `.ssh/known_hosts`. Use `:g/^10.240.0./d` in Vim to clean it up, or use `sed`
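With `sed`, the same cleanup could look like this (demonstrated on a sample file; substitute `~/.ssh/known_hosts` for real use):

```shell
# Sample known_hosts with one stale cluster entry and one unrelated entry.
printf '10.240.0.10 ssh-ed25519 AAAA\ngithub.com ssh-ed25519 BBBB\n' > known_hosts.sample

# Drop every 10.240.0.* line; -i.bak keeps a backup of the original.
sed -i.bak '/^10\.240\.0\./d' known_hosts.sample

cat known_hosts.sample
```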
or similar software of your choice.

Contributions are welcome, although I might reject any that conflict with the project goals. See TODOs in the repo for some rough edges you could work on.
- Make sure the `ci-lint` script succeeds.
- Make sure the `check.sh` script succeeds after deploying a fresh cluster.
Both Kubernetes The Hard Way and Kubernetes The Hard Way on Bare Metal helped me immensely in this project.