Scott's Weblog The weblog of an IT pro focusing on cloud computing, Kubernetes, Linux, containers, and networking

Pausing Cluster API Reconciliation

Cluster API is a topic I’ve discussed here in a number of posts. If you’re not already familiar with Cluster API (also known as CAPI), I’d encourage you to check out my introductory post on Cluster API first; you can also visit the official Cluster API site for more details. In this short post, I’m going to show you how to pause the reconciliation of Cluster API cluster objects, a task that may be necessary for a variety of reasons (including backing up the Cluster API objects in your management cluster).

Since CAPI leverages Kubernetes-style APIs to manage Kubernetes cluster lifecycle, the idea of reconciliation is very important—it’s a core Kubernetes concept that isn’t at all specific to CAPI. This article on level triggering and reconciliation in Kubernetes is a great article that helps explain reconciliation, as well as a lot of other key concepts about how Kubernetes works.

When reconciliation is active, the controllers involved in CAPI are constantly evaluating desired state and actual state, and then reconciling differences between the two. There may be times when you need to pause this reconciliation loop. Fortunately, CAPI makes this pretty easy: there is a paused field that allows users to pause the reconciliation loop (see here in the v1alpha3 CAPI reference).

This field is optional, which you can observe using kubectl with an existing workload cluster. For example, here is the relevant output from kubectl get cluster workload1 against a CAPI management cluster; note the paused field is not present (i.e., reconciliation is currently active):

spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
  controlPlaneEndpoint:
    host: workload1-apiserver-1234567890.us-east-2.elb.amazonaws.com
    port: 6443
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
    kind: KubeadmControlPlane
    name: workload1-control-plane
    namespace: workload1
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AWSCluster
    name: workload1
    namespace: workload1

Reconcilation is active when the field isn’t present; setting the field to false is essentially the same as removing the field. To pause reconciliation, set it to true. One way of doing this is using kubectl patch, like this:

kubectl path cluster workload1 --type merge \
-p '{"spec":{"paused": true}}'

Naturally, you’d want to change workload1 to the name of the workload cluster for which you want reconciliation paused. After running the command, you can then run kubectl get cluster <cluster-name> -o yaml and verify the paused field is present:

spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
  controlPlaneEndpoint:
    host: workload1-apiserver-1234567890.us-east-2.elb.amazonaws.com
    port: 6443
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
    kind: KubeadmControlPlane
    name: workload1-control-plane
    namespace: workload1
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AWSCluster
    name: workload1
    namespace: workload1
  paused: true

If you check the logs for the CAPI controllers, you’ll see messages like this:

I1125 17:31:40.287717       1 machine_controller.go:170] controllers/Machine "msg"="Reconciliation is paused for this object" "machine"="workload1-md-0-58c6df55bc-cqt86" "namespace"="workload1"

When reconciliation is paused, the CAPI controllers will not make changes to the workload cluster in order to reconcile differences between actual state and desired state.

When you’re ready to resume reconciliation again, just change the field to false:

kubectl path cluster workload1 --type merge \
-p '{"spec":{"paused": false}}'

This will remove the field from the spec, allowing the CAPI controllers to resume reconciliation of the CAPI-related objects (like Clusters, Machines, and MachineDeployments).

As I mentioned at the start of this post, pausing CAPI reconciliation may be necessary in a few different situations. (Backing up Cluster API is one situation.)

If anyone has any questions, or if you spot an error in this post, please let me know. Hit me on Twitter; I’d love to hear from you!

Metadata and Navigation

Be social and share this post!