Deploy CCM via Ansible before workers join to fix external cloud provider
Some checks failed
Deploy Cluster / Terraform (push) Successful in 31s
Deploy Cluster / Ansible (push) Failing after 1m48s

This fixes the chicken-and-egg problem where workers with
--kubelet-arg=cloud-provider=external couldn't join because CCM wasn't
running yet to remove the node.cloudprovider.kubernetes.io/uninitialized taint.
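
One way to confirm the taint has been cleared, sketched as an Ansible task run on control_plane[0] after the workers join. This is illustrative only and not part of this commit: the task name, the `node_taints` variable, and the retry values are assumptions, and it presumes the `k3s` binary is on the PATH of that host.

```yaml
# Hypothetical check, not part of the committed roles.
# Lists every taint key on every node and retries until the
# "uninitialized" taint no longer appears anywhere.
- name: Wait until no node carries the uninitialized taint
  ansible.builtin.command:
    argv:
      - k3s
      - kubectl
      - get
      - nodes
      - -o
      - "jsonpath={.items[*].spec.taints[*].key}"
  register: node_taints        # assumed variable name
  changed_when: false          # read-only query
  retries: 30                  # assumed: up to about 5 minutes in total
  delay: 10
  until: "'node.cloudprovider.kubernetes.io/uninitialized' not in node_taints.stdout"
```

If the task runs out of retries, the CCM is probably not running or cannot reach the Hetzner API, which is the failure mode described above.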

Changes:
- Create ansible/roles/ccm-deploy/ to deploy CCM via Helm during Ansible phase
- Reorder site.yml: CCM deploys after secrets but before workers join
- CCM runs on control_plane[0] with proper tolerations for control plane nodes
- Add a 10s pause after CCM reports ready to give it time to start processing new nodes
- Workers can now successfully join with external cloud provider enabled

Flux still manages CCM for updates, but initial install happens in Ansible.
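
For orientation, the changes listed above could look roughly like the following tasks file. This is a minimal sketch, not the committed ansible/roles/ccm-deploy/ role: it assumes the kubernetes.core collection and the helm binary are available on control_plane[0], that the release is named hcloud-cloud-controller-manager in kube-system, and that the Hetzner API token secret was already created by addon-secrets-bootstrap. Chart values such as the control plane tolerations are left out.

```yaml
# Hypothetical ansible/roles/ccm-deploy/tasks/main.yml (illustrative only).
- name: Add the Hetzner Helm repository
  kubernetes.core.helm_repository:
    name: hcloud
    repo_url: https://charts.hetzner.cloud

- name: Install the Hetzner CCM chart and wait for it to become ready
  kubernetes.core.helm:
    name: hcloud-cloud-controller-manager    # assumed release name
    chart_ref: hcloud/hcloud-cloud-controller-manager
    release_namespace: kube-system           # assumed namespace
    kubeconfig: /etc/rancher/k3s/k3s.yaml    # default K3s kubeconfig path
    wait: true

# Matches the 10s pause described in the change list.
- name: Give the CCM a moment before workers start joining
  ansible.builtin.pause:
    seconds: 10
```

Because Flux takes over the release afterwards, the release name and namespace used here would have to match whatever the Flux HelmRelease expects. Otherwise Flux would install a second copy instead of adopting this one.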
2026-03-22 23:58:03 +00:00
parent cadfedacf1
commit 31b82c9371
2 changed files with 76 additions and 7 deletions

@@ -49,6 +49,20 @@
         dest: ../outputs/kubeconfig
         flat: true
 
+- name: Bootstrap addon prerequisite secrets
+  hosts: control_plane[0]
+  become: true
+
+  roles:
+    - addon-secrets-bootstrap
+
+- name: Deploy Hetzner CCM (required for workers with external cloud provider)
+  hosts: control_plane[0]
+  become: true
+
+  roles:
+    - ccm-deploy
+
 - name: Setup secondary control planes
   hosts: control_plane[1:]
   become: true
@@ -75,13 +89,6 @@
   roles:
     - k3s-agent
 
-- name: Bootstrap addon prerequisite secrets
-  hosts: control_plane[0]
-  become: true
-
-  roles:
-    - addon-secrets-bootstrap
-
 - name: Deploy observability stack
   hosts: control_plane[0]
   become: true