Implement HA control plane with Load Balancer (3-3 topology)
Some checks failed
Deploy Cluster / Terraform (push) Failing after 10s
Deploy Cluster / Ansible (push) Has been skipped

Major changes:
- Terraform: Scale to 3 control planes (cx23) + 3 workers (cx33)
- Terraform: Add Hetzner Load Balancer (lb11) for Kubernetes API
- Terraform: Add kube_api_lb_ip output
- Ansible: Add community.network collection to requirements
- Ansible: Update inventory to include LB endpoint
- Ansible: Configure secondary CPs and workers to join via LB
- Ansible: Add k3s_join_endpoint variable for HA joins
- Workflow: Add imports for cp-2, cp-3, and worker-3
- Docs: Update STABLE_BASELINE.md with HA topology and phase gates
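
For illustration, the kube_api_lb_ip output listed above might look roughly like the following. This is a minimal sketch, not the definition from this commit: it assumes the output exposes the load balancer's public IPv4 address, and the description text is invented.

```hcl
# Hypothetical sketch of the kube_api_lb_ip output; the real definition
# is not shown on this page.
output "kube_api_lb_ip" {
  description = "Public IPv4 address of the Kubernetes API load balancer"
  value       = hcloud_load_balancer.kube_api.ipv4
}
```

A value like this could then be read with `terraform output -raw kube_api_lb_ip` and passed to the Ansible inventory as the LB endpoint.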

Topology:
- 3 control planes (cx23 - 2 vCPU, 4GB RAM each)
- 3 workers (cx33 - 4 vCPU, 8GB RAM each)
- 1 Load Balancer (lb11) routing to all 3 control planes on port 6443
- Workers and secondary CPs join via LB endpoint for HA
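
Because the targets in terraform/loadbalancer.tf are attached with use_private_ip = true, the load balancer itself has to sit on the same private network as the control planes. The visible part of the diff does not show that attachment, so the following is only a sketch under assumptions: hcloud_network_subnet.cluster is a hypothetical name for the cluster's existing subnet.

```hcl
# Sketch only: hcloud_network_subnet.cluster is an assumed name for the
# cluster's private subnet; it does not appear in the visible diff.
resource "hcloud_load_balancer_network" "kube_api" {
  load_balancer_id = hcloud_load_balancer.kube_api.id
  subnet_id        = hcloud_network_subnet.cluster.id
}
```

If an attachment like this is used, the hcloud_load_balancer_target resources would likely also need a depends_on entry pointing at it, so that targets are only created once the load balancer has a private IP.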

Cost impact: approximately +€26/month (2 extra CPs + 1 extra worker + LB)
2026-03-23 02:39:39 +00:00
parent 8b4a445b37
commit ff31cb4e74
10 changed files with 89 additions and 21 deletions

terraform/loadbalancer.tf Normal file

@@ -0,0 +1,43 @@
# Load Balancer for Kubernetes API High Availability
# Provides a single endpoint for all control planes
resource "hcloud_load_balancer" "kube_api" {
  name               = "${var.cluster_name}-api"
  load_balancer_type = "lb11" # Cheapest tier: €5.39/month
  location           = var.location

  labels = {
    cluster = var.cluster_name
    role    = "kube-api"
  }
}

# Attach all control plane servers as targets
resource "hcloud_load_balancer_target" "kube_api_targets" {
  count            = var.control_plane_count
  type             = "server"
  load_balancer_id = hcloud_load_balancer.kube_api.id
  server_id        = hcloud_server.control_plane[count.index].id
  # Private-IP targets require the load balancer to be attached to the
  # same private network as the control plane servers.
  use_private_ip = true

  depends_on = [hcloud_server.control_plane]
}

# Kubernetes API service on port 6443
resource "hcloud_load_balancer_service" "kube_api" {
  load_balancer_id = hcloud_load_balancer.kube_api.id
  protocol         = "tcp"
  listen_port      = 6443
  destination_port = 6443

  health_check {
    protocol = "tcp"
    port     = 6443
    interval = 15
    timeout  = 10
    retries  = 3
  }
}

# Firewall rule to allow LB access to control planes on 6443
# This is added to the existing cluster firewall