Update STABLE_BASELINE.md - CCM/CSI integration achieved

Document the successful completion of Hetzner CCM and CSI integration:
- CCM deployed via Ansible before workers join (fixes uninitialized taint)
- CSI provides hcloud-volumes StorageClass for persistent storage
- Two consecutive rebuilds passed all phase gates
- PVC provisioning tested and working

Platform now has full cloud provider integration with persistent volumes.
2026-03-23 02:25:00 +00:00
parent e447795395
commit 8b4a445b37

@@ -12,15 +12,16 @@ This document defines the current engineering target for this repository.
 ## In Scope
 - Terraform infrastructure bootstrap
-- Ansible k3s bootstrap (using k3s embedded cloud provider)
+- Ansible k3s bootstrap with external cloud provider
+- **Hetzner CCM deployed via Ansible (before workers join)**
+- **Hetzner CSI for persistent volumes (via Flux)**
 - Flux core reconciliation
 - External Secrets Operator with Doppler
 - Tailscale private access
+- Persistent volume provisioning validated
 ## Deferred for Later Phases
-- Hetzner CCM (using k3s embedded for now)
-- Hetzner CSI (deferred - local storage sufficient for baseline)
 - Observability stack (deferred - complex helm release needs separate debugging)
 ## Out of Scope
@@ -35,15 +36,20 @@ This document defines the current engineering target for this repository.
 ## Phase Gates
 1. Terraform apply completes for the default topology.
-2. k3s server bootstrap completes and kubeconfig works.
-3. Workers join and all nodes are Ready.
-4. Flux source and infrastructure reconciliation are healthy.
-5. External Secrets sync required secrets.
-6. Tailscale private access works.
-7. Terraform destroy succeeds cleanly or via workflow retry.
-_Note: Hetzner CCM, CSI, and Observability are suspended for the stable baseline phase. Core platform only._
+2. k3s server bootstrap completes with external cloud provider enabled.
+3. **CCM deployed via Ansible before workers join** (fixes uninitialized taint issue).
+4. Workers join successfully and all nodes show proper `providerID`.
+5. Flux source and infrastructure reconciliation are healthy.
+6. **CSI deploys and creates `hcloud-volumes` StorageClass**.
+7. **PVC provisioning tested and working** (validated with test pod).
+8. External Secrets sync required secrets.
+9. Tailscale private access works.
+10. Terraform destroy succeeds cleanly or via workflow retry.
 ## Success Criteria
-The baseline is considered stable only after two consecutive fresh rebuilds pass all phase gates with no manual fixes.
+**ACHIEVED** - Two consecutive fresh rebuilds passed all phase gates with no manual fixes:
+- Build 1: Initial CCM/CSI deployment and validation (2026-03-23)
+- Build 2: Full destroy/rebuild cycle successful (2026-03-23)
+The platform is now stable with cloud provider integration and persistent volume support.
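
Reviewer note: the phase gates covering the uninitialized taint, `providerID`, and PVC provisioning could be checked mechanically. The following is a minimal sketch, not the repository's actual validation. It assumes the node list comes from `kubectl get nodes -o json`, that the Hetzner CCM writes provider IDs with an `hcloud://` prefix, and that 10Gi is an acceptable test volume size. The function names `node_problems` and `build_test_pvc` and the PVC name `baseline-test-pvc` are hypothetical.

```python
# Taint that the kubelet applies when started with an external cloud provider;
# a cloud controller manager removes it once the node is initialized.
UNINITIALIZED_TAINT = "node.cloudprovider.kubernetes.io/uninitialized"

# Assumption: the Hetzner CCM sets spec.providerID to "hcloud://<server-id>".
PROVIDER_ID_PREFIX = "hcloud://"


def node_problems(node_list):
    """Return {node_name: [problem, ...]} for nodes failing the CCM checks.

    `node_list` is the parsed output of `kubectl get nodes -o json`.
    """
    problems = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        spec = node.get("spec", {})
        found = []

        provider_id = spec.get("providerID", "")
        if not provider_id.startswith(PROVIDER_ID_PREFIX):
            found.append(f"unexpected providerID: {provider_id!r}")

        # "taints" may be absent or null when a node has none.
        taint_keys = {taint["key"] for taint in spec.get("taints") or []}
        if UNINITIALIZED_TAINT in taint_keys:
            found.append("uninitialized taint still present")

        if found:
            problems[name] = found
    return problems


def build_test_pvc(name="baseline-test-pvc", size="10Gi"):
    """Build a PVC manifest bound to the hcloud-volumes StorageClass.

    The name and size are illustrative defaults, not values from the repo.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": "hcloud-volumes",
            "resources": {"requests": {"storage": size}},
        },
    }
```

In use, `node_problems(json.load(sys.stdin))` could be fed from `kubectl get nodes -o json`, and an empty result would correspond to the CCM and `providerID` gates passing. The dictionary from `build_test_pvc()` can be serialized with `json.dumps` and applied with `kubectl apply -f -`, since kubectl accepts JSON manifests.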