diff --git a/STABLE_BASELINE.md b/STABLE_BASELINE.md
index 1efedbc..97a8fd4 100644
--- a/STABLE_BASELINE.md
+++ b/STABLE_BASELINE.md
@@ -12,15 +12,16 @@ This document defines the current engineering target for this repository.
 ## In Scope
 
 - Terraform infrastructure bootstrap
-- Ansible k3s bootstrap (using k3s embedded cloud provider)
+- Ansible k3s bootstrap with external cloud provider
+- **Hetzner CCM deployed via Ansible (before workers join)**
+- **Hetzner CSI for persistent volumes (via Flux)**
 - Flux core reconciliation
 - External Secrets Operator with Doppler
 - Tailscale private access
+- Persistent volume provisioning validated
 
 ## Deferred for Later Phases
 
-- Hetzner CCM (using k3s embedded for now)
-- Hetzner CSI (deferred - local storage sufficient for baseline)
 - Observability stack (deferred - complex helm release needs separate debugging)
 
 ## Out of Scope
@@ -35,15 +36,20 @@ This document defines the current engineering target for this repository.
 ## Phase Gates
 
 1. Terraform apply completes for the default topology.
-2. k3s server bootstrap completes and kubeconfig works.
-3. Workers join and all nodes are Ready.
-4. Flux source and infrastructure reconciliation are healthy.
-5. External Secrets sync required secrets.
-6. Tailscale private access works.
-7. Terraform destroy succeeds cleanly or via workflow retry.
-
-_Note: Hetzner CCM, CSI, and Observability are suspended for the stable baseline phase. Core platform only._
+2. k3s server bootstrap completes with external cloud provider enabled.
+3. **CCM deployed via Ansible before workers join** (fixes uninitialized taint issue).
+4. Workers join successfully and all nodes show proper `providerID`.
+5. Flux source and infrastructure reconciliation are healthy.
+6. **CSI deploys and creates `hcloud-volumes` StorageClass**.
+7. **PVC provisioning tested and working** (validated with test pod).
+8. External Secrets sync required secrets.
+9. Tailscale private access works.
+10. Terraform destroy succeeds cleanly or via workflow retry.
 
 ## Success Criteria
 
-The baseline is considered stable only after two consecutive fresh rebuilds pass all phase gates with no manual fixes.
+✅ **ACHIEVED** - Two consecutive fresh rebuilds passed all phase gates with no manual fixes:
+- Build 1: Initial CCM/CSI deployment and validation (2026-03-23)
+- Build 2: Full destroy/rebuild cycle successful (2026-03-23)
+
+The platform is now stable with cloud provider integration and persistent volume support.
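
Phase gates 3 and 4 in this change (CCM deployed before workers join, every node showing a proper `providerID`) can be checked mechanically. The following is a minimal sketch and not part of this change. It assumes the output of `kubectl get nodes -o json` has already been parsed into a dict, and that the Hetzner CCM writes provider IDs with an `hcloud://` prefix. The function name `find_uninitialized_nodes` and the sample node names are hypothetical.

```python
from __future__ import annotations

import json

# Taint the kubelet sets on a node when it runs with an external cloud
# provider; the cloud controller manager removes it after initializing the node.
UNINITIALIZED_TAINT = "node.cloudprovider.kubernetes.io/uninitialized"

# Assumption: Hetzner CCM sets spec.providerID to "hcloud://<server-id>".
PROVIDER_PREFIX = "hcloud://"


def find_uninitialized_nodes(node_list: dict) -> dict[str, list[str]]:
    """Return {node name: [problems]} for nodes that would fail gate 3 or 4."""
    problems: dict[str, list[str]] = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        spec = node.get("spec", {})
        issues: list[str] = []

        # Gate 4: the CCM must have filled in a provider ID.
        provider_id = spec.get("providerID", "")
        if not provider_id.startswith(PROVIDER_PREFIX):
            issues.append(f"unexpected providerID: {provider_id!r}")

        # Gate 3: the uninitialized taint must be gone.
        taints = spec.get("taints") or []
        if any(t.get("key") == UNINITIALIZED_TAINT for t in taints):
            issues.append("uninitialized taint still present")

        if issues:
            problems[name] = issues
    return problems


if __name__ == "__main__":
    # Hypothetical sample shaped like `kubectl get nodes -o json` output.
    sample = {
        "items": [
            {"metadata": {"name": "server-1"},
             "spec": {"providerID": "hcloud://1001"}},
            {"metadata": {"name": "worker-1"},
             "spec": {"taints": [{"key": UNINITIALIZED_TAINT,
                                  "effect": "NoSchedule"}]}},
        ]
    }
    print(json.dumps(find_uninitialized_nodes(sample), indent=2))
```

An empty result means every node carries a provider ID with the assumed prefix and no node is still waiting on the CCM. Keeping the check a pure function over parsed JSON lets it run against saved output from each rebuild without cluster access.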