refactor: simplify stable cluster baseline
Some checks failed
Deploy Cluster / Terraform (push) Successful in 1m48s
Deploy Cluster / Ansible (push) Failing after 4m7s

2026-03-20 02:24:37 +00:00
parent 5bd4c41c2d
commit 522626a52b
4 changed files with 77 additions and 6 deletions

@@ -179,6 +179,20 @@ Set these in your Gitea repository settings (**Settings** → **Secrets** → **
This repo uses Flux for continuous reconciliation after Terraform + Ansible bootstrap.
### Stable private-only baseline
The current default target is a deliberately simplified baseline:
- `1` control plane node
- `2` worker nodes
- private Hetzner network only
- Tailscale for operator access
- Flux-managed core addons only
Detailed phase gates and success criteria live in `STABLE_BASELINE.md`.
This is the default until rebuilds are consistently green. High availability, public ingress, and app-layer expansion come later.
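As a rough illustration of the baseline shape, the sketch below counts node roles in the JSON printed by `kubectl get nodes -o json` and compares them with the `1` control plane / `2` worker default. It assumes the control plane node carries the standard `node-role.kubernetes.io/control-plane` label; the function names are hypothetical and are not part of this repo.

```python
import json

# Assumption: the control plane node carries the standard Kubernetes role label.
CONTROL_PLANE_LABEL = "node-role.kubernetes.io/control-plane"


def count_roles(nodes_json: str) -> dict:
    """Count control plane and worker nodes in `kubectl get nodes -o json` output."""
    items = json.loads(nodes_json).get("items", [])
    control_planes = sum(
        1
        for node in items
        if CONTROL_PLANE_LABEL in node.get("metadata", {}).get("labels", {})
    )
    # Every node without the control plane label is treated as a worker.
    return {"control_plane": control_planes, "worker": len(items) - control_planes}


def matches_baseline(nodes_json: str, control_planes: int = 1, workers: int = 2) -> bool:
    """Return True when the cluster has exactly the baseline node counts."""
    expected = {"control_plane": control_planes, "worker": workers}
    return count_roles(nodes_json) == expected
```

One possible use is to pipe the `kubectl` output into `matches_baseline` after a rebuild and treat a `False` result as a sign that the cluster drifted from the default topology.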
### Runtime secrets
Runtime cluster secrets are moving to Doppler + External Secrets Operator.
@@ -222,6 +236,20 @@ Terraform/bootstrap secrets remain in Gitea Actions secrets and are not managed
- Core infrastructure addons are Flux-managed from `infrastructure/addons/`.
- Active Flux addons include `addon-ccm`, `addon-csi`, `addon-tailscale-operator`, `addon-tailscale-proxyclass`, `addon-external-secrets`, `addon-observability`, and `addon-observability-content`.
- Ansible is limited to cluster bootstrap, private-access setup, and prerequisite secret creation for Flux-managed addons.
- `addon-flux-ui` is optional for the stable-baseline phase and is not a blocker for rebuild success.
### Stable baseline acceptance
A rebuild is considered successful only when all of the following pass without manual intervention:
- Terraform create succeeds for the default `1` control plane and `2` workers.
- Ansible bootstrap succeeds end-to-end.
- All nodes become `Ready`.
- `hcloud-cloud-controller-manager` and `hcloud-csi` are `Ready`.
- Required External Secrets sync successfully.
- Tailscale private access works.
- Grafana and Prometheus are reachable privately.
- Terraform destroy succeeds, either on the first attempt or after automated workflow retries.
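The acceptance list above can be read as a set of gates that must all pass. The following is a minimal sketch of that idea, not the workflow this repo actually runs: it checks the "all nodes `Ready`" gate against `kubectl get nodes -o json` output and then combines named gate results. The function names and gate keys are hypothetical.

```python
import json


def not_ready_nodes(nodes_json: str) -> list:
    """Return names of nodes that do not report a Ready condition with status "True"."""
    failing = []
    for node in json.loads(nodes_json).get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            failing.append(node["metadata"]["name"])
    return failing


def rebuild_succeeded(gates: dict) -> bool:
    """A rebuild passes only if every gate passed; an empty gate set counts as failure."""
    return bool(gates) and all(gates.values())
```

A caller might build the gate map from individual checks, for example `{"terraform_create": True, "ansible_bootstrap": True, "nodes_ready": not not_ready_nodes(output)}`, and fail the pipeline when `rebuild_succeeded` returns `False`.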
## Observability Stack