refactor: simplify stable cluster baseline
This commit is contained in:
@@ -88,12 +88,8 @@ jobs:
|
||||
}
|
||||
|
||||
ensure_import 'hcloud_server.control_plane[0]' 'k8s-cluster-cp-1'
|
||||
ensure_import 'hcloud_server.control_plane[1]' 'k8s-cluster-cp-2'
|
||||
ensure_import 'hcloud_server.control_plane[2]' 'k8s-cluster-cp-3'
|
||||
ensure_import 'hcloud_server.workers[0]' 'k8s-cluster-worker-1'
|
||||
ensure_import 'hcloud_server.workers[1]' 'k8s-cluster-worker-2'
|
||||
ensure_import 'hcloud_server.workers[2]' 'k8s-cluster-worker-3'
|
||||
ensure_import 'hcloud_server.workers[3]' 'k8s-cluster-worker-4'
|
||||
|
||||
- name: Terraform Plan
|
||||
id: plan
|
||||
|
||||
28
README.md
28
README.md
@@ -179,6 +179,20 @@ Set these in your Gitea repository settings (**Settings** → **Secrets** → **
|
||||
|
||||
This repo uses Flux for continuous reconciliation after Terraform + Ansible bootstrap.
|
||||
|
||||
### Stable private-only baseline
|
||||
|
||||
The current default target is a deliberately simplified baseline:
|
||||
|
||||
- `1` control plane node
|
||||
- `2` worker nodes
|
||||
- private Hetzner network only
|
||||
- Tailscale for operator access
|
||||
- Flux-managed core addons only
|
||||
|
||||
Detailed phase gates and success criteria live in `STABLE_BASELINE.md`.
|
||||
|
||||
This is the default until rebuilds are consistently green. High availability, public ingress, and app-layer expansion come later.
|
||||
|
||||
### Runtime secrets
|
||||
|
||||
Runtime cluster secrets are moving to Doppler + External Secrets Operator.
|
||||
@@ -222,6 +236,20 @@ Terraform/bootstrap secrets remain in Gitea Actions secrets and are not managed
|
||||
- Core infrastructure addons are Flux-managed from `infrastructure/addons/`.
|
||||
- Active Flux addons include `addon-ccm`, `addon-csi`, `addon-tailscale-operator`, `addon-tailscale-proxyclass`, `addon-external-secrets`, `addon-observability`, and `addon-observability-content`.
|
||||
- Ansible is limited to cluster bootstrap, private-access setup, and prerequisite secret creation for Flux-managed addons.
|
||||
- `addon-flux-ui` is optional for the stable-baseline phase and is not a blocker for rebuild success.
|
||||
|
||||
### Stable baseline acceptance
|
||||
|
||||
A rebuild is considered successful only when all of the following pass without manual intervention:
|
||||
|
||||
- Terraform create succeeds for the default `1` control plane and `2` workers.
|
||||
- Ansible bootstrap succeeds end-to-end.
|
||||
- All nodes become `Ready`.
|
||||
- `hcloud-cloud-controller-manager` and `hcloud-csi` are `Ready`.
|
||||
- Required External Secrets sync successfully.
|
||||
- Tailscale private access works.
|
||||
- Grafana and Prometheus are reachable privately.
|
||||
- Terraform destroy succeeds cleanly or succeeds after workflow retries.
|
||||
|
||||
## Observability Stack
|
||||
|
||||
|
||||
47
STABLE_BASELINE.md
Normal file
47
STABLE_BASELINE.md
Normal file
@@ -0,0 +1,47 @@
|
||||
# Stable Private-Only Baseline
|
||||
|
||||
This document defines the current engineering target for this repository.
|
||||
|
||||
## Topology
|
||||
|
||||
- 1 control plane
|
||||
- 2 workers
|
||||
- private Hetzner network
|
||||
- Tailscale operator access
|
||||
|
||||
## In Scope
|
||||
|
||||
- Terraform infrastructure bootstrap
|
||||
- Ansible k3s bootstrap
|
||||
- Flux core reconciliation
|
||||
- Hetzner CCM
|
||||
- Hetzner CSI
|
||||
- External Secrets Operator with Doppler
|
||||
- Tailscale private access
|
||||
- Observability stack
|
||||
|
||||
## Out of Scope
|
||||
|
||||
- HA control plane
|
||||
- public ingress or DNS
|
||||
- public TLS
|
||||
- app workloads
|
||||
- DR / backup strategy
|
||||
- upgrade strategy
|
||||
|
||||
## Phase Gates
|
||||
|
||||
1. Terraform apply completes for the default topology.
|
||||
2. k3s server bootstrap completes and kubeconfig works.
|
||||
3. Workers join and all nodes are Ready.
|
||||
4. Flux source and infrastructure reconciliation are healthy.
|
||||
5. CCM is Ready.
|
||||
6. CSI is Ready and a PVC can bind.
|
||||
7. External Secrets sync required secrets.
|
||||
8. Tailscale private access works.
|
||||
9. Observability is healthy and reachable privately.
|
||||
10. Terraform destroy succeeds cleanly or via workflow retry.
|
||||
|
||||
## Success Criteria
|
||||
|
||||
The baseline is considered stable only after two consecutive fresh rebuilds pass all phase gates with no manual fixes.
|
||||
@@ -25,7 +25,7 @@ variable "cluster_name" {
|
||||
variable "control_plane_count" {
|
||||
description = "Number of control plane nodes"
|
||||
type = number
|
||||
default = 3
|
||||
default = 1
|
||||
}
|
||||
|
||||
variable "control_plane_type" {
|
||||
@@ -37,7 +37,7 @@ variable "control_plane_type" {
|
||||
variable "worker_count" {
|
||||
description = "Number of worker nodes"
|
||||
type = number
|
||||
default = 4
|
||||
default = 2
|
||||
}
|
||||
|
||||
variable "worker_type" {
|
||||
|
||||
Reference in New Issue
Block a user