feat: migrate observability stack to flux gitops
Some checks failed
Deploy Cluster / Terraform (push) Successful in 45s
Deploy Cluster / Ansible (push) Failing after 1m11s

2026-03-04 23:38:40 +00:00
parent 480a079dc8
commit 8b403cd1d6
28 changed files with 493 additions and 1 deletions


@@ -174,6 +174,47 @@ Set these in your Gitea repository settings (**Settings** → **Secrets** → **
| `SSH_PUBLIC_KEY` | SSH public key content |
| `SSH_PRIVATE_KEY` | SSH private key content |
## GitOps (Flux)
This repo now includes a Flux GitOps layout for phased migration from imperative Ansible applies to continuous reconciliation.
### Repository layout
- `clusters/prod/`: cluster entrypoint and Flux reconciliation objects
- `clusters/prod/flux-system/`: `GitRepository` source and top-level `Kustomization` graph
- `infrastructure/`: infrastructure addon reconciliation graph
- `infrastructure/addons/*`: per-addon manifests (observability + observability-content migrated)
- `apps/`: application workload layer (currently scaffolded)
### Reconciliation graph
- `infrastructure` (top-level)
  - `addon-ccm`
  - `addon-csi` depends on `addon-ccm`
  - `addon-tailscale-operator`
  - `addon-observability`
  - `addon-observability-content` depends on `addon-observability`
- `apps` depends on `infrastructure`
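As a rough illustration of how one edge of this graph could be expressed, the sketch below declares `addon-csi` with a `dependsOn` entry for `addon-ccm`. The directory name, interval, prune setting, and the `GitRepository` name are assumptions for illustration and are not taken from this commit.

```yaml
# Hypothetical Flux Kustomization for the addon-csi node of the graph.
# Values marked "assumed" are illustrative, not copied from the repository.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: addon-csi
  namespace: flux-system
spec:
  interval: 10m                        # assumed reconcile interval
  path: ./infrastructure/addons/csi    # assumed directory under infrastructure/addons/
  prune: true                          # assumed: delete resources removed from Git
  sourceRef:
    kind: GitRepository
    name: flux-system                  # assumed source name
  dependsOn:
    - name: addon-ccm                  # wait until addon-ccm is ready before applying
```

With `dependsOn`, Flux holds back the dependent `Kustomization` until the named one reports ready, which is what gives the list above its ordering.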
### Bootstrap notes
1. Install Flux controllers in `flux-system`.
2. Create the Flux deploy key/secret named `flux-system` in the `flux-system` namespace.
3. Apply `clusters/prod/flux-system/` once to establish source + reconciliation graph.
4. Unsuspend the addon `Kustomization` objects one by one as each addon is migrated from Ansible.
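The objects involved in steps 2 to 4 might look roughly like the sketch below: a `GitRepository` that authenticates with the `flux-system` secret, and an addon `Kustomization` that starts suspended. The repository URL, branch, intervals, and addon path are placeholders, and the actual manifests in `clusters/prod/flux-system/` may differ.

```yaml
# Hypothetical source object; url and branch are placeholders.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 1m
  url: ssh://git@gitea.example.com/example/cluster.git   # placeholder URL
  ref:
    branch: main                                         # assumed branch
  secretRef:
    name: flux-system      # the deploy key secret from step 2
---
# Hypothetical addon that has not been migrated yet.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: addon-tailscale-operator
  namespace: flux-system
spec:
  suspend: true            # set to false (or remove) once the addon is migrated
  interval: 10m
  path: ./infrastructure/addons/tailscale-operator       # assumed directory
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```

Keeping `suspend: true` in Git until an addon is ready means the switch from Ansible to Flux is itself a reviewed commit rather than a manual cluster change.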
### Current migration status
- `addon-observability-content` is now GitOps-managed from `infrastructure/addons/observability-content/`.
- `addon-observability` is now GitOps-managed from `infrastructure/addons/observability/` using Flux `HelmRelease` resources for:
  - `kube-prometheus-stack`
  - `loki`
  - `promtail`
- Remaining addons stay suspended until migrated.
- During transition, avoid applying Grafana content from both Flux and Ansible at the same time.
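For readers unfamiliar with `HelmRelease`, a minimal sketch for one of the charts listed above could look like this. The `HelmRepository` name, the intervals, and the empty values are assumptions, and the manifests in `infrastructure/addons/observability/` may be structured differently. The `helm.toolkit.fluxcd.io/v2` API version assumes a recent Flux release; older installations use a beta version of the same API.

```yaml
# Hypothetical chart source; the name "grafana" is an assumption.
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: grafana
  namespace: flux-system
spec:
  interval: 1h
  url: https://grafana.github.io/helm-charts
---
# Hypothetical release of the loki chart into the observability namespace.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: loki
  namespace: observability
spec:
  interval: 10m
  chart:
    spec:
      chart: loki
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: flux-system
  values: {}               # chart values omitted; supply the real configuration here
```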
Ansible `site.yml` now skips the `observability` and `observability-content` roles when `observability_gitops_enabled=true`, which is the default.
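One way such a guard could be written in a playbook is sketched below. Only the variable and role names come from this README; the play name, host group, and exact condition are assumptions, and the real `site.yml` may implement the skip differently.

```yaml
# Hypothetical excerpt of a playbook guarding the two roles.
- name: Deploy cluster addons          # assumed play name
  hosts: control_plane                 # assumed inventory group
  roles:
    - role: observability
      # Runs only when GitOps management is explicitly turned off.
      when: not (observability_gitops_enabled | default(true) | bool)
    - role: observability-content
      when: not (observability_gitops_enabled | default(true) | bool)
```

Under this sketch, running the playbook with `-e observability_gitops_enabled=false` would restore the Ansible-managed path, which should only be done while the matching Flux `Kustomization` objects are suspended.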
## Observability Stack
The Ansible playbook deploys a lightweight observability stack in the `observability` namespace:
@@ -182,7 +223,7 @@ The Ansible playbook deploys a lightweight observability stack in the `observabi
- `loki`
- `promtail`
Grafana content is managed as code via ConfigMaps in `ansible/roles/observability-content/`.
Grafana content is managed as code via ConfigMaps in `infrastructure/addons/observability-content/` (Flux), migrated from `ansible/roles/observability-content/`.
Services are kept internal by default, with optional declarative Tailscale exposure when the Tailscale Kubernetes Operator is healthy.