Commit Graph

78 Commits

Author SHA1 Message Date
micqdf c251672618 fix: Configure S3 bucketName for rancher-backup operator
Deploy Cluster / Terraform (push) Successful in 50s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-29 23:01:18 +00:00
micqdf 89364e8f37 fix: Add dependsOn for rancher-backup operator to wait for CRDs
Deploy Cluster / Terraform (push) Successful in 50s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-29 22:57:22 +00:00
micqdf 20d7a6f777 fix: Install rancher-backup CRD chart before operator
Deploy Cluster / Terraform (push) Successful in 50s
Deploy Cluster / Ansible (push) Has been cancelled
The rancher-backup operator requires CRDs from the rancher-backup-crd
chart to be installed first.
2026-03-29 22:51:34 +00:00
micqdf 22ce5fd6f4 feat: Add cert-manager as dependency for Rancher
Deploy Cluster / Terraform (push) Successful in 49s
Deploy Cluster / Ansible (push) Successful in 5m59s
Rancher requires cert-manager when managing its own TLS (not tls:external).
Added cert-manager HelmRelease with CRDs enabled.
2026-03-29 22:36:30 +00:00
micqdf afb1782d38 fix: Separate Backup CRs into their own kustomization
Deploy Cluster / Terraform (push) Successful in 41s
Deploy Cluster / Ansible (push) Successful in 5m57s
The Backup and Restore CRs need the rancher-backup CRDs to exist first.
Moved them to a separate kustomization that depends on the operator being ready.
2026-03-29 22:22:29 +00:00
micqdf 48870433bf fix: Remove tls:external from Rancher HelmRelease
Deploy Cluster / Terraform (push) Failing after 55s
Deploy Cluster / Ansible (push) Has been skipped
With Tailscale LoadBalancer, TLS is not actually terminated at the edge.
The Tailscale proxy does TCP passthrough, so Rancher must serve its own
TLS certs. Setting tls: external caused Rancher to listen HTTP-only,
which broke HTTPS access through Tailscale.
2026-03-29 22:19:23 +00:00
micqdf f2c506b350 refactor: Replace CNPG external DB with rancher-backup operator
Deploy Cluster / Terraform (push) Successful in 48s
Deploy Cluster / Ansible (push) Successful in 6m5s
Rancher 2.x uses embedded etcd, not an external PostgreSQL database.
The CATTLE_DB_CATTLE_* env vars are Rancher v1 only and were ignored.

- Remove all CNPG (CloudNativePG) cluster, operator, and related configs
- Remove external DB env vars from Rancher HelmRelease
- Remove rancher-db-password ExternalSecret
- Add rancher-backup operator HelmRelease (v106.0.2+up8.1.0)
- Add B2 credentials ExternalSecret for backup storage
- Add recurring Backup CR (daily at 03:00, 7-day retention)
- Add commented-out Restore CR for rebuild recovery
- Update Flux dependency graph accordingly
2026-03-29 21:53:16 +00:00
micqdf 905d069e91 fix: Add serverName to CNPG externalClusters for B2 recovery
Deploy Cluster / Terraform (push) Successful in 49s
Deploy Cluster / Ansible (push) Successful in 5m22s
CNPG uses the external cluster name (b2-backup) as the barman server
name by default, but the backups were stored under server name rancher-db.
2026-03-29 03:22:19 +00:00
micqdf 25ba4b7115 fix: Add skipEmptyWalArchiveCheck annotation and B2 secret healthcheck to CNPG
Deploy Cluster / Terraform (push) Successful in 49s
Deploy Cluster / Ansible (push) Successful in 5m22s
- Skip WAL archive emptiness check so recovery works when restoring over
  an existing backup archive in B2
- Add healthCheck for b2-credentials secret in CNPG kustomization to
  prevent recovery from starting before ExternalSecret has synced
2026-03-29 03:15:23 +00:00
micqdf 6a593fd559 feat: Add B2 recovery bootstrap to CNPG cluster
Deploy Cluster / Terraform (push) Successful in 2m6s
Deploy Cluster / Ansible (push) Successful in 8m16s
2026-03-29 00:22:24 +00:00
micqdf 936f54a1b5 fix: Restore canonical Rancher tailnet hostname
Deploy Cluster / Terraform (push) Successful in 48s
Deploy Cluster / Ansible (push) Successful in 6m1s
2026-03-29 00:00:39 +00:00
micqdf c9df11e65f fix: Align Rancher tailnet hostname with live proxy
Deploy Cluster / Terraform (push) Successful in 49s
Deploy Cluster / Ansible (push) Successful in 6m1s
2026-03-28 23:47:09 +00:00
micqdf a3c238fda9 fix: Apply Rancher server URL after chart install
Deploy Cluster / Terraform (push) Successful in 2m43s
Deploy Cluster / Ansible (push) Successful in 10m39s
2026-03-28 23:12:59 +00:00
micqdf a15fa50302 fix: Use Doppler-backed Rancher bootstrap password
Deploy Cluster / Terraform (push) Successful in 49s
Deploy Cluster / Ansible (push) Successful in 5m43s
2026-03-28 22:51:38 +00:00
micqdf 0f4f0b09fb fix: Add Rancher DB password ExternalSecret
Deploy Cluster / Terraform (push) Successful in 49s
Deploy Cluster / Ansible (push) Successful in 5m42s
2026-03-28 22:42:05 +00:00
micqdf 4c002a870c fix: Remove invalid Rancher server-url manifest
Deploy Cluster / Terraform (push) Successful in 51s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-28 22:39:31 +00:00
micqdf 8c5edcf0a1 fix: Set Rancher server URL to tailnet hostname
Deploy Cluster / Terraform (push) Successful in 1m0s
Deploy Cluster / Ansible (push) Successful in 6m27s
2026-03-28 04:07:44 +00:00
micqdf a81da0d178 feat: Expose Rancher via Tailscale hostname
Deploy Cluster / Terraform (push) Successful in 52s
Deploy Cluster / Ansible (push) Successful in 6m42s
2026-03-28 03:59:02 +00:00
micqdf 2a72527c79 fix: Switch Traefik from LoadBalancer to NodePort, remove unused Hetzner LB
Deploy Cluster / Terraform (push) Successful in 49s
Deploy Cluster / Ansible (push) Successful in 6m25s
2026-03-28 03:21:19 +00:00
micqdf 7cb3b84ecb feat: Replace custom pgdump job with CNPG ScheduledBackup
Deploy Cluster / Terraform (push) Successful in 1m30s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-28 03:15:39 +00:00
micqdf d4930235fa fix: Point CNPG backups at the existing B2 bucket
Deploy Cluster / Terraform (push) Successful in 48s
Deploy Cluster / Ansible (push) Successful in 6m17s
2026-03-26 23:35:19 +00:00
micqdf ee8dc4b451 fix: Add Role for B2 credentials access
Deploy Cluster / Terraform (push) Successful in 49s
Deploy Cluster / Ansible (push) Successful in 6m29s
2026-03-26 23:04:40 +00:00
micqdf 144d40e7ac feat: Add RBAC for CNPG to read B2 credentials secret
Deploy Cluster / Terraform (push) Successful in 48s
Deploy Cluster / Ansible (push) Successful in 6m38s
2026-03-26 22:56:00 +00:00
micqdf cc14e32572 fix: Use gzip instead of lzop for backup compression
Deploy Cluster / Terraform (push) Successful in 49s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 22:51:10 +00:00
micqdf a207a5a7fd fix: Remove invalid encryption field from CNPG backup config
Deploy Cluster / Terraform (push) Successful in 40s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 22:49:29 +00:00
micqdf 4e1772c175 feat: Add B2 backup configuration to CNPG Cluster
Deploy Cluster / Terraform (push) Successful in 1m38s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 22:47:31 +00:00
micqdf a3963c56e6 cleanup: Remove traefik-config, simplify traefik helmrelease
Deploy Cluster / Terraform (push) Successful in 50s
Deploy Cluster / Ansible (push) Successful in 6m20s
2026-03-26 03:16:56 +00:00
micqdf 612435c42c fix: Add Hetzner LB health check config to Traefik
Deploy Cluster / Terraform (push) Successful in 47s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 03:11:10 +00:00
micqdf ac42f671a2 fix: Remove addon-traefik-config dependency from flux-ui
Deploy Cluster / Terraform (push) Successful in 50s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 03:05:58 +00:00
micqdf dbe7ec0468 fix: Remove expose boolean from traefik ports config
Deploy Cluster / Terraform (push) Successful in 49s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 03:01:13 +00:00
micqdf 816ac8b3c0 fix: Use official Traefik helm repo instead of rancher-stable
Deploy Cluster / Terraform (push) Successful in 48s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 02:59:00 +00:00
micqdf 6f7998639f fix: Use standard kustomize API in traefik addon
Deploy Cluster / Terraform (push) Successful in 48s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 02:56:52 +00:00
micqdf 7a14f89ad1 fix: Correct traefik kustomization path and sourceRef
Deploy Cluster / Terraform (push) Successful in 48s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 02:55:37 +00:00
micqdf 786901c5d7 fix: Correct traefik kustomization reference (directory not file)
Deploy Cluster / Terraform (push) Successful in 47s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 02:54:29 +00:00
micqdf 46f3d1130b feat: Add Flux-managed Traefik HelmRelease with Hetzner LB config
Deploy Cluster / Terraform (push) Successful in 48s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 02:52:49 +00:00
micqdf 2fe5a626d4 fix: Add Hetzner network zone annotation to Traefik LoadBalancer
Deploy Cluster / Terraform (push) Successful in 52s
Deploy Cluster / Ansible (push) Successful in 6m20s
2026-03-26 02:30:43 +00:00
micqdf 2ef68c8087 fix: Remove deprecated enablePodMonitor field in CNPG Cluster
Deploy Cluster / Terraform (push) Successful in 2m13s
Deploy Cluster / Ansible (push) Successful in 10m15s
2026-03-26 01:01:53 +00:00
micqdf e2cae18f5f fix: Remove backup config for initial deployment - add backup after DB is running
Deploy Cluster / Terraform (push) Successful in 36s
Deploy Cluster / Ansible (push) Successful in 4m56s
2026-03-26 00:46:50 +00:00
micqdf e0c1e41ee9 fix: Remove bootstrap recovery - create fresh DB (recovery only needed after first backup)
Deploy Cluster / Terraform (push) Successful in 35s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 00:43:49 +00:00
micqdf 63533de901 fix: Fix retentionPolicy format (14d not keep14)
Deploy Cluster / Terraform (push) Successful in 47s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 00:41:44 +00:00
micqdf 1b39710f63 fix: Move retentionPolicy to correct location in backup spec
Deploy Cluster / Terraform (push) Successful in 37s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 00:39:25 +00:00
micqdf 8c034323dc fix: Fix Cluster CR with correct barmanObjectStore schema
Deploy Cluster / Terraform (push) Successful in 35s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 00:35:23 +00:00
micqdf 5fa2b411ee fix: Fix Cluster CR schema - use barmanObjectStore instead of b2
Deploy Cluster / Terraform (push) Successful in 35s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 00:33:04 +00:00
micqdf 3ea28e525f fix: Fix CNPG operator image repository (cloudnative-pg not postgresql)
Deploy Cluster / Terraform (push) Successful in 43s
Deploy Cluster / Ansible (push) Successful in 4m55s
2026-03-26 00:21:09 +00:00
micqdf 4b95ba113d fix: Remove LPP helm (already installed by k3s), fix CNPG chart version to 0.27.1
Deploy Cluster / Terraform (push) Successful in 36s
Deploy Cluster / Ansible (push) Successful in 5m7s
2026-03-26 00:13:22 +00:00
micqdf 13627bf81f fix: Split CNPG operator from CNPG cluster to fix CRD dependency
Deploy Cluster / Terraform (push) Successful in 35s
Deploy Cluster / Ansible (push) Successful in 5m0s
- Move CNPG operator HelmRelease to cnpg-operator folder
- Create addon-cnpg-operator kustomization (deploys operator first)
- Update addon-cnpg to dependsOn addon-cnpg-operator
- Add addon-cnpg as dependency for addon-rancher (needs database)
2026-03-26 00:06:34 +00:00
micqdf ef3fb2489a fix: Convert kustomization-lpp and kustomization-cnpg to Flux Kustomization CRs
Deploy Cluster / Terraform (push) Successful in 37s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-26 00:03:53 +00:00
micqdf 7097495d72 fix: Add missing metadata.name to kustomization-lpp and kustomization-cnpg
Deploy Cluster / Terraform (push) Successful in 1m7s
Deploy Cluster / Ansible (push) Has been cancelled
2026-03-25 23:39:45 +00:00
micqdf 9d601dc77c feat: Add CloudNativePG with B2 backups for persistent Rancher database
Deploy Cluster / Terraform (push) Successful in 4m16s
Deploy Cluster / Ansible (push) Failing after 12m27s
- Add Local Path Provisioner for storage
- Add CloudNativePG operator (v1.27.0) via Flux
- Create PostgreSQL cluster with B2 (Backblaze) auto-backup/restore
- Update Rancher to use external PostgreSQL via CATTLE_DB_CATTLE_* env vars
- Add weekly pg_dump CronJob to B2 (Sundays 2AM)
- Add pre-destroy backup hook to destroy workflow
- Add B2 credentials to Doppler (B2_ACCOUNT_ID, B2_APPLICATION_KEY)
- Generate RANCHER_DB_PASSWORD in Doppler

Backup location: HetznerTerra/rancher-backups/
Retention: 14 backups
2026-03-25 23:06:45 +00:00
micqdf 89c2c99963 Fix Rancher: remove conflicting LoadBalancer, add HTTPS port-forward, use tailscale serve only
Deploy Cluster / Terraform (push) Successful in 2m21s
Deploy Cluster / Ansible (push) Successful in 9m2s
2026-03-25 00:59:16 +00:00