Changes:
- Enable adminUser creation but disable Helm-managed secret
- Use ExternalSecret (cluster-user-auth) from Doppler instead
- Doppler secrets: WEAVE_GITOPS_ADMIN_USERNAME and WEAVE_GITOPS_ADMIN_PASSWORD_BCRYPT_HASH
- Added cluster-user-auth to viewSecretsResourceNames for RBAC
Login credentials are now managed via Doppler and External Secrets Operator.
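A minimal sketch of that ExternalSecret, assuming the Doppler ClusterSecretStore is named doppler and the secret lives in flux-system (both names are assumptions; the Doppler keys and secret name are from this change):

    apiVersion: external-secrets.io/v1beta1
    kind: ExternalSecret
    metadata:
      name: cluster-user-auth
      namespace: flux-system          # assumed; where Weave GitOps reads cluster-user-auth
    spec:
      refreshInterval: 1h             # assumed
      secretStoreRef:
        kind: ClusterSecretStore
        name: doppler                 # assumed store name
      target:
        name: cluster-user-auth
      data:
        - secretKey: username
          remoteRef:
            key: WEAVE_GITOPS_ADMIN_USERNAME
        - secretKey: password         # bcrypt hash, not plaintext
          remoteRef:
            key: WEAVE_GITOPS_ADMIN_PASSWORD_BCRYPT_HASH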
Changed from hardcoded Tailscale IPs to DNS names:
- k8s-cluster-cp-1.silverside-gopher.ts.net
- k8s-cluster-cp-2.silverside-gopher.ts.net
- k8s-cluster-cp-3.silverside-gopher.ts.net
This is more robust: Tailscale IPs change on every rebuild,
while DNS names remain consistent.
After the next rebuild, the cluster will be accessible via:
- kubectl --server=https://k8s-cluster-cp-1.silverside-gopher.ts.net:6443
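A sketch of the resulting SAN entries, assuming they are rendered into the standard k3s config file:

    # /etc/rancher/k3s/config.yaml (sketch; how the list is templated in is an assumption)
    tls-san:
      - k8s-cluster-cp-1.silverside-gopher.ts.net
      - k8s-cluster-cp-2.silverside-gopher.ts.net
      - k8s-cluster-cp-3.silverside-gopher.ts.net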
Changes:
- Add tailscale_control_plane_ips list to k3s-server defaults (sketched below)
- Include all 3 control plane Tailscale IPs (100.120.55.97, 100.108.90.123, 100.92.149.85)
- Update primary k3s install to add Tailscale IPs to TLS certificates
- Enables kubectl access via Tailscale without certificate errors
After the next deploy, the cluster will be accessible via:
- kubectl --server=https://100.120.55.97:6443 (or any CP Tailscale IP)
- kubectl --server=https://k8s-cluster-cp-1:6443 (via Tailscale DNS)
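A sketch of the defaults, assuming the conventional role layout:

    # ansible/roles/k3s-server/defaults/main.yml (sketch; exact path assumed)
    # NOTE: these IPs change on rebuild; the commit above replaces them with DNS names.
    tailscale_control_plane_ips:
      - 100.120.55.97
      - 100.108.90.123
      - 100.92.149.85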
- Remove circular variable reference in site.yml
- Add kube_api_endpoint default to k3s-server role (variable flow sketched below)
- Variable is set via inventory group_vars and passed to role
- Primary CP now correctly adds LB IP to TLS SANs
Note: the existing cluster needs a destroy/rebuild to regenerate its certificates.
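A sketch of the variable flow (file paths are assumptions; the variable names are from this change):

    # ansible/roles/k3s-server/defaults/main.yml (sketch)
    kube_api_endpoint: ""    # safe default; the real value comes from inventory group_vars

    # ansible/inventory/group_vars/all.yml (sketch; how the LB IP is injected is an assumption)
    kube_api_endpoint: "{{ kube_api_lb_ip }}"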
Changes:
- Use LB private IP (10.0.1.5) instead of public IP for cluster joins
- Add LB private IP to k3s TLS SANs on primary control plane
- This allows secondary CPs and workers to verify certificates when joining via LB
Fixes the x509 certificate validation error seen when joining via the LB public IP.
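A sketch of what a joining node ends up with, assuming the join endpoint is rendered into the standard k3s config file:

    # /etc/rancher/k3s/config.yaml on a secondary CP or worker (sketch)
    server: https://10.0.1.5:6443    # LB private IP, now covered by the primary's TLS SANs
    token: "{{ k3s_token }}"         # join token; variable name assumed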
Add an hcloud_load_balancer_network resource to attach the LB to the private network.
This is required before targets can use use_private_ip=true.
The LB gets IP 10.0.1.5 on the private network.
Major changes:
- Terraform: Scale to 3 control planes (cx23) + 3 workers (cx33)
- Terraform: Add Hetzner Load Balancer (lb11) for Kubernetes API
- Terraform: Add kube_api_lb_ip output
- Ansible: Add community.network collection to requirements
- Ansible: Update inventory to include LB endpoint
- Ansible: Configure secondary CPs and workers to join via LB
- Ansible: Add k3s_join_endpoint variable for HA joins
- Workflow: Add imports for cp-2, cp-3, and worker-3
- Docs: Update STABLE_BASELINE.md with HA topology and phase gates
Topology (sketched as an Ansible inventory below):
- 3 control planes (cx23 - 2 vCPU, 8GB RAM each)
- 3 workers (cx33 - 4 vCPU, 16GB RAM each)
- 1 Load Balancer (lb11) routing to all 3 control planes on port 6443
- Workers and secondary CPs join via LB endpoint for HA
Cost impact: +~€26/month (2 extra CPs + 1 extra worker + LB)
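The topology above as inventory groups (the control_plane group name matches later commits; worker host names and the file path are assumptions):

    # ansible/inventory/hosts.yml (sketch)
    all:
      children:
        control_plane:
          hosts:
            k8s-cluster-cp-1:
            k8s-cluster-cp-2:
            k8s-cluster-cp-3:
        workers:
          hosts:
            k8s-cluster-worker-1:
            k8s-cluster-worker-2:
            k8s-cluster-worker-3:
      vars:
        # HA joins go through the LB (k3s_join_endpoint is from this change)
        k3s_join_endpoint: "https://{{ kube_api_lb_ip }}:6443"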
Document the successful completion of Hetzner CCM and CSI integration:
- CCM deployed via Ansible before workers join (fixes uninitialized taint)
- CSI provides hcloud-volumes StorageClass for persistent storage
- Two consecutive rebuilds passed all phase gates
- PVC provisioning tested and working
The platform now has full cloud provider integration with persistent volumes.
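As an illustration, a minimal claim against the hcloud-volumes StorageClass (the claim name and size are assumptions):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: smoke-test-pvc            # hypothetical name
    spec:
      accessModes:
        - ReadWriteOnce               # hcloud volumes attach to a single node
      storageClassName: hcloud-volumes
      resources:
        requests:
          storage: 10Gi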
The kubernetes.core.helm module requires the helm CLI to be installed on
the target node. Added a check-and-install step using the official
Helm install script.
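A sketch of those tasks, assuming the documented get-helm-3 script URL (task names are illustrative):

    - name: Check whether helm is installed
      ansible.builtin.command: helm version --short
      register: helm_check
      changed_when: false
      failed_when: false

    - name: Install helm via the official script
      ansible.builtin.shell: |
        curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
      when: helm_check.rc != 0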
Deploying CCM via Ansible before workers join fixes the chicken-and-egg
problem where workers with --kubelet-arg=cloud-provider=external couldn't
join because CCM wasn't running yet to remove the
node.cloudprovider.kubernetes.io/uninitialized taint.
Changes:
- Create ansible/roles/ccm-deploy/ to deploy CCM via Helm during Ansible phase
- Reorder site.yml: CCM deploys after secrets but before workers join
- CCM runs on control_plane[0] with proper tolerations for control plane nodes
- Add 10s pause after CCM ready to ensure it can process new nodes
- Workers can now successfully join with external cloud provider enabled
Flux still manages CCM for updates, but the initial install happens in Ansible.
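A sketch of the role's core tasks, assuming Hetzner's public chart repo at https://charts.hetzner.cloud (release name, namespace, and chart values are assumed/trimmed):

    - name: Add the Hetzner Helm repository
      kubernetes.core.helm_repository:
        name: hcloud
        repo_url: https://charts.hetzner.cloud

    - name: Deploy hcloud-cloud-controller-manager
      kubernetes.core.helm:
        name: hccm                    # release name assumed
        chart_ref: hcloud/hcloud-cloud-controller-manager
        release_namespace: kube-system
        wait: true                    # block until CCM pods are ready

    - name: Give CCM a moment to start processing nodes
      ansible.builtin.pause:
        seconds: 10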
- Enable --kubelet-arg=cloud-provider=external on all nodes (control planes and workers)
- Activate CCM Kustomization with 10m timeout for Hetzner cloud-controller-manager
- Activate CSI Kustomization with dependsOn CCM and 10m timeout for hcloud-csi
- Update deploy workflow to wait for CCM/CSI readiness (600s timeout)
- Add providerID verification to post-deploy health checks
This enables proper cloud provider integration with Hetzner CCM for node
labeling and Hetzner CSI for persistent volume provisioning.
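A sketch of the CSI Kustomization with the dependsOn ordering (resource names and path are assumptions; the timeout is from this change):

    apiVersion: kustomize.toolkit.fluxcd.io/v1
    kind: Kustomization
    metadata:
      name: hcloud-csi                # hypothetical name
      namespace: flux-system
    spec:
      interval: 10m                   # assumed
      timeout: 10m                    # per this change
      dependsOn:
        - name: hcloud-ccm            # hypothetical name; CCM must be ready first
      path: ./infrastructure/csi      # hypothetical path
      prune: true
      sourceRef:
        kind: GitRepository
        name: flux-system
      wait: true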