micqdf 6c6b9d20ca
2026-04-22 01:14:21 +00:00

AGENTS.md

Repository guide for OpenCode sessions in this repo.

Read First

  • Trust manifests and workflows over prose when they conflict.
  • Highest-value sources: terraform/main.tf, terraform/variables.tf, ansible/site.yml, clusters/prod/flux-system/, infrastructure/addons/kustomization.yaml, .gitea/workflows/deploy.yml, .gitea/workflows/destroy.yml, README.md, STABLE_BASELINE.md, scripts/refresh-kubeconfig.sh, scripts/smoke-check-tailnet-services.sh.

Current Baseline

  • HA private cluster: 3 control planes, 3 workers.
  • Tailscale is the private access path for Rancher and shared services.
  • Rancher, Grafana, and Prometheus are exposed through Tailscale; Flux UI / Weave GitOps is removed.
  • apps/ is suspended by default.
  • Rancher stores state in embedded etcd; backup/restore uses rancher-backup to B2.
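
The 3 + 3 node baseline above can be checked mechanically. The following is a minimal sketch, assuming that the ROLES column of `kubectl get nodes --no-headers` contains `control-plane` for control-plane nodes and that every other node is a worker. The function names `count_node_roles` and `matches_baseline` are hypothetical and not part of this repo.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Reads `kubectl get nodes --no-headers` output on stdin and prints
# "<control-planes> <workers>". Assumption: column 3 (ROLES) contains
# "control-plane" for control-plane nodes; all other nodes count as workers.
count_node_roles() {
  awk 'NF == 0 { next }
       $3 ~ /control-plane/ { cp++; next }
       { w++ }
       END { printf "%d %d\n", cp + 0, w + 0 }'
}

# Succeeds only when stdin describes 3 control planes and 3 workers.
matches_baseline() {
  [ "$(count_node_roles)" = "3 3" ]
}
```

Possible usage: `kubectl get nodes --no-headers | matches_baseline && echo "baseline OK"`.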

Common Commands

  • Terraform:
    • terraform -chdir=terraform fmt -recursive
    • terraform -chdir=terraform validate
    • terraform -chdir=terraform plan -var-file=../terraform.tfvars
    • terraform -chdir=terraform apply -var-file=../terraform.tfvars
  • Ansible:
    • ansible-galaxy collection install -r ansible/requirements.yml
    • cd ansible && python3 generate_inventory.py
    • ansible-playbook -i ansible/inventory.ini ansible/site.yml --syntax-check
    • ansible-playbook ansible/site.yml
  • Flux/Kustomize:
    • kubectl kustomize infrastructure/addons/<addon>
    • kubectl kustomize clusters/prod/flux-system
  • Kubeconfig refresh: scripts/refresh-kubeconfig.sh <cp1-public-ip>
  • Tailnet smoke check: ssh root@<cp1-ip> 'bash -s' < scripts/smoke-check-tailnet-services.sh
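
One way to choose among the validation commands above is to map the edited path to the command for its directory. This is only a sketch: the helper name `validate_cmd_for` and the path-to-command mapping are assumptions based on the command list, not something the repo ships. It prints the command instead of running it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Prints the validation command for the area that contains the given
# repo-relative path. Returns 1 when no listed command covers the path.
validate_cmd_for() {
  case "$1" in
    terraform/*)
      echo "terraform -chdir=terraform validate" ;;
    ansible/*)
      echo "ansible-playbook -i ansible/inventory.ini ansible/site.yml --syntax-check" ;;
    infrastructure/addons/*/*)
      # Keep only the addon directory: infrastructure/addons/<addon>
      echo "kubectl kustomize $(echo "$1" | cut -d/ -f1-3)" ;;
    clusters/prod/flux-system/*)
      echo "kubectl kustomize clusters/prod/flux-system" ;;
    *)
      return 1 ;;
  esac
}
```

For example, `validate_cmd_for infrastructure/addons/rancher/helmrelease.yaml` prints `kubectl kustomize infrastructure/addons/rancher` (the addon name `rancher` is used here only for illustration).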

Workflow Rules

  • Keep diffs small and validate only the directory you edited.
  • Update manifests and docs together when behavior changes.
  • Use set -euo pipefail in workflow shell blocks.
  • CI deploy order is Terraform -> Ansible -> Flux bootstrap -> Rancher restore -> health checks.
  • One object per Kubernetes YAML file; keep filenames kebab-case.
  • If kubectl points at localhost:8080 after a rebuild, the kubeconfig is missing or stale; refresh it from the primary control-plane IP with scripts/refresh-kubeconfig.sh.
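
The localhost:8080 rule can be turned into a guard that refreshes the kubeconfig only when needed. The sketch below assumes that an empty server URL (no usable current context) should be treated the same as the localhost:8080 fallback. The function names `server_is_stale` and `refresh_if_stale` are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeeds when the API server URL is empty or is the localhost:8080 fallback.
server_is_stale() {
  case "${1:-}" in
    ""|*localhost:8080*) return 0 ;;
    *) return 1 ;;
  esac
}

# Refreshes the kubeconfig only when the current server looks stale.
# $1: public IP of the primary control plane.
refresh_if_stale() {
  local server
  # An unreadable or empty kubeconfig yields an empty string here.
  server="$(kubectl config view --minify \
    -o jsonpath='{.clusters[0].cluster.server}' 2>/dev/null || true)"
  if server_is_stale "$server"; then
    scripts/refresh-kubeconfig.sh "$1"
  fi
}
```

Run from the repo root as `refresh_if_stale <cp1-public-ip>`.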

Repo-Specific Gotchas

  • rancher-backup uses a postRenderer to replace the broken hook image with rancher/kubectl:v1.34.0. Do not put S3 config in HelmRelease values; put it in the Backup CR.
  • Tailscale cleanup runs only before service proxies exist. It removes stale offline rancher, grafana, prometheus, and flux devices, and must then stop so that live proxies are not deleted.
  • Keep the Tailscale operator on the stable Helm repo https://pkgs.tailscale.com/helmcharts at 1.96.5 unless you have a reason to change it.
  • Current private URLs:
    • Rancher: https://rancher.silverside-gopher.ts.net/
    • Grafana: http://grafana.silverside-gopher.ts.net/
    • Prometheus: http://prometheus.silverside-gopher.ts.net:9090/
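
The Tailscale cleanup rule above amounts to a filter plus a guard. The following sketch shows that selection logic only. The input format (`<hostname> <online|offline>` per line), the function names, and the handling of numeric suffixes such as `-1` are assumptions. How the repo lists devices or detects existing proxies is not shown here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Reads "<hostname> <online|offline>" lines on stdin (hypothetical format)
# and prints offline devices named after a proxied service, optionally
# followed by a numeric suffix such as "-1".
select_stale_devices() {
  awk '$2 == "offline" && $1 ~ /^(rancher|grafana|prometheus|flux)(-[0-9]+)?$/ { print $1 }'
}

# Prints removal candidates only while no service proxies exist yet.
# $1: "yes" if service proxies already exist, anything else otherwise.
plan_cleanup() {
  if [ "$1" = "yes" ]; then
    # Proxies are live: consume the input and select nothing.
    cat >/dev/null
    return 0
  fi
  select_stale_devices
}
```

Online devices and devices with unrelated names are never selected, and once proxies exist the plan is always empty.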

Secrets

  • Runtime secrets live in Doppler + External Secrets.
  • Bootstrap and CI secrets stay in Gitea; never commit secrets, kubeconfigs, or private keys.
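
A simple pre-commit guard can help enforce the rule against committing kubeconfigs and private keys. This is a sketch under stated assumptions: the file-name patterns are illustrative rather than a list taken from this repo, the names `looks_sensitive` and `check_paths` are hypothetical, and name matching cannot detect secrets embedded in file contents.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeeds when a file name looks like a kubeconfig, private key, or env file.
# The patterns are illustrative assumptions.
looks_sensitive() {
  case "$(basename "$1")" in
    *kubeconfig*|*.pem|*.key|id_rsa|id_ed25519|.env) return 0 ;;
    *) return 1 ;;
  esac
}

# Reads file paths on stdin and fails if any of them looks sensitive.
check_paths() {
  local path status=0
  while IFS= read -r path; do
    if looks_sensitive "$path"; then
      echo "refusing to commit: $path" >&2
      status=1
    fi
  done
  return "$status"
}
```

Possible usage before committing: `git diff --cached --name-only | check_paths`.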