HetznerTerra/AGENTS.md
MichaelFisher1997 6e5b0518be
feat: Add kubeconfig refresh script and fix Ansible Finalize to use public IP
- scripts/refresh-kubeconfig.sh fetches a fresh kubeconfig from CP1
- Ansible site.yml Finalize step now uses public IP instead of Tailscale
  hostname for the kubeconfig server address
- Updated AGENTS.md with kubeconfig refresh instructions
2026-03-29 03:31:36 +00:00


AGENTS.md

Repository guide for agentic contributors working in this repo.

Scope

  • This is an infrastructure repository for a Hetzner + k3s + Flux stack.
  • Primary areas: terraform/, ansible/, clusters/, infrastructure/, .gitea/workflows/.
  • Treat README.md and STABLE_BASELINE.md as user-facing context, but prefer the repo's current manifests and workflows as the source of truth.
  • Keep changes small and reviewable; prefer the narrowest file set that solves the task.

Current Tooling

  • Terraform for cloud infra and state-backed provisioning.
  • Ansible for bootstrap, OS prep, k3s install, and pre-Flux prerequisites.
  • Flux/Kustomize for cluster and addon reconciliation.
  • Python for inventory generation (ansible/generate_inventory.py).

Important Files

  • terraform/main.tf - provider and version pins.
  • terraform/variables.tf - input surface and defaults.
  • terraform/*.tf - Hetzner network, firewall, servers, SSH, outputs.
  • ansible/site.yml - ordered bootstrap playbook.
  • ansible/generate_inventory.py - renders ansible/inventory.ini from Terraform outputs.
  • clusters/prod/flux-system/ - Flux source and top-level reconciliation graph.
  • infrastructure/addons/<addon>/ - Flux-managed addon manifests.
  • .gitea/workflows/*.yml - CI/CD entry points and the best reference for expected commands.

Build / Validate / Test

Terraform

  • Format all Terraform: terraform -chdir=terraform fmt -recursive
  • Check formatting: terraform -chdir=terraform fmt -check -recursive
  • Validate config: terraform -chdir=terraform validate
  • Full plan: terraform -chdir=terraform plan -var-file=../terraform.tfvars
  • Apply: terraform -chdir=terraform apply -var-file=../terraform.tfvars
  • Destroy: terraform -chdir=terraform destroy -var-file=../terraform.tfvars

Terraform, single-target / focused checks

  • Plan one resource: terraform -chdir=terraform plan -var-file=../terraform.tfvars -target='hcloud_server.control_plane[0]' (quote the address so the shell does not expand the brackets)
  • Import/check existing state: use terraform -chdir=terraform state list and terraform -chdir=terraform state show <address> before editing imports.
  • If you touch only Terraform formatting, run terraform -chdir=terraform fmt -check -recursive first.

Ansible

  • Install collections: ansible-galaxy collection install -r ansible/requirements.yml
  • Generate inventory: cd ansible && python3 generate_inventory.py
  • Syntax check: ansible-playbook -i ansible/inventory.ini ansible/site.yml --syntax-check
  • Dry-run one host: ansible-playbook -i ansible/inventory.ini ansible/site.yml --check --diff -l 'control_plane[0]'
  • Run the bootstrap playbook: ansible-playbook ansible/site.yml
  • Targeted maintenance: ansible-playbook ansible/site.yml -t upgrade (or -t reset for a reset)
  • Dashboards only: ansible-playbook ansible/dashboards.yml

Python

  • Syntax check the inventory generator: python3 -m py_compile ansible/generate_inventory.py
  • If you modify the script, run it after Terraform outputs exist: cd ansible && python3 generate_inventory.py.
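
To illustrate the Terraform-to-inventory flow described above, here is a minimal sketch of rendering an INI inventory from parsed terraform output -json data. It is not the repo's actual generate_inventory.py: the output names (control_plane_ips, worker_ips) and the workers group name are assumptions made for the example.

```python
def render_inventory(tf_outputs: dict) -> str:
    """Render an INI inventory from parsed `terraform output -json` data.

    The output keys below are hypothetical; check terraform/outputs for the
    real names before reusing this.
    """
    groups = {
        "control_plane": tf_outputs["control_plane_ips"]["value"],
        "workers": tf_outputs["worker_ips"]["value"],
    }
    lines = []
    for group, hosts in groups.items():
        lines.append(f"[{group}]")
        lines.extend(hosts)
        lines.append("")  # blank line after each group
    return "\n".join(lines)
```

Keeping the rendering in a pure function like this makes it possible to check the output format without a live Terraform state.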

Kubernetes / Flux manifests

  • Render a single addon: kubectl kustomize infrastructure/addons/<addon>
  • Render cluster bootstrap objects: kubectl kustomize clusters/prod/flux-system
  • Prefer validating the exact directory you edited, not the whole repo, unless the change is cross-cutting.
  • For Flux changes, verify the relevant Kustomization/HelmRelease/ExternalSecret manifests render cleanly before committing.

Kubeconfig refresh

After a full cluster rebuild, the kubeconfig goes stale (new certs, new IPs). Refresh it with:

  • scripts/refresh-kubeconfig.sh <cp1-public-ip> (preferred)
  • Or manually: ssh -i ~/.ssh/infra root@<cp1-ip> "cat /etc/rancher/k3s/k3s.yaml" | sed 's/127\.0\.0\.1/<cp1-ip>/g' > outputs/kubeconfig
  • The Ansible site.yml Finalize step also rewrites the server address to the public IP during bootstrap.
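
The address rewrite in the manual command above can also be sketched in Python. This is an illustrative helper, not the contents of scripts/refresh-kubeconfig.sh; the function name rewrite_server is hypothetical. Unlike the sed one-liner, it only touches server: lines, so a 127.0.0.1 that appears elsewhere in the file is left alone.

```python
import re

# Matches the loopback host on a `server: https://...` line only.
_SERVER_RE = re.compile(
    r"^([ \t]*server:[ \t]*https://)127\.0\.0\.1(?=[:/\s]|$)",
    re.MULTILINE,
)


def rewrite_server(kubeconfig_text: str, public_ip: str) -> str:
    """Point loopback `server:` entries at the given IP, keeping the port."""
    return _SERVER_RE.sub(lambda m: m.group(1) + public_ip, kubeconfig_text)
```

For example, a line reading server: https://127.0.0.1:6443 becomes server: https://203.0.113.10:6443 when called with the placeholder address 203.0.113.10.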

Code Style

General

  • Match the existing style in adjacent files.
  • Prefer ASCII unless the file already uses Unicode or a Unicode character is necessary.
  • Do not introduce new tools, frameworks, or abstractions unless the repo already uses them.
  • Keep diffs minimal and avoid unrelated cleanup.

Terraform / HCL

  • Use 2-space indentation.
  • Keep terraform {} blocks first, then providers, locals, variables, resources, and outputs in a logical order.
  • Name variables, locals, and resources in snake_case.
  • Keep descriptions on variables and outputs.
  • Mark sensitive values with sensitive = true.
  • Use aligned = formatting when practical; run terraform fmt instead of hand-formatting.
  • Prefer explicit depends_on only when required.
  • Keep logic in locals if it is reused or non-trivial.

Ansible / YAML

  • Use 2-space YAML indentation.
  • Use descriptive task names in sentence case (e.g. Install k3s server).
  • Keep tasks idempotent; use changed_when: false for read-only probes and checks, and add failed_when: false where a non-zero exit is an expected result rather than an error.
  • Use command/shell only when a dedicated module is not a better fit.
  • Use shell only when you need pipes, redirection, heredocs, or shell expansion.
  • Prefer when guards and default(...) filters over duplicating tasks.
  • Keep role names and file names kebab-case; keep variables snake_case.
  • For multi-line shell snippets in workflows or tasks, use set -e or set -euo pipefail when the command sequence should fail fast.

Kubernetes / Flux YAML

  • Keep one Kubernetes object per file unless the repo already groups a small set of tightly related objects.
  • Use kebab-case filenames that match the repo pattern (helmrelease-*.yaml, kustomization-*.yaml, *-externalsecret.yaml).
  • Keep addon manifests under infrastructure/addons/<addon>/ with a nested kustomization.yaml.
  • Keep Flux graph objects in clusters/prod/flux-system/.
  • Quote strings that contain :, *, cron expressions, or shell-sensitive characters.
  • Preserve existing labels/annotations unless the change specifically needs to modify them.

Python

  • Follow PEP 8 style and keep imports ordered: stdlib, third-party, local.
  • Use snake_case for functions and variables.
  • Keep scripts small and explicit; exit non-zero on failure.
  • Prefer clear subprocess error handling over silent failures.
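
The last two points can be sketched as a small wrapper around subprocess.run. The helper name run_checked is hypothetical and not taken from the repo; it is one possible shape for "clear error handling, non-zero exit", not the script's actual code.

```python
import subprocess
import sys


def run_checked(cmd: list[str]) -> str:
    """Run a command and return its stdout; exit non-zero with context on failure."""
    try:
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    except FileNotFoundError:
        # The binary itself is missing (e.g. terraform not installed).
        sys.exit(f"error: command not found: {cmd[0]}")
    except subprocess.CalledProcessError as exc:
        # Surface the exit code and stderr instead of failing silently.
        sys.exit(f"error: {' '.join(cmd)} exited {exc.returncode}: {exc.stderr.strip()}")
    return result.stdout
```

Passing a string to sys.exit prints it to stderr and exits with status 1, so callers and CI steps see the failure.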

Editing Practices

  • Read the target file and adjacent patterns before editing.
  • Preserve user changes; do not overwrite unrelated diffs.
  • Prefer apply_patch for small single-file edits.
  • Use scripting only when it is cleaner than repeated manual edits.
  • Keep comments minimal and only add them for non-obvious logic.

Secrets / Security

  • Never commit tokens, passwords, kubeconfigs, private keys, or generated secrets.
  • Use Gitea secrets, Doppler, or External Secrets for runtime secrets.
  • Avoid printing secret values in logs, comments, or commit messages.
  • If you must inspect a secret locally, only verify shape/length or compare values indirectly.
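
One way to follow the last point is sketched below: report only a secret's length and a short hash prefix, and compare two values without printing either. The helper names are hypothetical. Note that a hash prefix can still help an attacker guess a short or low-entropy secret, so treat even this output as sensitive.

```python
import hashlib
import hmac


def describe_secret(value: str) -> str:
    """Summarize a secret without revealing it: length plus a short hash prefix."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return f"len={len(value)} sha256={digest[:8]}"


def secrets_match(a: str, b: str) -> bool:
    """Compare two secrets in constant time without printing either one."""
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))
```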

Workflow Expectations

  • Read the target file and nearby patterns before editing.
  • Check git status before and after your changes.
  • Run the narrowest relevant validation command after edits.
  • If you make a live-cluster workaround, also update the declarative manifests so Flux can own it.
  • Do not overwrite changes you did not make.
  • If a change spans Terraform + Ansible + Flux, update and verify each layer separately.
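
As a rough aid for picking "the narrowest relevant validation command", the sketch below maps changed paths to the commands listed earlier in this guide. The function is hypothetical and not part of the repo; the path rules simply mirror the directory layout described above.

```python
def validation_commands(changed_paths: list[str]) -> list[str]:
    """Map changed paths to the narrowest validation commands from this guide."""
    commands: list[str] = []

    def add(cmd: str) -> None:
        # Keep first-seen order and avoid duplicates.
        if cmd not in commands:
            commands.append(cmd)

    for path in changed_paths:
        parts = path.split("/")
        if parts[0] == "terraform":
            add("terraform -chdir=terraform fmt -check -recursive")
            add("terraform -chdir=terraform validate")
        elif path == "ansible/generate_inventory.py":
            add("python3 -m py_compile ansible/generate_inventory.py")
        elif parts[0] == "ansible":
            add("ansible-playbook -i ansible/inventory.ini ansible/site.yml --syntax-check")
        elif parts[:2] == ["infrastructure", "addons"] and len(parts) > 3:
            # Render only the addon directory that was touched.
            add(f"kubectl kustomize infrastructure/addons/{parts[2]}")
        elif parts[:3] == ["clusters", "prod", "flux-system"]:
            add("kubectl kustomize clusters/prod/flux-system")
    return commands
```

Feeding it the output of git diff --name-only gives a short checklist; paths outside these directories produce no command and need a manual decision.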

CI / Workflow Notes

  • CI currently uses .gitea/workflows/deploy.yml, .gitea/workflows/destroy.yml, and .gitea/workflows/dashboards.yml as the canonical automation references.
  • The workflows run terraform fmt -check -recursive, terraform validate, Terraform plan/apply, Ansible bootstrap, and targeted Flux bootstrap steps.
  • If you change workflow behavior, keep the repo docs and the workflow commands in sync.

Cursor / Copilot Rules

  • No .cursor/rules/, .cursorrules, or .github/copilot-instructions.md files were present when this file was created.
  • If those files are added later, mirror their guidance here and treat them as authoritative.