AGENTS.md
Repository guide for agentic contributors working in this repo.
Scope
- This is an infrastructure repository for a Hetzner + k3s + Flux stack.
- Primary areas: `terraform/`, `ansible/`, `clusters/`, `infrastructure/`, `.gitea/workflows/`.
- Treat `README.md` and `STABLE_BASELINE.md` as user-facing context, but prefer the repo's current manifests and workflows as the source of truth.
- Keep changes small and reviewable; prefer the narrowest file set that solves the task.
Current Tooling
- Terraform for cloud infra and state-backed provisioning.
- Ansible for bootstrap, OS prep, k3s install, and pre-Flux prerequisites.
- Flux/Kustomize for cluster and addon reconciliation.
- Python for inventory generation (`ansible/generate_inventory.py`).
Important Files
- `terraform/main.tf` - provider and version pins.
- `terraform/variables.tf` - input surface and defaults.
- `terraform/*.tf` - Hetzner network, firewall, servers, SSH, outputs.
- `ansible/site.yml` - ordered bootstrap playbook.
- `ansible/generate_inventory.py` - renders `ansible/inventory.ini` from Terraform outputs.
- `clusters/prod/flux-system/` - Flux source and top-level reconciliation graph.
- `infrastructure/addons/<addon>/` - Flux-managed addon manifests.
- `.gitea/workflows/*.yml` - CI/CD entry points and the best reference for expected commands.
Build / Validate / Test
Terraform
- Format all Terraform: `terraform -chdir=terraform fmt -recursive`
- Check formatting: `terraform -chdir=terraform fmt -check -recursive`
- Validate config: `terraform -chdir=terraform validate`
- Full plan: `terraform -chdir=terraform plan -var-file=../terraform.tfvars`
- Apply: `terraform -chdir=terraform apply -var-file=../terraform.tfvars`
- Destroy: `terraform -chdir=terraform destroy -var-file=../terraform.tfvars`
Terraform, single-target / focused checks
- Plan one resource: `terraform -chdir=terraform plan -var-file=../terraform.tfvars -target=hcloud_server.control_plane[0]`
- Import/check existing state: use `terraform state list` and `terraform state show <address>` before editing imports.
- If you touch only Terraform formatting, run `terraform fmt -check -recursive` first.
Ansible
- Install collections: `ansible-galaxy collection install -r ansible/requirements.yml`
- Generate inventory: `cd ansible && python3 generate_inventory.py`
- Syntax check: `ansible-playbook -i ansible/inventory.ini ansible/site.yml --syntax-check`
- Dry-run one host: `ansible-playbook -i ansible/inventory.ini ansible/site.yml --check --diff -l control_plane[0]`
- Run the bootstrap playbook: `ansible-playbook ansible/site.yml`
- Targeted maintenance: `ansible-playbook ansible/site.yml -t upgrade` or `-t reset`
- Dashboards only: `ansible-playbook ansible/dashboards.yml`
Python
- Syntax check the inventory generator: `python3 -m py_compile ansible/generate_inventory.py`
- If you modify the script, run it after Terraform outputs exist: `cd ansible && python3 generate_inventory.py`.
Kubernetes / Flux manifests
- Render a single addon: `kubectl kustomize infrastructure/addons/<addon>`
- Render cluster bootstrap objects: `kubectl kustomize clusters/prod/flux-system`
- Prefer validating the exact directory you edited, not the whole repo, unless the change is cross-cutting.
- For Flux changes, verify the relevant `Kustomization`/`HelmRelease`/`ExternalSecret` manifests render cleanly before committing.
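To validate only the directories you edited, one option is to derive the affected addon directories from a list of changed paths. The sketch below is illustrative, not part of the repo: `changed_addon_dirs` is a hypothetical helper, and the addon names in the sample input are placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: reads changed file paths on stdin and prints the
# unique addon directories (infrastructure/addons/<addon>) they belong to.
# Paths outside infrastructure/addons/<addon>/ are ignored.
changed_addon_dirs() {
  sed -nE 's#^(infrastructure/addons/[^/]+)/.*#\1#p' | sort -u
}

# Sample input with placeholder addon names (addon-a, addon-b).
printf '%s\n' \
  'infrastructure/addons/addon-a/kustomization.yaml' \
  'infrastructure/addons/addon-a/helmrelease-addon-a.yaml' \
  'terraform/main.tf' \
  'infrastructure/addons/addon-b/kustomization.yaml' \
  | changed_addon_dirs
```

In practice the input could come from something like `git diff --name-only HEAD`, with each printed directory passed to `kubectl kustomize`.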
Kubeconfig refresh
After a full cluster rebuild, the kubeconfig goes stale (new certs, new IPs). Refresh it with:
- `scripts/refresh-kubeconfig.sh <cp1-public-ip>` (preferred)
- Or manually: `ssh -i ~/.ssh/infra root@<cp1-ip> "cat /etc/rancher/k3s/k3s.yaml" | sed 's/127.0.0.1/<cp1-ip>/g' > outputs/kubeconfig`
- The Ansible `site.yml` Finalize step also rewrites the server address to the public IP during bootstrap.
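The address rewrite in the manual step can be sketched as a small filter. This is a minimal sketch, not the contents of `scripts/refresh-kubeconfig.sh`: the function name `rewrite_kubeconfig_server` is hypothetical, port `6443` in the sample is an assumption about the k3s default, and `203.0.113.10` is a documentation-range placeholder. Unlike the one-liner, it escapes the dots and only touches the `server:` URL.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: reads a kubeconfig on stdin and replaces the
# loopback server address with the given public IP, keeping the port.
rewrite_kubeconfig_server() {
  local public_ip="$1"
  # Match only the server URL; the dots are escaped so they match literally.
  sed -E "s#(server: https://)127\.0\.0\.1(:[0-9]+)#\1${public_ip}\2#"
}

# Inline sample instead of a live cluster.
printf 'clusters:\n- cluster:\n    server: https://127.0.0.1:6443\n' \
  | rewrite_kubeconfig_server 203.0.113.10
```

Against a real control plane, the input would be the output of the `ssh ... "cat /etc/rancher/k3s/k3s.yaml"` command shown above, redirected to `outputs/kubeconfig`.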
Code Style
General
- Match the existing style in adjacent files.
- Prefer ASCII unless the file already uses Unicode or a Unicode character is necessary.
- Do not introduce new tools, frameworks, or abstractions unless the repo already uses them.
- Keep diffs minimal and avoid unrelated cleanup.
Terraform / HCL
- Use 2-space indentation.
- Keep `terraform {}` blocks first, then providers, locals, variables, resources, and outputs in a logical order.
- Name variables, locals, and resources in `snake_case`.
- Keep descriptions on variables and outputs.
- Mark sensitive values with `sensitive = true`.
- Use aligned `=` formatting when practical; run `terraform fmt` instead of hand-formatting.
- Prefer explicit `depends_on` only when required.
- Keep logic in `locals` if it is reused or non-trivial.
Ansible / YAML
- Use 2-space YAML indentation.
- Use descriptive task names in sentence case (e.g. `Install k3s server`).
- Keep tasks idempotent; use `changed_when: false` and `failed_when: false` for probes and checks.
- Use `command`/`shell` only when a dedicated module is not a better fit.
- Use `shell` only when you need pipes, redirection, heredocs, or shell expansion.
- Prefer `when` guards and `default(...)` filters over duplicating tasks.
- Keep role names and file names kebab-case; keep variables snake_case.
- For multi-line shell snippets in workflows or tasks, use `set -e` or `set -euo pipefail` when the command sequence should fail fast.
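The effect of the fail-fast options can be seen in a short comparison. This is an illustrative sketch rather than a snippet from the repo's workflows; each case runs in a child `bash` so the failure does not stop the outer script.

```shell
#!/usr/bin/env bash

# Without pipefail, a pipeline's exit status is that of its last command,
# so the failing `false` is masked by `cat` and the script carries on.
lenient_status=0
bash -c 'false | cat; echo "continued"' >/dev/null || lenient_status=$?

# With set -euo pipefail, the failed pipeline aborts the script before
# the echo runs, and the child exits non-zero.
strict_status=0
bash -c 'set -euo pipefail; false | cat; echo "not reached"' >/dev/null || strict_status=$?

echo "without pipefail: exit ${lenient_status}"
echo "with set -euo pipefail: exit ${strict_status}"
```

The first case exits 0 even though a command in the pipe failed; the second exits 1, which is the behavior you want when a later step depends on an earlier one.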
Kubernetes / Flux YAML
- Keep one Kubernetes object per file unless the repo already groups a small set of tightly related objects.
- Use kebab-case filenames that match the repo pattern (`helmrelease-*.yaml`, `kustomization-*.yaml`, `*-externalsecret.yaml`).
- Keep addon manifests under `infrastructure/addons/<addon>/` with a nested `kustomization.yaml`.
- Keep Flux graph objects in `clusters/prod/flux-system/`.
- Quote strings that contain `:`, `*`, cron expressions, or shell-sensitive characters.
- Preserve existing labels/annotations unless the change specifically needs them.
Python
- Follow PEP 8 style and keep imports ordered: stdlib, third-party, local.
- Use `snake_case` for functions and variables.
- Keep scripts small and explicit; exit non-zero on failure.
- Prefer clear subprocess error handling over silent failures.
Editing Practices
- Read the target file and adjacent patterns before editing.
- Preserve user changes; do not overwrite unrelated diffs.
- Prefer `apply_patch` for small single-file edits.
- Use scripting only when it is cleaner than repeated manual edits.
- Keep comments minimal and only add them for non-obvious logic.
Secrets / Security
- Never commit tokens, passwords, kubeconfigs, private keys, or generated secrets.
- Use Gitea secrets, Doppler, or External Secrets for runtime secrets.
- Avoid printing secret values in logs, comments, or commit messages.
- If you must inspect a secret locally, only verify shape/length or compare values indirectly.
Workflow Expectations
- Read the target file and nearby patterns before editing.
- Check `git status` before and after your changes.
- Run the narrowest relevant validation command after edits.
- If you make a live-cluster workaround, also update the declarative manifests so Flux can own it.
- Do not overwrite user changes you did not make.
- If a change spans Terraform + Ansible + Flux, update and verify each layer separately.
CI / Workflow Notes
- CI currently uses `.gitea/workflows/deploy.yml`, `.gitea/workflows/destroy.yml`, and `.gitea/workflows/dashboards.yml` as the canonical automation references.
- The workflows run `terraform fmt -check -recursive`, `terraform validate`, Terraform plan/apply, Ansible bootstrap, and targeted Flux bootstrap steps.
- If you change workflow behavior, keep the repo docs and the workflow commands in sync.
Cursor / Copilot Rules
- No `.cursor/rules/`, `.cursorrules`, or `.github/copilot-instructions.md` files were present when this file was created.
- If those files are added later, mirror their guidance here and treat them as authoritative.