docs: Add agent guidance and sync Rancher docs
149
AGENTS.md
Normal file
@@ -0,0 +1,149 @@
# AGENTS.md

Repository guide for agentic contributors working in this repo.

## Scope

- This is an infrastructure repository for a Hetzner + k3s + Flux stack.
- Primary areas: `terraform/`, `ansible/`, `clusters/`, `infrastructure/`, `.gitea/workflows/`.
- Treat `README.md` and `STABLE_BASELINE.md` as user-facing context, but prefer the repo's current manifests and workflows as the source of truth.
- Keep changes small and reviewable; prefer the narrowest file set that solves the task.

## Current Tooling

- Terraform for cloud infra and state-backed provisioning.
- Ansible for bootstrap, OS prep, k3s install, and pre-Flux prerequisites.
- Flux/Kustomize for cluster and addon reconciliation.
- Python for inventory generation (`ansible/generate_inventory.py`).

## Important Files

- `terraform/main.tf` - provider and version pins.
- `terraform/variables.tf` - input surface and defaults.
- `terraform/*.tf` - Hetzner network, firewall, servers, SSH, outputs.
- `ansible/site.yml` - ordered bootstrap playbook.
- `ansible/generate_inventory.py` - renders `ansible/inventory.ini` from Terraform outputs.
- `clusters/prod/flux-system/` - Flux source and top-level reconciliation graph.
- `infrastructure/addons/<addon>/` - Flux-managed addon manifests.
- `.gitea/workflows/*.yml` - CI/CD entry points and the best reference for expected commands.

## Build / Validate / Test

### Terraform

- Format all Terraform: `terraform -chdir=terraform fmt -recursive`
- Check formatting: `terraform -chdir=terraform fmt -check -recursive`
- Validate config: `terraform -chdir=terraform validate`
- Full plan: `terraform -chdir=terraform plan -var-file=../terraform.tfvars`
- Apply: `terraform -chdir=terraform apply -var-file=../terraform.tfvars`
- Destroy: `terraform -chdir=terraform destroy -var-file=../terraform.tfvars`

### Terraform, single-target / focused checks

- Plan one resource (quote the address so the shell does not glob the brackets): `terraform -chdir=terraform plan -var-file=../terraform.tfvars -target='hcloud_server.control_plane[0]'`
- Import/check existing state: use `terraform -chdir=terraform state list` and `terraform -chdir=terraform state show <address>` before editing imports.
- If you touch only Terraform formatting, run `terraform -chdir=terraform fmt -check -recursive` first.

### Ansible

- Install collections: `ansible-galaxy collection install -r ansible/requirements.yml`
- Generate inventory: `cd ansible && python3 generate_inventory.py`
- Syntax check: `ansible-playbook -i ansible/inventory.ini ansible/site.yml --syntax-check`
- Dry-run one host (quote the pattern so the shell does not glob the brackets): `ansible-playbook -i ansible/inventory.ini ansible/site.yml --check --diff -l 'control_plane[0]'`
- Run the bootstrap playbook: `ansible-playbook ansible/site.yml`
- Targeted maintenance: `ansible-playbook ansible/site.yml -t upgrade` or `-t reset`
- Dashboards only: `ansible-playbook ansible/dashboards.yml`

### Python

- Syntax check the inventory generator: `python3 -m py_compile ansible/generate_inventory.py`
- If you modify the script, run it after Terraform outputs exist: `cd ansible && python3 generate_inventory.py`.
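
To illustrate what rendering an inventory from Terraform outputs can look like, here is a minimal sketch. It is not the repo's `generate_inventory.py`: the output names `control_plane_ips` and `worker_ips`, the group names, and the `ansible_user=root` host variable are all assumptions made for illustration. Only the `{"name": {"value": ...}}` shape of `terraform output -json` is taken as given.

```python
import json


def render_inventory(tf_outputs: dict) -> str:
    """Render INI inventory text from parsed `terraform output -json` data.

    The output names (`control_plane_ips`, `worker_ips`) and group names
    are hypothetical; adjust them to the outputs the repo actually defines.
    """
    groups = {
        "control_plane": tf_outputs["control_plane_ips"]["value"],
        "workers": tf_outputs["worker_ips"]["value"],
    }
    lines = []
    for group, hosts in groups.items():
        lines.append(f"[{group}]")
        # `ansible_user=root` is an assumption, not taken from the repo.
        lines.extend(f"{host} ansible_user=root" for host in hosts)
        lines.append("")
    return "\n".join(lines)


# Stand-in for the JSON printed by `terraform -chdir=terraform output -json`.
sample = json.loads(
    '{"control_plane_ips": {"value": ["10.0.1.1"]},'
    ' "worker_ips": {"value": ["10.0.2.1", "10.0.2.2"]}}'
)
print(render_inventory(sample))
```

Keeping the rendering step a pure function of the parsed outputs makes it possible to check the script's logic before any Terraform state exists.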

### Kubernetes / Flux manifests

- Render a single addon: `kubectl kustomize infrastructure/addons/<addon>`
- Render cluster bootstrap objects: `kubectl kustomize clusters/prod/flux-system`
- Prefer validating the exact directory you edited, not the whole repo, unless the change is cross-cutting.
- For Flux changes, verify the relevant `Kustomization`/`HelmRelease`/`ExternalSecret` manifests render cleanly before committing.

## Code Style

### General

- Match the existing style in adjacent files.
- Prefer ASCII unless the file already uses Unicode or a Unicode character is necessary.
- Do not introduce new tools, frameworks, or abstractions unless the repo already uses them.
- Keep diffs minimal and avoid unrelated cleanup.

### Terraform / HCL

- Use 2-space indentation.
- Keep `terraform {}` blocks first, then providers, locals, variables, resources, and outputs in a logical order.
- Name variables, locals, and resources in `snake_case`.
- Keep descriptions on variables and outputs.
- Mark sensitive values with `sensitive = true`.
- Use aligned `=` formatting when practical; run `terraform fmt` instead of hand-formatting.
- Add explicit `depends_on` only when required.
- Keep logic in `locals` if it is reused or non-trivial.

### Ansible / YAML

- Use 2-space YAML indentation.
- Use descriptive task names in sentence case (e.g. `Install k3s server`).
- Keep tasks idempotent; for read-only probes and checks, set `changed_when: false`, and add `failed_when: false` only when a non-zero exit is an expected outcome.
- Use `command`/`shell` only when a dedicated module is not a better fit.
- Use `shell` only when you need pipes, redirection, heredocs, or shell expansion.
- Prefer `when` guards and `default(...)` filters over duplicating tasks.
- Keep role names and file names kebab-case; keep variables snake_case.
- For multi-line shell snippets in workflows or tasks, use `set -e` or `set -euo pipefail` when the command sequence should fail fast.

### Kubernetes / Flux YAML

- Keep one Kubernetes object per file unless the repo already groups a small set of tightly related objects.
- Use kebab-case filenames that match the repo pattern (`helmrelease-*.yaml`, `kustomization-*.yaml`, `*-externalsecret.yaml`).
- Keep addon manifests under `infrastructure/addons/<addon>/` with a nested `kustomization.yaml`.
- Keep Flux graph objects in `clusters/prod/flux-system/`.
- Quote strings that contain `:`, `*`, cron expressions, or shell-sensitive characters.
- Preserve existing labels/annotations unless the change specifically requires modifying them.
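
The filename convention above can be checked mechanically. The following is a rough sketch, not a tool that exists in the repo: it tests a name against the three patterns quoted in the list plus the nested `kustomization.yaml`. The function name and the example filenames are hypothetical, and a name that fails the check may still be legitimate if the repo uses other patterns.

```python
from fnmatch import fnmatchcase

# Patterns quoted in the list above, plus the nested `kustomization.yaml`.
KNOWN_PATTERNS = (
    "helmrelease-*.yaml",
    "kustomization-*.yaml",
    "*-externalsecret.yaml",
    "kustomization.yaml",
)


def follows_known_pattern(filename: str) -> bool:
    """Return True if the name is lowercase and matches a listed pattern."""
    return filename == filename.lower() and any(
        fnmatchcase(filename, pattern) for pattern in KNOWN_PATTERNS
    )


# Hypothetical filenames, used only to exercise the check.
for name in ("helmrelease-example.yaml", "HelmRelease-example.yaml"):
    print(name, follows_known_pattern(name))
```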

### Python

- Follow PEP 8 style and keep imports ordered: stdlib, third-party, local.
- Use `snake_case` for functions and variables.
- Keep scripts small and explicit; exit non-zero on failure.
- Prefer clear subprocess error handling over silent failures.
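
As one possible reading of the last two points, the sketch below wraps `subprocess.run` so that a missing binary or a failing command produces a message on stderr and a non-zero exit. The helper name `run` and the exit code 127 for a missing command are illustrative choices, not something taken from the repo's script.

```python
import subprocess
import sys


def run(cmd: list[str]) -> str:
    """Run a command and return its stdout; exit non-zero on any failure."""
    try:
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    except FileNotFoundError:
        # The executable itself is missing from PATH.
        print(f"error: command not found: {cmd[0]}", file=sys.stderr)
        sys.exit(127)
    except subprocess.CalledProcessError as exc:
        # The command ran but returned a non-zero status.
        print(f"error: {' '.join(cmd)} exited with {exc.returncode}", file=sys.stderr)
        print(exc.stderr, file=sys.stderr)
        sys.exit(exc.returncode)
    return result.stdout


if __name__ == "__main__":
    # Uses the current interpreter so the example runs without extra tools.
    print(run([sys.executable, "-c", "print('ok')"]).strip())
```

Because stderr is echoed on failure, a wrapper like this should not be pointed at commands that may print secret values.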

## Editing Practices

- Read the target file and adjacent patterns before editing.
- Preserve user changes; do not overwrite unrelated diffs.
- Prefer `apply_patch` for small single-file edits.
- Use scripting only when it is cleaner than repeated manual edits.
- Keep comments minimal and only add them for non-obvious logic.

## Secrets / Security

- Never commit tokens, passwords, kubeconfigs, private keys, or generated secrets.
- Use Gitea secrets, Doppler, or External Secrets for runtime secrets.
- Avoid printing secret values in logs, comments, or commit messages.
- If you must inspect a secret locally, only verify shape/length or compare values indirectly.
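
One way to follow the last point is sketched below: report only a secret's shape, and compare two values through their SHA-256 digests rather than printing either one. The helper names and the placeholder token are hypothetical; real values would come from the environment or a secret store, never from source code.

```python
import hashlib
import hmac


def describe_secret(value: str) -> dict:
    """Summarize a secret's shape without revealing its content."""
    return {
        "length": len(value),
        "has_whitespace": any(ch.isspace() for ch in value),
        "is_ascii": value.isascii(),
    }


def same_secret(a: str, b: str) -> bool:
    """Compare two secrets via SHA-256 digests in constant time."""
    digest_a = hashlib.sha256(a.encode("utf-8")).digest()
    digest_b = hashlib.sha256(b.encode("utf-8")).digest()
    return hmac.compare_digest(digest_a, digest_b)


# Placeholder values only; never hard-code real secrets.
local_value = "example-token-123"
cluster_value = "example-token-123"
print(describe_secret(local_value))
print(same_secret(local_value, cluster_value))
```

The whitespace check is useful in practice because a stray trailing newline is a common reason two copies of the same secret fail to match.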

## Workflow Expectations

- Read the target file and nearby patterns before editing.
- Check `git status` before and after your changes.
- Run the narrowest relevant validation command after edits.
- If you make a live-cluster workaround, also update the declarative manifests so Flux can own it.
- Do not overwrite user changes you did not make.
- If a change spans Terraform + Ansible + Flux, update and verify each layer separately.
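
The "narrowest relevant validation" rule can be pictured as a mapping from changed paths to the commands listed earlier in this guide. The sketch below is an illustration only: the function name is hypothetical, the mapping is one possible interpretation, and the example addon name `example-addon` does not refer to a real directory.

```python
def validation_commands(changed_paths: list[str]) -> list[str]:
    """Map changed paths to the narrowest validation commands from this guide."""
    commands: list[str] = []

    def add(cmd: str) -> None:
        # Keep first-seen order and avoid duplicates.
        if cmd not in commands:
            commands.append(cmd)

    for path in changed_paths:
        parts = path.split("/")
        if path.startswith("terraform/"):
            add("terraform -chdir=terraform fmt -check -recursive")
            add("terraform -chdir=terraform validate")
        elif path == "ansible/generate_inventory.py":
            add("python3 -m py_compile ansible/generate_inventory.py")
        elif path.startswith("ansible/"):
            add("ansible-playbook -i ansible/inventory.ini ansible/site.yml --syntax-check")
        elif path.startswith("infrastructure/addons/") and len(parts) > 3:
            # Render only the addon directory that was touched.
            add(f"kubectl kustomize infrastructure/addons/{parts[2]}")
        elif path.startswith("clusters/prod/flux-system/"):
            add("kubectl kustomize clusters/prod/flux-system")
    return commands


for cmd in validation_commands([
    "terraform/main.tf",
    "infrastructure/addons/example-addon/kustomization.yaml",
]):
    print(cmd)
```

A change that touches several layers yields one group of commands per layer, which matches the expectation that each layer is verified separately.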

## CI / Workflow Notes

- CI currently uses `.gitea/workflows/deploy.yml`, `.gitea/workflows/destroy.yml`, and `.gitea/workflows/dashboards.yml` as the canonical automation references.
- The workflows run `terraform fmt -check -recursive`, `terraform validate`, Terraform plan/apply, Ansible bootstrap, and targeted Flux bootstrap steps.
- If you change workflow behavior, keep the repo docs and the workflow commands in sync.

## Cursor / Copilot Rules

- No `.cursor/rules/`, `.cursorrules`, or `.github/copilot-instructions.md` files were present when this file was created.
- If those files are added later, mirror their guidance here and treat them as authoritative.
@@ -11,7 +11,7 @@ Production-ready Kubernetes cluster on Hetzner Cloud using Terraform and Ansible
| **Total Cost** | €28.93/mo |
| **K8s** | k3s (latest, HA) |
| **Addons** | Hetzner CCM + CSI + Prometheus + Grafana + Loki |
-| **Access** | SSH/API restricted to Tailnet |
+| **Access** | SSH/API and Rancher UI restricted to Tailnet |
| **Bootstrap** | Terraform + Ansible |

### Cluster Resources
@@ -239,6 +239,12 @@ Terraform/bootstrap secrets remain in Gitea Actions secrets and are not managed
- Ansible is limited to cluster bootstrap, private-access setup, and prerequisite secret creation for Flux-managed addons.
- `addon-flux-ui` is optional for the stable-baseline phase and is not a blocker for rebuild success.

### Rancher access

- Rancher is private-only and exposed through Tailscale at `https://rancher.silverside-gopher.ts.net/dashboard/`.
- The public Hetzner load balancer path is not used for Rancher.
- Rancher uses the CNPG-backed PostgreSQL cluster in `cnpg-cluster`.

### Stable baseline acceptance

A rebuild is considered successful only when all of the following pass without manual intervention:

@@ -9,6 +9,7 @@ This document defines the current engineering target for this repository.
- Hetzner Load Balancer for Kubernetes API
- private Hetzner network
- Tailscale operator access
- Rancher UI exposed only through Tailscale

## In Scope

@@ -48,7 +49,7 @@ This document defines the current engineering target for this repository.
9. **CSI deploys and creates `hcloud-volumes` StorageClass**.
10. **PVC provisioning tested and working**.
11. External Secrets sync required secrets.
-12. Tailscale private access works.
+12. Tailscale private access works, including Rancher UI access.
13. Terraform destroy succeeds cleanly or via workflow retry.

## Success Criteria