fix: add local registry cache for rebuilds
Deploy Cluster / Terraform (push) Successful in 4m7s
Deploy Cluster / Ansible (push) Failing after 16m31s

2026-05-03 00:02:33 +00:00
parent 8375333ac5
commit 1896108cbb
9 changed files with 334 additions and 11 deletions
@@ -273,6 +273,42 @@ kubectl -n observability describe svc prometheus-tailscale | grep TailscaleProxy
If local `kubectl` falls back to `localhost:8080`, refresh `outputs/kubeconfig` with `scripts/refresh-kubeconfig.sh 10.27.27.30`.
## Network Stabilization Probes
When registry pulls or Doppler calls fail intermittently, run the same probe from the Proxmox host, `cp1`, and one worker:
```bash
scripts/network-stabilization-probe.sh
```
To run it on cluster nodes through the generated Ansible inventory:
```bash
cd ansible
ansible -i inventory.ini 'control_plane[0]' -m script -a '../scripts/network-stabilization-probe.sh'
ansible -i inventory.ini 'workers[0]' -m script -a '../scripts/network-stabilization-probe.sh'
```
Use `NETWORK_PROBE_REPEAT_COUNT`, `NETWORK_PROBE_CURL_TIMEOUT`, and `NETWORK_PROBE_PULL_TIMEOUT` to tune probe duration.
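As a rough illustration of how these variables might be passed, the sketch below wraps the probe in a small helper. The helper name `probe_with_overrides`, the `PROBE_SCRIPT` override, and all numeric values are hypothetical; the script's real defaults and units (presumably seconds for the timeouts) are not documented here.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: only the three NETWORK_PROBE_* names come from the
# text above. The fallback values are illustrative, not the script's defaults.
probe_with_overrides() {
  # Positional arguments: repeat count, curl timeout, pull timeout.
  NETWORK_PROBE_REPEAT_COUNT="${1:-3}" \
  NETWORK_PROBE_CURL_TIMEOUT="${2:-5}" \
  NETWORK_PROBE_PULL_TIMEOUT="${3:-60}" \
  "${PROBE_SCRIPT:-scripts/network-stabilization-probe.sh}"
}
```

Calling `probe_with_overrides 10 3 30` from the repository root would then run a longer probe with tighter timeouts, assuming the script reads these variables from its environment.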
## Registry Cache
Ansible configures the K3s nodes to use the Proxmox host as a local pull-through cache for common upstream registries. The cache listens on `10.27.27.239`, with one port per upstream registry:
```text
docker.io -> http://10.27.27.239:5000
ghcr.io -> http://10.27.27.239:5001
quay.io -> http://10.27.27.239:5002
registry.k8s.io -> http://10.27.27.239:5003
oci.external-secrets.io -> http://10.27.27.239:5004
```
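For orientation, K3s reads registry mirrors from `/etc/rancher/k3s/registries.yaml`. The sketch below prints a `mirrors` section matching the mappings above. What the Ansible role actually renders is not shown in this document, so the layout, the `render_mirrors` helper, and the `CACHE_HOST` variable are assumptions for illustration only.

```bash
#!/usr/bin/env bash
# Sketch only: prints a K3s-style "mirrors" section for the mappings listed
# above. The file Ansible really writes may differ.
CACHE_HOST="${CACHE_HOST:-10.27.27.239}"

render_mirrors() {
  echo "mirrors:"
  # Each input line is "<upstream registry> <cache port>".
  while read -r registry port; do
    printf '  "%s":\n    endpoint:\n      - "http://%s:%s"\n' \
      "$registry" "$CACHE_HOST" "$port"
  done <<'EOF'
docker.io 5000
ghcr.io 5001
quay.io 5002
registry.k8s.io 5003
oci.external-secrets.io 5004
EOF
}

render_mirrors
```

Comparing this output with the file on a node is one way to spot a missing or mistyped mirror. K3s only picks up changes to `registries.yaml` after the service restarts.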
Bootstrap or repair the cache on Proxmox with:
```bash
ssh -i ~/.ssh/infra root@10.27.27.239 'bash -s' < scripts/setup-proxmox-registry-cache.sh
```
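After a bootstrap or repair, a quick reachability check could look like the sketch below. It assumes each cache port serves the standard registry HTTP API, where `GET /v2/` returns success on an unauthenticated registry. The function names and the 5-second timeout are hypothetical and not part of the repository.

```bash
#!/usr/bin/env bash
# Sketch only: assumes every cache port answers GET /v2/ without auth.
CACHE_HOST="${CACHE_HOST:-10.27.27.239}"

# Print the /v2/ base URL for each cache port listed in this section.
cache_urls() {
  for port in 5000 5001 5002 5003 5004; do
    echo "http://${CACHE_HOST}:${port}/v2/"
  done
}

# Probe each URL and report whether curl received a successful response.
check_cache() {
  cache_urls | while read -r url; do
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "ok   $url"
    else
      echo "FAIL $url"
    fi
  done
}
```

Running `check_cache` from a cluster node prints one line per port. A `FAIL` line suggests rerunning the setup script or checking the corresponding cache process on the Proxmox host.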
## Security Notes
- Never commit `terraform.tfvars`, kubeconfigs, private keys, `outputs/`, or real secret values.