fix: pre-pull Rancher images and reset Rancher release during bootstrap
Rancher installs were stalling on transient Docker Hub TLS handshake timeouts for rancher shell, webhook, and system-upgrade-controller images. Pre-pull the required images onto all nodes after k3s comes up, extend the Rancher HelmRelease timeout, and reset/force the Rancher HelmRelease before waiting on addon-rancher so bootstrap can recover from stale failed remediation state.
@@ -259,9 +259,16 @@ jobs:
           KUBECONFIG: outputs/kubeconfig
         run: |
           set -euo pipefail
+          TS=$(date --iso-8601=seconds)
+          kubectl -n flux-system annotate helmrelease/rancher \
+            reconcile.fluxcd.io/requestedAt="$TS" \
+            reconcile.fluxcd.io/resetAt="$TS" \
+            reconcile.fluxcd.io/forceAt="$TS" \
+            --overwrite || true
+
           echo "Waiting for Rancher..."
-          kubectl -n flux-system wait --for=condition=Ready kustomization/addon-rancher --timeout=600s
-          kubectl -n flux-system wait --for=condition=Ready helmrelease/rancher -n flux-system --timeout=300s
+          kubectl -n flux-system wait --for=condition=Ready helmrelease/rancher --timeout=900s
+          kubectl -n flux-system wait --for=condition=Ready kustomization/addon-rancher --timeout=900s
 
           echo "Waiting for rancher-backup operator..."
           kubectl -n flux-system wait --for=condition=Ready kustomization/addon-rancher-backup --timeout=600s || true
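The pre-pull step described in the commit message is not part of the hunk shown here. The following is a minimal sketch of how such a step might retry image pulls through transient registry timeouts, assuming it runs on each node and that `k3s crictl pull` is available there. The `retry` and `prepull_images` helpers, the `PULL_CMD`, `RETRY_MAX`, and `RETRY_DELAY` variables, and the image references in the usage comment are illustrative and not taken from the repository.

```shell
#!/bin/sh
set -eu

# Command used to pull one image into the node's container runtime.
# Assumption: on a k3s node, "k3s crictl pull IMAGE" does this.
PULL_CMD=${PULL_CMD:-k3s crictl pull}

# retry MAX_ATTEMPTS DELAY_SECONDS CMD [ARGS...]
# Runs CMD until it succeeds, giving up after MAX_ATTEMPTS failures.
retry() {
  max="$1"
  delay="$2"
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${delay}s: $*" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# prepull_images IMAGE [IMAGE...]
# Pulls each image in turn, retrying every pull independently.
prepull_images() {
  for image in "$@"; do
    # PULL_CMD is intentionally left unquoted so it splits into command + args.
    retry "${RETRY_MAX:-5}" "${RETRY_DELAY:-10}" $PULL_CMD "$image"
  done
}

# Hypothetical usage; the tags must match what the Rancher chart deploys:
#   prepull_images \
#     docker.io/rancher/shell:<tag> \
#     docker.io/rancher/rancher-webhook:<tag> \
#     docker.io/rancher/system-upgrade-controller:<tag>
```

Retrying each image separately means a single TLS handshake timeout costs only one repeated pull, not a rerun of the whole list.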