Triage

When something breaks on Kestrel, start here. This page has four branches, one per common stuck state, and each branch points to the canonical page for fixing it. If you already know the specific error message, go straight to the troubleshooting index, which is a flat, searchable list. Otherwise, pick the branch that matches your symptom.

kubelogin login fails

Symptom: kubectl get ns prints error: You must be logged in to the server (Unauthorized), or your browser does not open, or the browser opens but shows a Keycloak error page.

First place to look: the troubleshooting section of Install kubelogin. It covers the three most common failures: a port conflict on 8000, a stale OIDC state cache, and a missing GUI browser on WSL2 or in SSH sessions.
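To rule out the port conflict quickly, a small helper along these lines may be useful. It is only a sketch: it assumes a Linux host where ss (from iproute2) is available, and the function name port_is_listening is hypothetical. It reads ss -ltn output on standard input and succeeds if any socket is listening on the given port.

```shell
# Hypothetical helper: reads `ss -ltn` output on stdin and succeeds
# if some socket is already listening on the given TCP port.
port_is_listening() {
  port="$1"
  # Column 4 of `ss -ltn` is "Local Address:Port"; skip the header line
  # and match only an exact ":<port>" suffix.
  awk -v p=":$port" 'NR > 1 && $4 ~ (p "$") { found = 1 } END { exit !found }'
}
```

Usage: ss -ltn | port_is_listening 8000 && echo "port 8000 is already in use". On macOS or other hosts without ss, a different listing tool would be needed.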

If that section does not resolve it, your tenant group claim may not match the Capsule Tenant owner. Check whether you can see your tenant’s namespaces with kubectl get ns | grep <your-tenant>. If the output is empty, see Identity chain, and open a ticket if the mismatch is real.
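The namespace check can be made a little stricter than a plain grep. The sketch below assumes tenant namespaces follow the <your-tenant>-<namespace> pattern used elsewhere on this page; the function name tenant_namespaces is hypothetical.

```shell
# Hypothetical helper: lists only namespaces whose name starts with
# "<tenant>-", so unrelated namespaces that merely contain the tenant
# name are not counted.
tenant_namespaces() {
  tenant="$1"
  # `kubectl get ns -o name` prints one "namespace/<name>" per line.
  kubectl get ns -o name | sed -n "s|^namespace/\(${tenant}-.*\)$|\1|p"
}
```

Usage: tenant_namespaces <your-tenant>. Empty output while kubectl get ns itself succeeds suggests the group-claim mismatch described above.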

ArgoCD sync fails

Symptom: you pushed to your repo, the ArgoCD UI shows the new Application, but it is OutOfSync, Degraded, or Unknown. The sync message names a specific error.

First place to look: the ArgoCD UI sync diff — click into your Application, then the failing resource, then the sync message. The most common failures are a YAML validation error, a Kyverno Pod Security rejection (missing runAsNonRoot, wrong seccompProfile, missing capabilities.drop: [ALL]), a Capsule namespace-prefix rejection, or a priority class not in the allowlist.
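For the Pod Security rejection, the fields named above belong in each container's securityContext. The sketch below prints a fragment that sets them; it is an illustration, not Kestrel's official template. The function name is hypothetical, and allowPrivilegeEscalation: false is an assumption taken from the upstream Kubernetes restricted profile rather than from this page.

```shell
# Hypothetical helper: prints a container-level securityContext fragment
# covering the fields most often missing in a Pod Security rejection.
print_restricted_security_context() {
  cat <<'EOF'
securityContext:
  runAsNonRoot: true
  # Assumption: required by the upstream "restricted" profile, not listed on this page.
  allowPrivilegeEscalation: false
  seccompProfile:
    type: RuntimeDefault
  capabilities:
    drop: ["ALL"]
EOF
}
```

The printed fragment goes under each entry of spec.template.spec.containers in the Deployment. Compare it with the failing resource in the sync diff to see which field the policy is complaining about.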

Cross-reference: see Known limitations for PSS, namespace prefix enforcement, and the priority class allowlist. The “If something goes wrong” section of the first-deployment walkthrough has a shorter list specific to that walkthrough.

Ingress 404 or TLS error

Symptom: your Pod is Running, your Service is ClusterIP, your Ingress has a hostname, but visiting the hostname in a browser returns 404, “connection refused”, or a TLS warning.

First place to look: kubectl describe ingress -n <your-tenant>-<namespace> to confirm that the ingressClassName: traefik field and the tls.secretName are set. Then run kubectl get certificate -n <your-tenant>-<namespace> to check whether cert-manager has issued the certificate (the first issuance takes about 30 seconds). A 404 with a valid certificate usually means the Service selector does not match the Pod labels: double-check spec.selector on the Service and spec.template.metadata.labels on the Deployment.
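The two checks above can be scripted. This is a minimal sketch with hypothetical function names. It assumes the Certificate exposes the standard cert-manager Ready condition, and that a selector mismatch shows up as an Endpoints object with no addresses.

```shell
# Hypothetical helper: succeeds when cert-manager reports the Certificate as Ready.
certificate_ready() {
  ns="$1"; name="$2"
  status="$(kubectl get certificate "$name" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')"
  [ "$status" = "True" ]
}

# Hypothetical helper: succeeds when the Service has at least one ready
# endpoint address. No addresses usually means spec.selector matches no
# ready Pod.
service_has_endpoints() {
  ns="$1"; svc="$2"
  ips="$(kubectl get endpoints "$svc" -n "$ns" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')"
  [ -n "$ips" ]
}
```

Usage: service_has_endpoints <your-tenant>-<namespace> <service> || echo "no ready endpoints: compare the Service selector with the Pod labels".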

Deeper context: Ingress on Kestrel has the full Traefik + cert-manager recipe. See also Network model for why LoadBalancer does not work and Known limitations for the rationale.

Storage or quota surprise

Symptom: a PVC is stuck in Pending, a Pod fails to schedule with Insufficient cpu or Insufficient memory, or an ArgoCD sync fails with a quota-exceeded error.

First place to look: kubectl describe resourcepool (the resource is cluster-scoped, so -n is ignored) to compare your tenant’s current quota usage with its allocation. The Resource pools and quotas page has the tier numbers.

If the problem is a PVC rather than CPU or memory, run kubectl describe pvc -n <your-tenant>-<namespace>. The default storage class is csi-cinder-sc-delete (ReadWriteOnce only). ReadWriteMany (shared) volumes are not offered through a storage class; open a ticket with RCS to have a per-tenant NFS volume provisioned. A PVC naming any other storage class is rejected at admission and stays Pending. Deeper storage context is in Storage classes. See Known limitations for why a runaway workload in one namespace can consume the whole tenant’s quota.
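To spot problem PVCs at a glance, something like the following filter may help. It is a sketch with a hypothetical function name. It reads lines of "name phase storage-class" on standard input and reports every PVC that is not Bound, flagging any storage class other than csi-cinder-sc-delete.

```shell
# Storage class named on this page as the default.
EXPECTED_CLASS="csi-cinder-sc-delete"

# Hypothetical helper: reads "NAME PHASE CLASS" lines on stdin and reports
# each PVC that is not Bound, noting an unexpected storage class.
report_unbound_pvcs() {
  awk -v ok="$EXPECTED_CLASS" '
    $2 != "Bound" {
      note = ($3 == ok || $3 == "<none>") ? "" : " (unexpected storage class)"
      print $1 " is " $2 ", class " $3 note
    }'
}
```

Usage: kubectl get pvc -n <your-tenant>-<namespace> --no-headers -o custom-columns=NAME:.metadata.name,PHASE:.status.phase,CLASS:.spec.storageClassName | report_unbound_pvcs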

None of the above helped? Open a ticket with RCS. Include: the command you ran, the exact error message, the tenant name, the namespace, and the approximate time the failure started. See Support for the canonical “how to open a ticket” page.
