Jobs & CronJobs

Jobs run to completion and exit. Use a Job for one-shot batch processing (data transforms, migrations, report generation) and a CronJob for scheduled recurring work (nightly reports, periodic cleanup). Both shapes fit naturally into the GitOps workflow — commit the manifest, let ArgoCD sync it, and the cluster handles scheduling.

Before using this recipe, confirm:

  • Your tenant is provisioned and you have completed Your first deployment end-to-end.
  • Your workload repo is set up per Your repo, your workloads — ArgoCD is pointed at your repo and syncing.
  • You know your tenant name (the <your-tenant> prefix used in namespace names).
  • Your PVCs are provisioned in the target namespace. PVC creation is covered in Persistent volumes in practice.

Use a Job when your workload:

  • Runs to completion and exits (does not serve traffic).
  • Needs to process data from a PVC, generate output to a PVC, or perform a one-time operation.
  • Has a deadline and must be scheduled promptly (the Job below uses uber-user-preempt-medium, which is willing to preempt opportunistic peers).

Use a CronJob when your workload:

  • Runs on a recurring schedule (nightly, hourly, weekly).
  • Is retry-tolerant — if preempted tonight, it re-runs tomorrow.
  • Writes output to an in-cluster destination (another PVC, a Service endpoint in the same tenant).

If your workload runs continuously and serves traffic, see Long-running service. If you need a short-lived interactive environment, see Dev pod.

All manifests go in your workload repo under a directory that your ArgoCD Application’s spec.source.path points at (e.g. manifests/batch/). Replace every <your-tenant> placeholder with your real tenant name.

The minimal variant runs a single Python container that reads from an input PVC and writes results to an output PVC. It uses the uber-user-preempt-medium priority class, which suits deadline-bound batch work that is willing to preempt opportunistic peers.

manifests/batch/job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: process-2026-q2
  namespace: <your-tenant>-batch
spec:
  backoffLimit: 2
  template:
    metadata:
      labels:
        job: process-2026-q2
    spec:
      restartPolicy: OnFailure
      priorityClassName: uber-user-preempt-medium
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: worker
          image: python:3.12-slim
          command: ['python', '/app/process.py']
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: [ALL]
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 2
              memory: 4Gi
          volumeMounts:
            - name: input
              mountPath: /data/input
              readOnly: true
            - name: output
              mountPath: /data/output
            - name: tmp
              mountPath: /tmp
            - name: app
              mountPath: /app
              readOnly: true
      volumes:
        - name: input
          persistentVolumeClaim:
            claimName: input-data
        - name: output
          persistentVolumeClaim:
            claimName: output-data
        - name: tmp
          emptyDir: {}
        - name: app
          configMap:
            name: process-script
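
The manifest above runs /app/process.py but does not show it. As a rough illustration only, a script that fits the manifest's constraints might look like the sketch below. The function names summarize and main, the line-counting task, and the summary.json file name are all hypothetical; only the mount paths come from the manifest. Because the root filesystem is read-only, the sketch writes only under /data/output (with /tmp being the other writable path), and it publishes the result with an atomic rename so a retried run never leaves a half-written file behind.

```python
import json
import os
import sys
import tempfile
from pathlib import Path


def summarize(input_dir: Path, output_dir: Path) -> Path:
    """Count lines in every file under input_dir and write summary.json to output_dir."""
    counts = {}
    for path in sorted(input_dir.rglob("*")):
        if path.is_file():
            with path.open("rb") as fh:
                counts[path.relative_to(input_dir).as_posix()] = sum(1 for _ in fh)

    output_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file on the same volume, then rename it into place.
    # os.replace is atomic only within one filesystem, so the temp file lives
    # in output_dir rather than /tmp.
    fd, tmp_name = tempfile.mkstemp(dir=output_dir, suffix=".partial")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(counts, fh, indent=2, sort_keys=True)
        final = output_dir / "summary.json"
        os.replace(tmp_name, final)
    except Exception:
        Path(tmp_name).unlink(missing_ok=True)
        raise
    return final


def main() -> int:
    """Entry point for the container; paths match the volumeMounts in the manifest."""
    try:
        summarize(Path("/data/input"), Path("/data/output"))
    except Exception as exc:  # report the failure and signal it via the exit code
        print(f"processing failed: {exc}", file=sys.stderr)
        return 1
    return 0
```

Ending the real script with sys.exit(main()) makes a failure surface as a non-zero exit code, which Kubernetes counts against backoffLimit before marking the Job failed.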

This CronJob runs a Python script at 03:00 UTC daily that reads from a data PVC and writes a report to an output PVC in the same tenant namespace. The cluster runs on UTC — all schedule values are interpreted as UTC, not your local timezone.

This CronJob uses the uber-user-significant priority class, which schedules opportunistically. If the cluster is under pressure and a higher-priority pod preempts this CronJob’s pod, the work re-runs at the next scheduled time.

manifests/batch/cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
  namespace: <your-tenant>-batch
spec:
  schedule: "0 3 * * *"
  timeZone: "UTC"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      backoffLimit: 1
      activeDeadlineSeconds: 1800
      template:
        spec:
          restartPolicy: OnFailure
          priorityClassName: uber-user-significant
          securityContext:
            runAsNonRoot: true
            runAsUser: 1000
            seccompProfile:
              type: RuntimeDefault
          containers:
            - name: reporter
              image: python:3.12-slim
              command: ['python', '/app/report.py']
              securityContext:
                allowPrivilegeEscalation: false
                readOnlyRootFilesystem: true
                capabilities:
                  drop: [ALL]
              resources:
                requests:
                  cpu: 250m
                  memory: 512Mi
                limits:
                  cpu: 1
                  memory: 2Gi
              volumeMounts:
                - name: data
                  mountPath: /data/input
                  readOnly: true
                - name: reports
                  mountPath: /data/reports
                - name: tmp
                  mountPath: /tmp
                - name: app
                  mountPath: /app
                  readOnly: true
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: tenant-data
            - name: reports
              persistentVolumeClaim:
                claimName: report-output
            - name: tmp
              emptyDir: {}
            - name: app
              configMap:
                name: report-script
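
Because the schedule is read as UTC, a local wall-clock time has to be converted before it goes into the schedule field. The helper below is a small sketch of that arithmetic. The function name local_time_to_utc_cron is hypothetical, and it assumes a fixed UTC offset and a daily schedule, so it does not follow daylight-saving changes or adjust day-of-week fields.

```python
from datetime import datetime, timedelta, timezone


def local_time_to_utc_cron(hour: int, minute: int, utc_offset_hours: float) -> str:
    """Return a daily cron expression in UTC for a local time at a fixed UTC offset."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # The calendar date is arbitrary; only the resulting hour and minute are used.
    local = datetime(2000, 1, 1, hour, minute, tzinfo=local_tz)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


# 22:00 at UTC-5 is 03:00 UTC, the schedule used in the manifest above.
print(local_time_to_utc_cron(22, 0, -5))  # prints "0 3 * * *"
```

If the local zone observes daylight saving, the offset changes twice a year, so the run drifts by an hour in local terms unless the schedule is updated.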

Before committing, validate your manifests locally:

kubectl apply --dry-run=client -f manifests/batch/job.yaml
kubectl apply --dry-run=client -f manifests/batch/cronjob.yaml

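Both should render without errors. This catches YAML syntax issues and missing required fields before ArgoCD tries to apply them. A client-side dry run does not go through admission, so a manifest can pass here and still be rejected by cluster policy at sync time.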

After pushing to your repo and letting ArgoCD sync:

For the Job:

kubectl get jobs -n <your-tenant>-batch
kubectl get pods -n <your-tenant>-batch -l job=process-2026-q2
kubectl logs -n <your-tenant>-batch -l job=process-2026-q2

Expected:

  • The Job shows 1/1 under COMPLETIONS when finished.
  • The Pod is Completed (not Running or Error).
  • Logs show your processing output.
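
The completion check above can also be scripted, for example in a pipeline step that waits on the Job. The sketch below is one way to do it, not a platform-provided tool: the function names job_is_complete and fetch_job are hypothetical, and it relies only on the Complete condition that Kubernetes sets on a finished Job and on the JSON output of kubectl get.

```python
import json
import subprocess


def job_is_complete(job: dict) -> bool:
    """Return True if the Job object carries a Complete condition with status True."""
    conditions = job.get("status", {}).get("conditions") or []
    return any(
        c.get("type") == "Complete" and c.get("status") == "True"
        for c in conditions
    )


def fetch_job(name: str, namespace: str) -> dict:
    """Read one Job as JSON via kubectl; requires kubectl and cluster access."""
    result = subprocess.run(
        ["kubectl", "get", "job", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

With cluster access, job_is_complete(fetch_job("process-2026-q2", "<your-tenant>-batch")) returns True once the Job has finished successfully.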

For the CronJob:

kubectl get cronjobs -n <your-tenant>-batch
kubectl get jobs -n <your-tenant>-batch
kubectl logs -n <your-tenant>-batch -l job-name=nightly-report-<timestamp>

Expected:

  • The CronJob shows SCHEDULE as 0 3 * * * and ACTIVE as 0 (between runs) or 1 (during a run).
  • Child Jobs appear after the first scheduled trigger, showing 1/1 COMPLETIONS on success.
  • Logs show your report output.

If a sync or a run fails, check these common causes:

  • Pod rejected by Kyverno with PSS admission error. You modified the securityContext block. Revert to the verbatim YAML above. See Known limitations for the full PSS rule.
  • Namespace rejected with forceTenantPrefix. The namespace name does not start with your tenant name. All namespaces must be <your-tenant>-<suffix>. See Known limitations.
  • Job stuck Pending (ResourcePool exhaustion). Your tenant’s resource quota is shared across all your namespaces. Other workloads in <your-tenant>-prod or <your-tenant>-dev may be consuming the quota, leaving nothing for <your-tenant>-batch. Check allocation with kubectl describe resourcepool (the resource is cluster-scoped, so -n is ignored); for one namespace’s usage, use kubectl describe resourcequota -n <your-tenant>-batch. See Resource pools and quotas for details.
  • Job stuck Pending (no preemptible pods found). uber-user-preempt-medium can only preempt uber-user-significant pods. If no uber-user-significant pods are running cluster-wide, there is nothing to preempt and the Job waits for capacity to free up naturally. See Priority classes for the preemption matrix.
  • CronJob child Job fails with connection timed out to external endpoint. Egress is default-deny on Kestrel — reaching any external endpoint requires a tenant-scoped NetworkPolicy with an explicit egress rule for that destination. Confirm such a rule exists and covers the endpoint; without one, outbound traffic to anything beyond DNS and intra-tenant services is blocked. See NetworkPolicy in practice.
  • CronJob never triggers. Check that the schedule is valid cron syntax and remember it is UTC. kubectl describe cronjob -n <your-tenant>-batch nightly-report shows Last Schedule Time and any scheduling errors.
  • ArgoCD shows OutOfSync but no errors. ArgoCD polls on a 3-minute interval. Click Sync in the ArgoCD UI for an immediate reconcile.
  • Need to make a temporary change outside GitOps? Use the escape hatch workflow — but remember that ArgoCD reverts any mutation on the next sync.