The Problem with Deployments
Deployments are designed for long-running services:
- Continuously running Pods
- Self-healing if containers crash
- Scaling based on demand
But what if you need to:
- Run a one-time database migration?
- Process a batch of files once and exit?
- Clean up old data daily?
For these, use Jobs and CronJobs.
Jobs
A Job creates Pods that run until completion, then stop.
Basic Job
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migration
spec:
  template:
    spec:
      containers:
      - name: migration
        image: myapp:1.0
        command: ["/app/migrate.sh"]
      restartPolicy: Never           # Don't restart on failure
  backoffLimit: 3                    # Try up to 3 times before giving up
  ttlSecondsAfterFinished: 3600      # Delete Job 1 hour after completion

Run a Job
kubectl apply -f job.yaml

# Check status
kubectl get jobs
kubectl describe job db-migration
kubectl logs job/db-migration

# Delete job
kubectl delete job db-migration

Job States
Job created
↓
Pod created → Running
↓
Container completes with exit code 0 → Job succeeds
↓
(After ttlSecondsAfterFinished) → Job deleted
Or:
Container exits with non-zero code
↓
If attempts < backoffLimit → Retry (a new Pod with restartPolicy: Never; an in-place container restart with OnFailure)
↓
If attempts >= backoffLimit → Job fails
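backoffLimit caps retries, but a container that hangs never exits and so never triggers a retry. To also bound a Job in time, activeDeadlineSeconds fails the Job once the deadline passes. A minimal sketch reusing the migration example above:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migration
spec:
  backoffLimit: 3
  activeDeadlineSeconds: 600    # Fail the whole Job if it runs longer than 10 minutes
  template:
    spec:
      containers:
      - name: migration
        image: myapp:1.0
        command: ["/app/migrate.sh"]
      restartPolicy: Never
```

When the deadline is hit, running Pods are terminated and the Job is marked failed regardless of how many retries remain.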
Parallel Jobs
Run multiple Pods in parallel:
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-process
spec:
  parallelism: 5      # Run 5 Pods simultaneously
  completions: 100    # Total of 100 successful Pods needed to complete
  backoffLimit: 3
  template:
    spec:
      containers:
      - name: processor
        image: batch-processor:1.0
      restartPolicy: Never

How it works:
- Start 5 Pods in parallel
- When one finishes, start another
- Keep going until 100 Pods have succeeded
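The Pods above are interchangeable; each successful exit counts toward completions. When each Pod should handle a distinct slice of the work, an Indexed Job (stable since Kubernetes 1.24) gives every Pod its own index through the JOB_COMPLETION_INDEX environment variable. A sketch along the lines of the example above:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-process-indexed
spec:
  completionMode: Indexed   # Pods receive indexes 0 .. completions-1
  parallelism: 5
  completions: 100
  template:
    spec:
      containers:
      - name: processor
        image: batch-processor:1.0
        # The container reads JOB_COMPLETION_INDEX to pick its slice of
        # the input, e.g. chunk number $JOB_COMPLETION_INDEX
      restartPolicy: Never
```

If a Pod with a given index fails, its replacement gets the same index, so each slice is eventually processed exactly once.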
Work Queue Pattern
For distributed work:
apiVersion: batch/v1
kind: Job
metadata:
  name: work-queue-job
spec:
  parallelism: 10
  completions: 10
  backoffLimit: 3
  template:
    spec:
      containers:
      - name: worker
        image: worker:latest
        env:
        - name: QUEUE_URL
          value: "rabbitmq-service:5672"
      restartPolicy: Never

Each Pod:
- Connects to a work queue (Redis, RabbitMQ, etc.)
- Pulls a task
- Processes it
- Exits
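The worker's entrypoint is just that pull-process-exit loop body. A runnable sketch, with a local file standing in for the queue so the snippet works outside a cluster (in the real Pod the pop would be a broker call against QUEUE_URL, e.g. via redis-cli or an AMQP client; the function and file names here are illustrative):

```shell
# process_one_task: claim the first task from $QUEUE_FILE and "process" it.
# $QUEUE_FILE simulates the shared queue; real workers talk to the broker.
process_one_task() {
  task=$(head -n 1 "$QUEUE_FILE" 2>/dev/null)
  if [ -z "$task" ]; then
    echo "queue empty, exiting"     # exit 0 => this Pod counts as a completion
    return 0
  fi
  sed -i '1d' "$QUEUE_FILE"         # remove the claimed task from the queue
  echo "processing $task"           # real work would happen here
  # A non-zero return here marks the Pod failed; the Job controller
  # then retries until backoffLimit is reached.
}
```

Exiting zero when the queue is empty lets the Job drain cleanly instead of failing its last few Pods.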
CronJobs
A CronJob runs a Job on a schedule (like Linux crontab).
Basic CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-backup
spec:
  schedule: "0 2 * * *"    # 2 AM every day (UTC by default)
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-tool:latest
            command: ["/app/backup.sh"]
            volumeMounts:
            - name: backup-volume
              mountPath: /backups
          volumes:
          - name: backup-volume
            persistentVolumeClaim:
              claimName: backup-storage
          restartPolicy: OnFailure
      backoffLimit: 1

Cron Schedule Format
┌─────────────────── minute (0 - 59)
│ ┌───────────────── hour (0 - 23)
│ │ ┌─────────────── day of month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌─────────── day of week (0 - 6) (Sunday to Saturday)
│ │ │ │ │
│ │ │ │ │
* * * * *
Common Schedules
schedule: "0 0 * * *" # Every day at midnight
schedule: "0 2 * * *" # Every day at 2 AM
schedule: "0 * * * *" # Every hour
schedule: "*/15 * * * *" # Every 15 minutes
schedule: "0 0 * * 0" # Every Sunday at midnight
schedule: "0 0 1 * *" # First day of every month
schedule: "0 0 * * 1-5" # Weekdays at midnight
schedule: "0 9,17 * * *" # At 9 AM and 5 PM

Timezone Support
apiVersion: batch/v1
kind: CronJob
metadata:
  name: localized-job
spec:
  schedule: "0 9 * * *"
  timeZone: "America/New_York"   # 9 AM Eastern, DST-aware (stable in Kubernetes 1.27+)
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: job
            image: job:latest
          restartPolicy: OnFailure

Managing CronJob History
spec:
  successfulJobsHistoryLimit: 3   # Keep the last 3 successful Jobs
  failedJobsHistoryLimit: 1       # Keep the last failed Job
  concurrencyPolicy: Forbid       # Skip the new run if the previous Job is still running
  # Or:
  # concurrencyPolicy: Replace    # Cancel the previous Job and start the new one

Job Patterns
Pattern 1: One-Time Task
apiVersion: batch/v1
kind: Job
metadata:
  name: one-time-task
spec:
  template:
    spec:
      containers:
      - name: task
        image: ubuntu
        command: ["echo", "Hello World"]
      restartPolicy: Never

Pattern 2: Database Migration
apiVersion: batch/v1
kind: Job
metadata:
  name: django-migrate
spec:
  template:
    spec:
      serviceAccountName: django-sa
      containers:
      - name: migrate
        image: myapp:2.0
        command: ["python", "manage.py", "migrate"]
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: url
      restartPolicy: Never
  backoffLimit: 3

Pattern 3: Daily Report Generation
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-report
spec:
  schedule: "0 6 * * *"   # 6 AM daily
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: report-generator
          containers:
          - name: report
            image: reporting:latest
            # Kubernetes doesn't template values like the scheduled time into
            # the Pod spec; compute the date inside the container instead,
            # e.g. REPORT_DATE=$(date +%F)
          restartPolicy: OnFailure

Pattern 4: Batch Data Processing
apiVersion: batch/v1
kind: Job
metadata:
  name: image-processor
spec:
  parallelism: 10
  completions: 100
  template:
    spec:
      containers:
      - name: processor
        image: image-processor:latest
        volumeMounts:
        - name: images
          mountPath: /input
        - name: output
          mountPath: /output
      volumes:
      - name: images
        persistentVolumeClaim:
          claimName: images-pvc
      - name: output
        persistentVolumeClaim:
          claimName: output-pvc
      restartPolicy: OnFailure

Monitoring Jobs
# List all jobs
kubectl get jobs

# View job details
kubectl describe job db-migration

# View job logs
kubectl logs job/db-migration
kubectl logs job/db-migration --all-containers=true

# Watch job progress
kubectl get jobs -w

# View CronJob history
kubectl get cronjobs
# Jobs created by a CronJob are named <cronjob-name>-<timestamp>
kubectl get jobs | grep daily-backup

Best Practices
✅ Set backoffLimit appropriately
- Prevent infinite retry loops
- Balance reliability with cost
✅ Use ttlSecondsAfterFinished
- Clean up completed jobs automatically
- Keeps cluster tidy
✅ Monitor job completion
kubectl get jobs -w
kubectl wait --for=condition=complete job/my-job

✅ Log job output
kubectl logs job/my-job > job-output.log

✅ Handle failures gracefully
Check exit codes:
# After running your task command:
if [ $? -ne 0 ]; then
  echo "Job failed"
  exit 1
fi

❌ Don't use "sleep" for scheduling
Use CronJobs instead of Jobs that sleep between runs.

❌ Don't create jobs directly from the command line
Use YAML manifests for reproducibility.