From zero to understanding Kubernetes architecture, pods, deployments, services, and more. Learn by exploring animated diagrams, real YAML files, and hands-on examples.
The control tower and the runways -- how a Kubernetes cluster is organized
Imagine an air traffic control tower at a busy airport. Planes (your applications) need runways, gates, and fuel. The tower doesn't fly the planes -- it directs them, makes sure no two crash, and reroutes when things go wrong.
Kubernetes is that control tower for your applications. You tell it what you want running, and it figures out where and how.
Node: A machine (physical or virtual) where Kubernetes runs your apps. Like a runway at the airport.
Cluster: A group of nodes working together. If one runway closes, the others keep operating.
Control plane: The control tower itself. It manages the cluster, watches for failures, and decides where apps run.
Desired state: You declare what you want (e.g., "run 3 copies of my app"). Kubernetes constantly works to make reality match your declaration. If a copy crashes, it creates a new one automatically.
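That "make reality match the declaration" behavior is a reconcile loop. Here's a toy sketch of the idea (illustrative only -- real Kubernetes controllers are far more involved):

```javascript
// Toy reconcile loop: compare declared state to reality and decide an action.
// Kubernetes controllers run logic like this continuously.
function reconcile(desiredReplicas, runningPods) {
  if (runningPods < desiredReplicas) {
    return { action: 'create', count: desiredReplicas - runningPods };
  }
  if (runningPods > desiredReplicas) {
    return { action: 'delete', count: runningPods - desiredReplicas };
  }
  return { action: 'none', count: 0 };
}

console.log(reconcile(3, 2)); // a pod crashed -> { action: 'create', count: 1 }
```

You never tell Kubernetes "start a pod"; you tell it "there should be 3," and the loop does the rest.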
A cluster has two kinds of nodes: control plane nodes, which run the management components, and worker nodes, which run your applications. To deploy a new pod, the components work together: the API server records your request, the scheduler picks a node, and that node's kubelet starts the pod.
The smallest building block -- your app's protective wrapper
When you ship a fragile vase, you don't toss it in a truck bare. You wrap it in bubble wrap, put it in a box, and label it. The vase is your container, and the box with the label is your Pod.
Kubernetes never works with containers directly. Every container runs inside a Pod. A Pod is a single instance of your application -- the smallest object you can create in Kubernetes.
Pods and containers have a 1:1 relationship for your main application. Need more capacity? Create more pods, not more containers inside the same pod.
When traffic grows, Kubernetes adds more pods -- never more containers inside one pod. When a node runs out of space, new pods go to other nodes.
Sometimes a main container needs a helper -- a sidecar. They share the same network (can talk via localhost), the same storage, and the same lifecycle -- if the pod dies, both die.
Main container: Your application -- the actual workload (e.g., a web server)
Sidecar container: A helper -- collects logs, handles proxying, or fetches config updates
With Docker alone, you'd manually manage networking between helper containers, shared storage, monitoring, and restarts. Kubernetes pods handle all of this automatically.
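A multi-container pod with a sidecar might look like this (a minimal sketch -- the names, images, and log path are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar
spec:
  containers:
  - name: web                # main container: the actual workload
    image: nginx
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx
  - name: log-collector      # sidecar: shares the pod's network and storage
    image: busybox
    command: ["sh", "-c", "tail -f /var/log/nginx/access.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx
  volumes:
  - name: logs               # emptyDir lives as long as the pod does
    emptyDir: {}
```

Both containers mount the same `emptyDir` volume, so the sidecar can read what the main container writes -- no manual wiring required.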
| Command | What it does |
|---|---|
| `kubectl run nginx --image=nginx` | Create a pod named "nginx" from the nginx image |
| `kubectl get pods` | List all pods with status and age |
| `kubectl get pods -o wide` | Same + IP address and node info |
| `kubectl describe pod <name>` | Full details -- events, IP, image, container ID |
| `kubectl edit pod <name>` | Edit a running pod's configuration |

The safety net that keeps the right number of pods alive
Imagine a security team at a building. The contract says "always 3 guards on duty." If one calls in sick, a replacement is automatically dispatched. If the crowd grows, more guards are added. That's what a ReplicaSet does for your pods.
Pod crashes? A new one is created automatically. Your app stays online.
Distribute traffic across multiple identical pods so no single pod gets overwhelmed.
Change one number (replicas: 5) and Kubernetes adds or removes pods to match.
Both do the same job, but ReplicationController is the older version. ReplicaSet is newer and more powerful.
| Feature | ReplicationController | ReplicaSet |
|---|---|---|
| API Version | v1 | apps/v1 |
| Selector Type | Equality only (=, !=) | Set-based (In, NotIn, Exists) |
| Selector Required? | No (auto-picks from template) | Yes (must specify matchLabels) |
| Status | Legacy -- avoid in new projects | Current -- always use this one |
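Set-based selectors let a ReplicaSet match on more than simple equality. A sketch (the label keys and values are illustrative):

```yaml
selector:
  matchExpressions:
  - key: type
    operator: In
    values: [front-end, back-end]   # matches pods with either label value
  - key: env
    operator: Exists                # matches any pod that has an 'env' label at all
```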
```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: myapp-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      type: front-end
  template:
    metadata:
      labels:
        type: front-end
    spec:
      containers:
      - name: nginx-container
        image: nginx
```
Use the "apps" API group (ReplicaSets live here)
We're creating a ReplicaSet
Name it "myapp-replicaset" so we can refer to it later
Now the rules:
Always keep exactly 3 pods running
Find pods by their labels:
Look for pods labeled "type: front-end"
Here's the blueprint for each pod:
Label each new pod as "type: front-end" (must match the selector above!)
Each pod runs one container:
Named "nginx-container", using the nginx image
In a cluster with hundreds of pods, labels and selectors are how a ReplicaSet identifies its pods. The selector must match the template's labels -- otherwise the ReplicaSet creates pods it can't find.
| Command | What it does |
|---|---|
| `kubectl create -f replicaset.yaml` | Create a ReplicaSet from a YAML file |
| `kubectl get rs` | List all ReplicaSets |
| `kubectl scale rs myapp --replicas=5` | Scale up to 5 pods |
| `kubectl describe rs <name>` | Detailed info about a ReplicaSet |
| `kubectl delete rs <name>` | Delete ReplicaSet and its pods |

Zero-downtime updates and health checks that keep your app bulletproof
Think of renovating a hotel room by room. Guests never notice because there's always a room available. A Deployment does the same with your app -- it updates pods gradually so users never experience downtime.
ReplicaSet: Manages pod count (always keep N running)
Deployment: Wraps the ReplicaSet and adds rolling updates, rollbacks, and health checks
Never manage ReplicaSets directly -- always use Deployments
| Strategy | Behavior | Downtime? |
|---|---|---|
| RollingUpdate (default) | Old pods replaced gradually with new ones | No |
| Recreate | All old pods killed first, then new ones created | Yes |
```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  minReadySeconds: 0
```
Keep 3 pods running
When updating, do it gradually:
maxSurge 25%: Allow up to 1 extra pod during the update (3 + 1 = 4 max)
maxUnavailable 25%: rounds down -- for 3 replicas that's 0, so no pod may be unavailable; the update advances through the surge pod
Don't wait after a new pod starts (0 seconds grace period)
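The percentages are converted to pod counts with different rounding rules: per the Kubernetes API reference, maxSurge rounds up and maxUnavailable rounds down. A sketch of the arithmetic (not real Kubernetes code):

```javascript
// Convert RollingUpdate percentage settings into absolute pod counts.
// Assumed rounding per the API docs: maxSurge UP, maxUnavailable DOWN.
function resolveRollingUpdate(replicas, surgePct, unavailablePct) {
  const maxSurge = Math.ceil(replicas * surgePct / 100);
  const maxUnavailable = Math.floor(replicas * unavailablePct / 100);
  return {
    maxSurge,
    maxUnavailable,
    maxPods: replicas + maxSurge,            // ceiling during the update
    minAvailable: replicas - maxUnavailable  // floor during the update
  };
}

console.log(resolveRollingUpdate(10, 25, 25));
// 10 replicas at 25%/25%: maxSurge = 3, maxUnavailable = 2,
// so between 8 and 13 pods exist while the update runs.
```

Note the asymmetry is deliberate: rounding maxUnavailable down is the conservative choice, so small deployments (like 3 replicas at 25%) never drop below their declared count.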
During a rolling update, Kubernetes replaces old pods with new ones one at a time, keeping the app available the whole way -- zero downtime.
Two health checks, two different questions. Think of a probe as a doctor's visit:
Liveness probe ("are you alive?") -- if it fails, Kubernetes restarts the container (inside the same pod). Like a defibrillator.
Readiness probe ("are you ready for traffic?") -- if it fails, the pod is removed from the Service (no traffic sent). The pod keeps running, the probe keeps checking. One success = traffic resumes.
An app can be alive but not ready. Example: your web server is running (liveness passes) but the database it needs is down (readiness fails). Kubernetes stops sending traffic but doesn't restart it -- because the container itself is fine.
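Probes are configured per container. A sketch with illustrative values (the paths and port are assumptions, not from this project):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 10   # let the app boot before the first check
  periodSeconds: 5          # check every 5 seconds
  timeoutSeconds: 2         # no answer within 2s = that check fails
  failureThreshold: 3       # 3 failures in a row = restart the container
readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  periodSeconds: 5
  successThreshold: 1       # one pass = traffic resumes
```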
| Setting | Meaning |
|---|---|
| `initialDelaySeconds` | Wait N seconds before the first check (give the app time to boot) |
| `periodSeconds` | Check every N seconds |
| `timeoutSeconds` | If no response in N seconds, that check fails |
| `successThreshold` | N consecutive passes = healthy |
| `failureThreshold` | N consecutive failures = action is taken |

When liveness fails repeatedly, Kubernetes restarts the container with increasing delay -- like a snooze alarm that waits longer each time. This state is called CrashLoopBackOff.
The backoff delay between restarts doubles each time -- 10s, 20s, 40s, 80s, 160s -- then caps at 5 minutes (the cap is hardcoded -- you can't change it). The counter resets once the container has run cleanly for 10 minutes.
Kubernetes retries forever. To fix it: find the root cause, fix the code or config, then delete the pod or rollback the deployment.
| Command | What it does |
|---|---|
| `kubectl rollout status deploy/<name>` | Watch update progress in real time |
| `kubectl rollout history deploy/<name>` | View revision history |
| `kubectl rollout undo deploy/<name>` | Rollback to previous version |
| `kubectl rollout undo deploy/<name> --to-revision=1` | Rollback to a specific revision |
| `kubectl rollout pause deploy/<name>` | Pause a rolling update mid-way |

The stable address book that routes traffic to your ever-changing pods
Imagine a phone directory for a company. Employees come and go, switch desks, get new extensions. But the directory number (e.g., "Sales: 555-0100") never changes. Callers always reach the right team. A Service is that directory for your pods.
Pods get new IP addresses every time they restart. If another app hardcodes that IP, it breaks. Services give pods a stable address that never changes.
Each type builds on the previous one, like nesting dolls.
ClusterIP: Gets a stable virtual IP inside the cluster. Other pods use this IP or DNS name to reach the service. Not accessible from outside the cluster.
```
Pod A -> ClusterIP (172.20.x.x) -> Pod B
DNS: service-name.namespace.svc.cluster.local
```
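That DNS name follows a fixed pattern, so any pod can derive it from the service and namespace names. A tiny sketch (the service/namespace values are illustrative):

```javascript
// Build the fully qualified in-cluster DNS name for a Service.
// Pattern: <service>.<namespace>.svc.cluster.local
function serviceDns(name, namespace) {
  return `${name}.${namespace}.svc.cluster.local`;
}

console.log(serviceDns('kubebot-internal', 'default'));
// → kubebot-internal.default.svc.cluster.local
```

Within the same namespace, the short form (just the service name) resolves too; the fully qualified name works from anywhere in the cluster.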
| Feature | ClusterIP | NodePort | LoadBalancer |
|---|---|---|---|
| Access | Internal only | External via node IP:port | External via single LB address |
| Port Range | Any | 30000-32767 | Any (LB handles it) |
| Load Balances Pods? | Yes (round-robin) | Yes | Yes |
| Load Balances Nodes? | N/A | No | Yes |
| Cloud Required? | No | No | Yes |
| Builds On | -- | ClusterIP | NodePort + ClusterIP |
| Use Case | Internal microservices | Dev/testing | Production external |
External traffic arrives at the LoadBalancer's single address, which forwards it to a node's NodePort; from there the ClusterIP layer routes it to a pod.
See every concept in action with a real project you can deploy
KubeBot is an interactive chatbot that teaches Kubernetes concepts. It's built with Node.js, runs in a Docker container, and deploys to Kubernetes. The project itself demonstrates the concepts it teaches.
```dockerfile
FROM node:18-alpine
WORKDIR /app
COPY src/app.js .
EXPOSE 3000
CMD ["node", "app.js"]
```
Start from a tiny Linux image with Node.js 18 pre-installed (~50MB)
Set /app as the working directory inside the container
Copy our app code from the host into the container
Tell Docker this container listens on port 3000
When the container starts, run "node app.js"
This is the actual deployment.yml from the project. Notice how it uses everything we learned: replicas, rolling updates, and probes.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubebot
spec:
  replicas: 3
  selector:
    matchLabels:
      app: kubebot
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
```
Use the apps API group
Create a Deployment (wraps a ReplicaSet)
Name it "kubebot"
The rules:
Always keep 3 copies running (high availability!)
Find our pods by the label "app: kubebot"
When updating, do it gradually:
Max 1 extra pod during updates (25% of 3 ~ 1)
maxUnavailable rounds down -- 25% of 3 resolves to 0, so no pod may be down during the update
The deployment YAML points probes at /health and /ready. Here's the actual code that handles those checks:
```javascript
// Health endpoint (liveness probe)
if (req.url === '/health') {
  res.writeHead(200, {'Content-Type': 'application/json'});
  return res.end(JSON.stringify({status: 'UP'}));
}

// Ready endpoint (readiness probe)
if (req.url === '/ready') {
  res.writeHead(200, {'Content-Type': 'application/json'});
  return res.end(JSON.stringify({status: 'READY'}));
}
```
When Kubernetes asks "are you alive?":
If someone visits /health...
Send back a 200 (success) response
With the message {"status": "UP"}
When Kubernetes asks "are you ready for traffic?":
If someone visits /ready...
Send back a 200 response
With {"status": "READY"}
The deployment.yml says httpGet: path: /health. Kubernetes periodically hits that URL. The app.js code above responds with status 200. Kubernetes sees 200 and says "all good." Any other status (or no response) = probe failure.
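The probe stanza itself isn't reproduced in this excerpt. Based on the paths above, the container section of deployment.yml plausibly looks something like this (a sketch -- the container name, image tag, and timing values are assumptions):

```yaml
containers:
- name: kubebot
  image: kubebot:v1
  ports:
  - containerPort: 3000
  livenessProbe:
    httpGet:
      path: /health    # matches the /health handler in app.js
      port: 3000
  readinessProbe:
    httpGet:
      path: /ready     # matches the /ready handler in app.js
      port: 3000
```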
The project has two services -- one for internal access, one for external.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: kubebot-internal
spec:
  type: ClusterIP
  selector:
    app: kubebot
  ports:
  - port: 80
    targetPort: 3000
```
Core Kubernetes API
Create a Service
Call it "kubebot-internal"
Internal-only (ClusterIP is the default)
Route traffic to pods labeled "app: kubebot"
Listen on port 80 (standard HTTP)
Forward to port 3000 on the pod (where app.js listens)
```yaml
apiVersion: v1
kind: Service
metadata:
  name: kubebot-nodeport
spec:
  type: NodePort
  selector:
    app: kubebot
  ports:
  - port: 80
    targetPort: 3000
    nodePort: 30080
```
Core Kubernetes API
Create a Service
Call it "kubebot-nodeport"
Expose externally via NodePort
Route to pods labeled "app: kubebot"
Internal port 80
Pod listens on 3000
External access on port 30080 (any node's IP:30080 works!)
1. Build the image: `docker build -t kubebot:v1 .`
2. Deploy everything: `kubectl apply -f k8-applications/`
3. Check the pods: `kubectl get pods` -- you should see 3 pods running
4. Forward a port: `kubectl port-forward svc/kubebot-internal 3000:80`, then open localhost:3000
5. Type "pod", "service", "deployment", or "help" in the chat
You now understand Kubernetes architecture, pods, ReplicaSets, deployments with rolling updates and probes, and all three service types. Clone the KubeBot project, deploy it, and keep experimenting.
Try scaling KubeBot to 5 replicas, triggering a rolling update with a new image tag, or deleting a pod and watching Kubernetes recreate it. The best way to learn is to break things on purpose.