GuideDevOps
Lesson 19 of 28

Load Balancers: Types & Algorithms

Part of the Networking Basics tutorial series.

Load balancers distribute incoming traffic across multiple servers, enabling horizontal scaling, high availability, and performance optimization. They're essential for any production infrastructure.

What is a Load Balancer?

Load Balancer = Device/service that distributes traffic:

Users (requests)
    ↓
[Load Balancer - decides where to send each request]
    ┌───────┬───────┬───────┐
    ↓       ↓       ↓       ↓
 Server1 Server2 Server3 Server4
    ↓       ↓       ↓       ↓
    └───────┴───────┴───────┘
              ↓
    [Responses returned]
              ↓
           Users

Benefits:

  • Scalability — add servers to handle more traffic (horizontal scaling)
  • High Availability — if one server fails, traffic is routed to the others
  • Performance — spread load so no single server is overwhelmed
  • Session Persistence — keep a user on the same server when needed

Load Balancer Levels

Layer 4 (Transport) - L4LB:

  • Based on: TCP/UDP, source/dest IP, source/dest port
  • Speed: Very fast (hardware level)
  • Intelligence: Low
  • Example: nginx (stream module), HAProxy (TCP mode), AWS Network Load Balancer
  • Use when: Simple TCP/UDP load balancing

Layer 7 (Application) - L7LB:

  • Based on: HTTP headers, URL paths, hostnames, cookies
  • Speed: Slower (inspects full requests)
  • Intelligence: High
  • Example: nginx, HAProxy (HTTP mode), AWS Application Load Balancer
  • Use when: Content-based routing

Load Balancing Algorithms

Algorithm           | Behavior                   | When to Use
--------------------|----------------------------|---------------------------
Round-Robin         | Each server in turn        | Equal-capacity servers
Weighted            | Weight assigned per server | Different server sizes
Least Connections   | Fewest active connections  | Variable request duration
IP Hash             | Same IP → same server      | Session persistence
Random              | Random selection           | Simple, distributed
Least Response Time | Fastest responding server  | Performance-critical

Round-Robin Example:

Request 1 → Server 1
Request 2 → Server 2
Request 3 → Server 3
Request 4 → Server 1  (cycle repeats)
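The rotation above can be sketched in a few lines of Python (server names are illustrative):

```python
from itertools import cycle

# A minimal round-robin selector: each call returns the next server
# in strict rotation, wrapping around at the end of the list.
servers = ["server1", "server2", "server3"]
rr = cycle(servers)

def next_server():
    """Return the next server in the rotation."""
    return next(rr)

picks = [next_server() for _ in range(4)]
print(picks)  # ['server1', 'server2', 'server3', 'server1']
```

Real load balancers do the same thing, just with an atomic counter shared across worker processes.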

Weighted Example:

Server 1 (weight 2): 50% of traffic
Server 2 (weight 1): 25% of traffic
Server 3 (weight 1): 25% of traffic

Request 1,2 → Server 1
Request 3   → Server 2
Request 4   → Server 3
Request 5,6 → Server 1
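One simple way to implement the weighting above is to expand each server into `weight` slots of a repeating schedule (a sketch; production implementations like nginx use smooth weighted round-robin, which interleaves servers rather than grouping them):

```python
# Weighted round-robin by schedule expansion: a server with weight 2
# occupies 2 of every 4 slots, i.e. 50% of traffic.
weights = {"server1": 2, "server2": 1, "server3": 1}

def build_schedule(weights):
    """Expand weights into one repeating rotation."""
    schedule = []
    for server, weight in weights.items():
        schedule.extend([server] * weight)
    return schedule

schedule = build_schedule(weights)
print(schedule)  # ['server1', 'server1', 'server2', 'server3']
```

Over each full rotation the traffic split matches the weights exactly: 2/4, 1/4, 1/4.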

Load Balancer Configuration

nginx - Layer 7 Load Balancer:

upstream backend {
    # Round-robin by default
    server 192.168.1.101:8080;
    server 192.168.1.102:8080;
    server 192.168.1.103:8080;
    
    # Alternative: weighted
    # server 192.168.1.101:8080 weight=2;
    # server 192.168.1.102:8080 weight=1;
    
    # Alternative: least connections
    # least_conn;
}
 
server {
    listen 80;
    server_name api.example.com;
    
    client_max_body_size 100M;
    
    location / {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # Connection optimization
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
    
    # Health check endpoint
    location /health {
        access_log off;
        return 200 "healthy\n";
    }
}

HAProxy - L4/L7 Load Balancer:

frontend web_frontend
    bind *:80
    mode http
    default_backend web_backend

backend web_backend
    mode http
    balance roundrobin  # or: leastconn, random, source
    
    # Define backend servers
    server web1 192.168.1.101:8080 check
    server web2 192.168.1.102:8080 check
    server web3 192.168.1.103:8080 check
    
    # Health checks
    option httpchk GET /health
    http-check expect status 200

Health Checks

How the load balancer reacts as servers fail and recover:

  • Healthy server: LB checks → HTTP 200 OK ✓ → Include in pool
  • Failed server: LB checks → HTTP 500 Error ✗ → Remove from pool
  • After recovery: LB checks → HTTP 200 OK ✓ → Add back to pool

Types of Health Checks:

HTTP GET /health
  ├─ Status 200 = healthy
  └─ Status != 200 = unhealthy
 
TCP Connection
  ├─ Can connect = healthy
  └─ Connection refused = unhealthy
 
Custom Script
  ├─ Exit code 0 = healthy
  └─ Exit code != 0 = unhealthy
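The include/remove/re-add cycle can be sketched as a small pool manager. To keep it self-contained, check results are passed in directly instead of making real HTTP calls; server names are illustrative:

```python
# Sketch of a health-checked server pool: a 200 result (re-)adds a
# server to the healthy set, anything else removes it.
class HealthCheckedPool:
    def __init__(self, servers):
        self.all_servers = list(servers)
        self.healthy = set(servers)

    def report(self, server, status):
        """Record one health-check result (an HTTP status code)."""
        if status == 200:
            self.healthy.add(server)      # recovered server rejoins the pool
        else:
            self.healthy.discard(server)  # failed server is removed

pool = HealthCheckedPool(["web1", "web2"])
pool.report("web2", 500)          # web2 fails its check -> removed
print(sorted(pool.healthy))       # ['web1']
pool.report("web2", 200)          # web2 recovers -> added back
print(sorted(pool.healthy))       # ['web1', 'web2']
```

A real checker would run these probes on a timer and usually require several consecutive successes before re-adding a server.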

nginx Health Check:

upstream backend {
    server 192.168.1.101:8080 max_fails=3 fail_timeout=30s;
    server 192.168.1.102:8080 max_fails=3 fail_timeout=30s;
}

Means: if a server fails 3 times within a 30-second window, it is marked unavailable for the next 30 seconds, then retried.
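A rough model of that passive max_fails/fail_timeout behaviour (simplified; nginx's actual bookkeeping differs in detail):

```python
import time

# Passive failure tracking: too many failures inside the window marks
# the server down for fail_timeout seconds, after which it is retried.
class PassiveCheck:
    def __init__(self, max_fails=3, fail_timeout=30.0):
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.fail_times = []
        self.down_until = 0.0

    def record_failure(self, now=None):
        now = time.monotonic() if now is None else now
        # Keep only failures inside the current window.
        self.fail_times = [t for t in self.fail_times
                           if now - t < self.fail_timeout]
        self.fail_times.append(now)
        if len(self.fail_times) >= self.max_fails:
            # Threshold reached: mark down for fail_timeout seconds.
            self.down_until = now + self.fail_timeout
            self.fail_times.clear()

    def is_up(self, now=None):
        now = time.monotonic() if now is None else now
        return now >= self.down_until

check = PassiveCheck()
for t in (0, 1, 2):          # three failures within the 30s window
    check.record_failure(now=t)
print(check.is_up(now=3))    # False: marked down until t=32
print(check.is_up(now=33))   # True: timeout expired, server retried
```

Passing `now` explicitly just makes the sketch deterministic; a real implementation would use the clock directly.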

Session Persistence

Session Persistence Problem:

Server 1: "Your cart: [item A]"
Server 2: "Your cart: [empty]"  ← Lost session!

Solution: Sticky Sessions (IP Hash):

upstream backend {
    hash $remote_addr consistent;
    server 192.168.1.101:8080;
    server 192.168.1.102:8080;
    server 192.168.1.103:8080;
}

Now: Same client IP → Always same server
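The core idea is just a deterministic hash of the client address (a sketch; nginx's `consistent` flag additionally uses ketama consistent hashing so that adding or removing a server remaps only a fraction of clients):

```python
import hashlib

# IP-hash selection: hashing the client IP makes the choice
# deterministic, so the same client always lands on the same server.
servers = ["192.168.1.101", "192.168.1.102", "192.168.1.103"]

def pick(client_ip):
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# Repeated requests from one IP always hit the same backend.
print(pick("203.0.113.7") == pick("203.0.113.7"))  # True
```

The downside is visible in the modulo: if the server list changes, most clients are remapped, which is exactly what consistent hashing avoids.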

Alternative: Shared Session Store:

Server 1 → Write session to Redis
Server 2 → Read session from Redis

Any server can handle user requests
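Sketch of the shared-store idea, with a plain dict standing in for Redis (session IDs and handler names are illustrative):

```python
# Shared session store: state lives outside the web servers, so a
# session written by one server is visible to every other server.
session_store = {}  # in production: Redis or Memcached, not a dict

def server1_add_to_cart(session_id, item):
    """Handler on server 1: write to the shared store."""
    cart = session_store.setdefault(session_id, [])
    cart.append(item)

def server2_view_cart(session_id):
    """Handler on server 2: read the same shared store."""
    return session_store.get(session_id, [])

server1_add_to_cart("sess-42", "item A")
print(server2_view_cart("sess-42"))  # ['item A']
```

Because no server holds session state locally, the load balancer is free to use any algorithm, and a server restart loses nothing.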

SSL/TLS Termination

Client-LB-Server Flow:

Client: HTTPS request (encrypted)
    ↓
[Load Balancer] (SSL termination point)
  ├─ Decrypt using certificate
  ├─ Route to backend
    ↓
Server: Receive unencrypted HTTP

nginx SSL Termination:

server {
    listen 443 ssl http2;
    ssl_certificate /etc/ssl/certs/server.crt;
    ssl_certificate_key /etc/ssl/private/server.key;
    
    location / {
        # Connection to backend is HTTP (unencrypted)
        proxy_pass http://backend;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

Benefit: heavy encryption work is done once at the LB, so backends can focus on application logic.

Cloud Load Balancers

AWS Elastic Load Balancer (ELB):

# Create a classic load balancer
# (requires at least one availability zone or subnet; AZ is illustrative)
aws elb create-load-balancer \
  --load-balancer-name my-lb \
  --listeners Protocol=HTTP,LoadBalancerPort=80,InstancePort=8080 \
  --availability-zones us-east-1a
 
# Register instances
aws elb register-instances-with-load-balancer \
  --load-balancer-name my-lb \
  --instances i-1234567890abcdef0 i-0987654321fedcba0
 
# Configure health check
aws elb configure-health-check \
  --load-balancer-name my-lb \
  --health-check Target=HTTP:8080/health,Interval=30,Timeout=5,HealthyThreshold=2,UnhealthyThreshold=2

GCP Load Balancer:

# Create a managed instance group
# (web-template is an instance template assumed to exist already)
gcloud compute instance-groups managed create web-servers \
  --zone us-central1-a \
  --size 3 \
  --template web-template
 
# Create health check
gcloud compute health-checks create http web-health \
  --port 8080 \
  --request-path /health
 
# Create the forwarding rule
# (assumes web-proxy, a target HTTP proxy backed by a URL map and a
# backend service using the health check above, was created beforehand)
gcloud compute forwarding-rules create web-lb \
  --load-balancing-scheme EXTERNAL \
  --target-http-proxy web-proxy

Kubernetes Service LoadBalancer:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 8080

Kubernetes automatically:

  • Creates load balancer
  • Registers pod IPs
  • Manages traffic distribution
  • Performs health checks

Global Load Balancing

Global Load Balancing Purpose:

US User → Route to US datacenter
EU User → Route to EU datacenter
Asia User → Route to Asia datacenter

Methods:

  1. GeoDNS — DNS answers with a different IP depending on the client's location
     example.com → 203.0.113.1  (answer given to US clients)
     example.com → 198.51.100.1 (answer given to EU clients)
  2. Anycast — multiple datacenters advertise the same IP;
     BGP routes each user to the nearest datacenter

Troubleshooting Load Balancers

"Traffic not distributed evenly"

Check the algorithm:
  nginx: is hash $remote_addr too sticky?
         Try least_conn instead

Check server weights:
  AWS ELB: are all instances weighted equally?
           Verify instance health

Check health checks:
  A failing health check silently removes that server from the pool

"Sessions lost after LB restart"

Using sticky sessions?
→ IP → Server1 mapping lost
→ Same user → Server2
→ Session lost

Solution:
Move to shared session store (Redis)

Key Concepts

  • Load Balancer = Distributes traffic across servers
  • L4 LB = TCP/UDP based, fast
  • L7 LB = HTTP based, smart routing
  • Round-Robin = Equal distribution
  • Weighted = Proportional to server capacity
  • Health Check = Detect failed servers
  • Session Persistence = Keep user on same server
  • SSL Termination = Decrypt at LB, not backend
  • Affinity = Route same client to same server
  • Use load balancers for high availability