Glossary

Load Balancer

A load balancer distributes incoming network traffic across multiple servers, preventing any single server from being overwhelmed, enabling horizontal scaling, and providing automatic failover when a server goes down.

Explanation

A load balancer sits between clients and a server fleet. When a request arrives, the load balancer picks a backend server using a routing algorithm and forwards the request. From the client's perspective there is one server (the load balancer's address); from each server's perspective, requests come from the load balancer.

Common load balancing algorithms:

- Round robin: rotate through servers in order.
- Least connections: route to the server with the fewest active connections.
- IP hash: route the same client IP to the same server (sticky behavior).
- Weighted: some servers handle a larger share of the traffic.

The right algorithm depends on the workload: stateless APIs work well with round robin, while compute-heavy workloads benefit from least connections.

Layer 4 vs. Layer 7 load balancing: Layer 4 (transport layer) routes based on IP and TCP/UDP port. It is fast and simple, but it can't inspect request content. Layer 7 (application layer) inspects HTTP content and can route based on URL path, Host header, or cookie values. A Layer 7 load balancer can route /api/* to API servers and /* to web servers. AWS ALB is Layer 7; AWS NLB is Layer 4.

Sticky sessions (session affinity) route all requests from the same client to the same server, which is required for stateful apps that keep session data in server memory. But sticky sessions reduce how evenly load is spread. The better solution is to externalize session state to Redis, making the app stateless and enabling true round-robin distribution.
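The three most common algorithms can be sketched in a few lines of Python. This is an illustrative toy, not a production balancer; the server names and the connection-count bookkeeping are assumptions for the example.

```python
import hashlib
from itertools import cycle

SERVERS = ["api-1", "api-2", "api-3"]  # hypothetical backend names

# Round robin: rotate through servers in order.
_rotation = cycle(SERVERS)

def round_robin():
    return next(_rotation)

# Least connections: pick the server with the fewest active connections.
# (A real balancer tracks these counts as connections open and close.)
active_connections = {server: 0 for server in SERVERS}

def least_connections():
    return min(active_connections, key=active_connections.get)

# IP hash: hash the client IP so the same client always lands on the
# same server (sticky behavior) without the balancer storing any state.
def ip_hash(client_ip):
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]
```

Note that IP hash gets stickiness "for free" because the mapping is deterministic, but adding or removing a server reshuffles most clients, which is why real systems often use consistent hashing instead.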

Code Example

nginx
# nginx Layer 7 load balancer configuration
upstream api_servers {
    server api-1.internal:3000 max_fails=3 fail_timeout=30s;
    server api-2.internal:3000 max_fails=3 fail_timeout=30s;
    server api-3.internal:3000 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    location /api/ {
        proxy_pass http://api_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

# AWS ALB via Terraform
# resource "aws_lb" "api" {
#   load_balancer_type = "application"  # Layer 7
#   subnets = var.public_subnet_ids
# }
# resource "aws_lb_target_group" "api" {
#   port     = 3000
#   protocol = "HTTP"
#   health_check { path = "/health" }
# }

Why It Matters for Engineers

Load balancers are the gateway to horizontal scaling. When system design interview questions ask 'scale to 10 million users,' load balancers are the first answer. They're how you go from one server to many without changing client code. Understanding load balancers also explains zero-downtime deployments: take a server out of rotation (stop sending traffic to it), deploy, run health checks, add it back. Without this understanding, deployments either cause downtime or require risky big-bang approaches.
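The rotation dance described above can be sketched as a small Python routine. This is a hypothetical simulation: the `deploy` and `health_check` callables stand in for real deployment tooling and load balancer API calls (e.g. deregistering an ALB target or marking an nginx upstream server `down`).

```python
def rolling_deploy(pool, deploy, health_check):
    """Deploy to each server in turn without ever emptying the pool.

    pool: set of servers currently receiving traffic.
    deploy: callable that ships the new version to one server.
    health_check: callable returning True once a server is healthy.
    """
    for server in sorted(pool):
        pool.remove(server)        # take it out of rotation: no new traffic
        deploy(server)             # ship the new version
        if not health_check(server):
            raise RuntimeError(f"{server} failed health check; halting rollout")
        pool.add(server)           # healthy again: restore traffic

# Toy usage: record which servers were deployed to, always pass the check.
servers = {"api-1", "api-2", "api-3"}
deployed = []
rolling_deploy(servers, deploy=deployed.append, health_check=lambda s: True)
```

Because a failing health check halts the rollout before the server rejoins the pool, a bad build degrades capacity by one server instead of taking the whole fleet down.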

Related Terms

CDN · nginx · Docker · Microservices

Learn This In Practice

Go deeper with the full module on Beyond Vibe Code.

DevOps Fundamentals →