Node.js executes your JavaScript on a single thread, but modern servers have multiple CPU cores. PM2’s clustering feature bridges this gap, letting you use all available computing power while providing built-in load balancing and fault tolerance. This guide digs into PM2 clustering, load balancing, memory management, and performance tuning.
Understanding PM2 Clustering Fundamentals
PM2 clustering creates multiple instances of your application, each running in its own process. The master process manages these worker processes and distributes incoming requests among them using a round-robin load balancer.
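You can see this from inside a worker: in cluster mode PM2 exposes the instance index through the NODE_APP_INSTANCE environment variable (renameable via the instance_var option). A minimal sketch:
// instance-demo.js — minimal cluster-aware worker (sketch)
// NODE_APP_INSTANCE is set by PM2 in cluster mode (0, 1, 2, ...)
const http = require('http');

const instance = process.env.NODE_APP_INSTANCE || 'n/a';

http.createServer((req, res) => {
  res.end(`Handled by instance ${instance} (PID ${process.pid})\n`);
}).listen(process.env.PORT || 3000);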
graph TD
  A[Incoming Requests] --> B[PM2 Master Process]
  B --> C[Load Balancer]
  C --> D[Worker 1<br/>Port 3000]
  C --> E[Worker 2<br/>Port 3000]
  C --> F[Worker 3<br/>Port 3000]
  C --> G[Worker 4<br/>Port 3000]
  D --> H[CPU Core 1]
  E --> I[CPU Core 2]
  F --> J[CPU Core 3]
  G --> K[CPU Core 4]
  L[Process Monitor] --> B
  L --> D
  L --> E
  L --> F
  L --> G
Single vs Cluster Mode Comparison
# Single instance (fork mode)
pm2 start app.js --name "single-app"
# Cluster mode with 4 instances
pm2 start app.js --name "cluster-app" -i 4
# Cluster mode using all CPU cores
pm2 start app.js --name "max-cluster" -i max
# Check the difference
pm2 list
graph TD
  A[Fork Mode] --> B[Single Process]
  A --> C[One CPU Core]
  A --> D[No Load Balancing]
  A --> E[Single Point of Failure]

graph TD
  F[Cluster Mode] --> G[Multiple Processes]
  F --> H[All CPU Cores]
  F --> I[Built-in Load Balancer]
  F --> J[High Availability]
CPU Optimization Strategies
Optimizing CPU utilization is crucial for maximizing your application’s performance. Here’s how to determine the optimal number of instances:
Determining Optimal Instance Count
# Check your server's CPU information
lscpu
nproc
cat /proc/cpuinfo | grep processor | wc -l
# Monitor CPU usage during different loads
htop
top
iostat -c 1
# Test different instance configurations
pm2 start app.js -i 1 --name "test-1"
pm2 start app.js -i 2 --name "test-2"
pm2 start app.js -i 4 --name "test-4"
pm2 start app.js -i max --name "test-max"
# Monitor performance
pm2 monit
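Because ecosystem files are ordinary JavaScript, the instance count can also be computed from the host at load time instead of hard-coded. A small sketch — leaving one core free for the OS is an assumption, not a PM2 requirement:
// ecosystem.config.js — derive instance count from the host (sketch)
const os = require('os');

module.exports = {
  apps: [{
    name: 'sized-app',
    script: 'app.js',
    exec_mode: 'cluster',
    // Leave one core for the OS and PM2 itself; never go below 1
    instances: Math.max(os.cpus().length - 1, 1)
  }]
};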
Dynamic Instance Scaling
# Scale up instances dynamically
pm2 scale myapp +3
# Scale down by giving an absolute target count
pm2 scale myapp 2
# Set specific number of instances
pm2 scale myapp 6
# Match the instance count to the number of CPU cores
pm2 scale myapp $(nproc)
# Restart policy and resource limits (ecosystem.config.js) — note that PM2 does
# not autoscale on CPU by itself; a scripted loop is sketched after the flowchart below
module.exports = {
apps: [{
name: 'auto-scale-app',
script: 'app.js',
instances: 'max',
exec_mode: 'cluster',
// Restart policy and resource limits
min_uptime: '10s',
max_restarts: 10,
max_memory_restart: '1G',
// V8 flags are an app-level option (node_args), not an environment variable
node_args: '--max-old-space-size=2048 --optimize-for-size',
// Performance monitoring
pmx: true,
automation: false,
env_production: {
NODE_ENV: 'production'
}
}]
};
flowchart TD
  A[Monitor CPU Usage] --> B{CPU > 80%?}
  B -->|Yes| C[Scale Up Instances]
  B -->|No| D{CPU < 30%?}
  D -->|Yes| E[Scale Down Instances]
  D -->|No| F[Maintain Current Scale]
  C --> G[pm2 scale app +2]
  E --> H[pm2 scale app 2]
  F --> I[Continue Monitoring]
  G --> I
  H --> I
  I --> A
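PM2 itself doesn’t implement this loop, but it is easy to script against the pm2 CLI. A naive sketch — the app name, thresholds, polling interval, and the use of load average as a CPU proxy are all assumptions to adapt:
// autoscale.js — naive CPU-based scaling loop (a sketch, not production code)
const { execFile } = require('child_process');
const os = require('os');

const APP = 'myapp';
const MAX = os.cpus().length;
let desired = MAX; // assume the app was started with -i max

function applyScale(count) {
  // Shell out to the pm2 CLI with an absolute instance count
  execFile('pm2', ['scale', APP, String(count)], (err) => {
    if (err) console.error('pm2 scale failed:', err.message);
  });
}

setInterval(() => {
  // 1-minute load average normalized by core count, as a rough CPU proxy
  const load = os.loadavg()[0] / MAX;
  if (load > 0.8 && desired < MAX) desired++;
  else if (load < 0.3 && desired > 1) desired--;
  else return;
  applyScale(desired);
}, 30000);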
Load Balancing Deep Dive
PM2’s built-in load balancer distributes requests across your application instances. Understanding how it works helps you optimize for your specific use case:
Load Balancing Algorithms
// ecosystem.config.js - Load balancing configuration
module.exports = {
apps: [{
name: 'load-balanced-app',
script: 'server.js',
instances: 4,
exec_mode: 'cluster',
// Load balancing settings
instance_var: 'INSTANCE_ID', // Pass instance ID to app
env_production: {
NODE_ENV: 'production',
PORT: 3000,
// Note: these are ordinary env vars read by your own code. PM2's cluster
// balancer is round-robin and is not configurable from here; true sticky
// sessions require an external proxy (e.g. nginx ip_hash)
SESSION_AFFINITY: 'true',
LB_METHOD: 'round_robin'
}
}]
};
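One consequence of round-robin distribution worth internalizing: workers share nothing. Any in-process state, like the counter in this quick sketch, diverges across instances — which is why the session store later in this guide lives in Redis:
// state-pitfall.js — demonstrates that workers don't share memory (sketch)
const express = require('express');
const app = express();

let hits = 0; // private to THIS worker process

app.get('/count', (req, res) => {
  hits++;
  // Under `pm2 start state-pitfall.js -i 4`, repeated requests will show
  // several independent counters, one per worker PID
  res.json({ pid: process.pid, hits });
});

app.listen(process.env.PORT || 3000);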
Testing Load Distribution
# Create a test application that shows which instance handles requests
cat > load-test-app.js << 'EOF'
const express = require('express');
const app = express();
const port = process.env.PORT || 3000;
// Add instance information to responses
app.use((req, res, next) => {
res.instanceId = process.env.INSTANCE_ID || process.pid;
res.instanceUptime = process.uptime();
next();
});
app.get('/', (req, res) => {
res.json({
message: 'Load balancer test',
instanceId: res.instanceId,
processId: process.pid,
uptime: res.instanceUptime,
timestamp: new Date().toISOString(),
memoryUsage: process.memoryUsage()
});
});
app.get('/cpu-intensive', (req, res) => {
const start = Date.now();
// Simulate CPU-intensive task
let result = 0;
for (let i = 0; i < 1000000; i++) {
result += Math.random();
}
res.json({
instanceId: res.instanceId,
processId: process.pid,
result: result,
processingTime: Date.now() - start,
timestamp: new Date().toISOString()
});
});
app.listen(port, () => {
console.log(`Instance ${process.pid} listening on port ${port}`);
});
EOF
# Start with clustering
pm2 start load-test-app.js -i 4 --name "load-test"
# Test load distribution
for i in {1..10}; do
curl -s http://localhost:3000/ | jq '.instanceId, .processId'
done
sequenceDiagram
  participant C as Client
  participant LB as PM2 Load Balancer
  participant W1 as Worker 1 (PID 1234)
  participant W2 as Worker 2 (PID 1235)
  participant W3 as Worker 3 (PID 1236)
  participant W4 as Worker 4 (PID 1237)
  C->>LB: Request 1
  LB->>W1: Forward to Worker 1
  W1-->>C: Response (PID 1234)
  C->>LB: Request 2
  LB->>W2: Forward to Worker 2
  W2-->>C: Response (PID 1235)
  C->>LB: Request 3
  LB->>W3: Forward to Worker 3
  W3-->>C: Response (PID 1236)
  C->>LB: Request 4
  LB->>W4: Forward to Worker 4
  W4-->>C: Response (PID 1237)
  C->>LB: Request 5
  LB->>W1: Forward to Worker 1 (Round Robin)
  W1-->>C: Response (PID 1234)
Memory Management and Leak Prevention
Memory leaks can cripple your application’s performance. PM2 provides several mechanisms to detect and handle memory issues:
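For context, the most common Node.js leak is simply an unbounded in-process cache or listener list. A sketch of the pattern to watch for:
// leaky.js — the classic unbounded in-process cache (illustrative sketch)
const express = require('express');
const app = express();

const cache = new Map(); // no eviction — grows forever with unique URLs

app.use((req, res) => {
  // Every distinct URL pins ~1 MB that is never released
  if (!cache.has(req.url)) cache.set(req.url, Buffer.alloc(1024 * 1024));
  res.json({
    cachedEntries: cache.size,
    heapUsedMB: Math.round(process.memoryUsage().heapUsed / 1024 / 1024)
  });
});

app.listen(process.env.PORT || 3000);
The max_memory_restart setting below is the safety net for exactly this failure mode; a bounded (LRU) cache is the actual fix.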
Memory Monitoring Configuration
// ecosystem.config.js - Memory management
module.exports = {
apps: [{
name: 'memory-managed-app',
script: 'app.js',
instances: 'max',
exec_mode: 'cluster',
// Memory management settings
max_memory_restart: '1G', // Restart if memory exceeds 1GB
min_uptime: '30s', // Min uptime before considering restart
max_restarts: 5, // Max restarts within min_uptime
restart_delay: 4000, // Delay between restarts
// V8 memory optimization
node_args: [
'--max-old-space-size=2048', // Increase heap size to 2GB
'--optimize-for-size', // Optimize for memory usage
'--gc-interval=100', // Garbage collection interval
'--expose-gc' // Expose garbage collection
].join(' '),
env_production: {
NODE_ENV: 'production',
// Memory profiling — opens an inspector port per worker; enable only while actively profiling
NODE_OPTIONS: '--inspect',
// Enable heap snapshots
HEAP_SNAPSHOTS: 'true'
}
}]
};
Memory Monitoring Commands
# Monitor memory usage in real-time
pm2 monit
# Get detailed memory information
pm2 show myapp
# List processes with memory usage
pm2 list
# Memory usage over time
watch -n 1 'pm2 list'
# Enable memory profiling
pm2 start app.js --node-args="--inspect --expose-gc" --name "profiled-app"
# Trigger a manual GC and compare before/after (requires --expose-gc).
# PM2 has no "exec inside a running app" command, so expose an HTTP
# endpoint for this instead — see the /gc route in the monitor app below:
curl http://localhost:3000/gc
# Check for memory leaks using custom endpoint
cat > memory-monitor.js << 'EOF'
const express = require('express');
const app = express();
app.get('/memory', (req, res) => {
const usage = process.memoryUsage();
const formatBytes = (bytes) => (bytes / 1024 / 1024).toFixed(2) + ' MB';
res.json({
rss: formatBytes(usage.rss), // Resident Set Size
heapTotal: formatBytes(usage.heapTotal), // Total heap size
heapUsed: formatBytes(usage.heapUsed), // Used heap size
external: formatBytes(usage.external), // External memory
arrayBuffers: formatBytes(usage.arrayBuffers),
uptime: process.uptime()
});
});
app.get('/gc', (req, res) => {
if (global.gc) {
const before = process.memoryUsage();
global.gc();
const after = process.memoryUsage();
res.json({
before: {
heapUsed: (before.heapUsed / 1024 / 1024).toFixed(2) + ' MB'
},
after: {
heapUsed: (after.heapUsed / 1024 / 1024).toFixed(2) + ' MB'
},
freed: ((before.heapUsed - after.heapUsed) / 1024 / 1024).toFixed(2) + ' MB'
});
} else {
res.json({ error: 'GC not exposed. Start with --expose-gc' });
}
});
app.listen(3000, () => {
console.log('Memory monitor running on port 3000');
});
EOF
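Start the monitor under PM2 and poll /memory while the app is under load: a heapUsed figure that climbs steadily and barely drops after hitting /gc is the classic signature of a leak, as opposed to normal garbage-collector laziness.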
graph TD
  A[Application Start] --> B[Monitor Memory Usage]
  B --> C{Memory > Threshold?}
  C -->|No| D[Continue Normal Operation]
  C -->|Yes| E[Log Memory Warning]
  E --> F{Memory > Max Limit?}
  F -->|No| G[Trigger Garbage Collection]
  F -->|Yes| H[PM2 Restart Process]
  G --> I[Check Memory After GC]
  I --> J{Memory Still High?}
  J -->|Yes| H
  J -->|No| D
  H --> K[New Process Instance]
  K --> B
  D --> B
Performance Tuning Techniques
Beyond clustering and memory management, several advanced techniques can significantly boost your application’s performance:
V8 Engine Optimization
// ecosystem.config.js - V8 optimization
module.exports = {
apps: [{
name: 'optimized-app',
script: 'app.js',
instances: 'max',
exec_mode: 'cluster',
// V8 optimization flags
node_args: [
// Memory sizing (values in MB)
'--max-old-space-size=4096', // Old-generation heap limit (4GB)
'--max-semi-space-size=512', // Young-generation semi-space size
// (--max-new-space-size was removed from V8; modern Node refuses to start with it)
// Performance optimization
'--optimize-for-size', // Optimize for memory over speed
'--always-compact', // Always perform compaction
'--expose-gc', // Enable manual GC
// Advanced V8 flags
'--turbo-inline-api-calls', // Inline API calls
'--use-osr', // On-stack replacement
'--trace-opt', // Trace optimizations (debug only)
'--trace-deopt' // Trace deoptimizations (debug only)
].join(' '),
env_production: {
NODE_ENV: 'production',
UV_THREADPOOL_SIZE: 16, // Increase thread pool size
NODE_OPTIONS: '--enable-source-maps'
}
}]
};
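Whether the sizing flags actually took effect can be verified from inside a worker with Node’s built-in v8 module. A quick sketch:
// heap-check.js — verify V8 sizing flags from inside the process (sketch)
const v8 = require('v8');

const stats = v8.getHeapStatistics();
console.log({
  // heap_size_limit should roughly reflect --max-old-space-size
  heapLimitMB: Math.round(stats.heap_size_limit / 1024 / 1024),
  totalHeapMB: Math.round(stats.total_heap_size / 1024 / 1024),
  usedHeapMB: Math.round(stats.used_heap_size / 1024 / 1024)
});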
Connection and Request Optimization
// High-performance server configuration
const express = require('express');
const compression = require('compression');
const helmet = require('helmet');
const app = express();
// Enable compression
app.use(compression({
level: 6, // Compression level (0-9)
threshold: 1024, // Only compress responses > 1KB
filter: (req, res) => {
if (req.headers['x-no-compression']) {
return false;
}
return compression.filter(req, res);
}
}));
// Security headers
app.use(helmet());
// Keep-alive configuration
app.use((req, res, next) => {
res.setHeader('Connection', 'keep-alive');
res.setHeader('Keep-Alive', 'timeout=65');
next();
});
// Performance monitoring middleware
app.use((req, res, next) => {
const start = process.hrtime.bigint();
res.on('finish', () => {
const end = process.hrtime.bigint();
const duration = Number(end - start) / 1000000; // Convert to milliseconds
console.log(`${req.method} ${req.path} - ${res.statusCode} - ${duration.toFixed(2)}ms`);
// Log slow requests
if (duration > 1000) {
console.warn(`Slow request detected: ${req.method} ${req.path} took ${duration.toFixed(2)}ms`);
}
});
next();
});
// Cluster-aware session handling: in-memory sessions break under round-robin,
// so store them in Redis (connect-redis v4 / node-redis v3 style shown;
// newer versions of both packages use a different initialization API)
if (process.env.NODE_ENV === 'production') {
const session = require('express-session');
const RedisStore = require('connect-redis')(session);
const redis = require('redis');
const client = redis.createClient();
app.use(session({
store: new RedisStore({ client }),
secret: process.env.SESSION_SECRET,
resave: false,
saveUninitialized: false,
cookie: {
secure: false, // Set to true with HTTPS
maxAge: 24 * 60 * 60 * 1000 // 24 hours
}
}));
}
// Health check endpoint
app.get('/health', (req, res) => {
const memUsage = process.memoryUsage();
res.json({
status: 'healthy',
uptime: process.uptime(),
memory: {
rss: Math.round(memUsage.rss / 1024 / 1024) + ' MB',
heapUsed: Math.round(memUsage.heapUsed / 1024 / 1024) + ' MB',
heapTotal: Math.round(memUsage.heapTotal / 1024 / 1024) + ' MB'
},
processId: process.pid,
nodeVersion: process.version
});
});
const port = process.env.PORT || 3000;
const server = app.listen(port, () => {
console.log(`Optimized server running on port ${port} (PID: ${process.pid})`);
});
// Optimize server timeouts: keep keepAliveTimeout above your load balancer's
// idle timeout (typically 60s), and headersTimeout slightly above
// keepAliveTimeout to avoid spurious connection resets
server.keepAliveTimeout = 65000;
server.headersTimeout = 66000;
server.timeout = 120000;
Performance Benchmarking
Measuring performance is crucial for optimization. Here’s how to benchmark your PM2 applications:
Load Testing Setup
# Install load testing tools
npm install -g autocannon
npm install -g loadtest
# Basic load test with autocannon
autocannon -c 100 -d 60 http://localhost:3000
# Advanced load test
autocannon \
--connections 200 \
--duration 120 \
--pipelining 10 \
--method GET \
--headers "User-Agent=LoadTest" \
http://localhost:3000
# Test different instance configurations
echo "Testing single instance..."
pm2 delete all
pm2 start app.js --name "single" -i 1
sleep 5
autocannon -c 100 -d 30 http://localhost:3000 > single-instance.txt
echo "Testing cluster mode..."
pm2 delete all
pm2 start app.js --name "cluster" -i max
sleep 5
autocannon -c 100 -d 30 http://localhost:3000 > cluster-mode.txt
# Compare results
echo "=== Single Instance Results ==="
cat single-instance.txt | grep -E "(requests|latency|throughput)"
echo "=== Cluster Mode Results ==="
cat cluster-mode.txt | grep -E "(requests|latency|throughput)"
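The same comparison can be scripted with autocannon’s programmatic API, which hands back numbers directly instead of text to grep. A sketch, with the URL and result fields as assumptions to check against your autocannon version:
// bench.js — scripted load test via autocannon's programmatic API (sketch)
const autocannon = require('autocannon');

async function bench(label) {
  const result = await autocannon({
    url: 'http://localhost:3000',
    connections: 100,
    duration: 30
  });
  console.log(label, {
    avgReqPerSec: result.requests.average,
    avgLatencyMs: result.latency.average,
    non2xx: result.non2xx
  });
}

// Restart the app in the desired mode between runs, then:
bench('current configuration');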
Performance Monitoring Script
#!/bin/bash
# performance-monitor.sh
APP_NAME="myapp"
DURATION=300 # 5 minutes
LOG_FILE="performance-$(date +%Y%m%d_%H%M%S).log"
echo "Starting performance monitoring for $APP_NAME" | tee $LOG_FILE
echo "Duration: $DURATION seconds" | tee -a $LOG_FILE
echo "Timestamp: $(date)" | tee -a $LOG_FILE
echo "================================" | tee -a $LOG_FILE
# Function to get PM2 stats
get_pm2_stats() {
pm2 jlist | jq -r '
.[] | select(.name == "'$APP_NAME'") |
"Instance: \(.pm_id) | Status: \(.pm2_env.status) | Memory: \(.monit.memory) | CPU: \(.monit.cpu)%"
'
}
# Function to get system stats
get_system_stats() {
echo "CPU: $(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)%"
echo "Memory: $(free -m | awk 'NR==2{printf "%.1f%%", $3*100/$2}')"
echo "Load: $(uptime | awk -F'load average:' '{print $2}')"
}
# Start monitoring
for i in $(seq 1 $((DURATION/10))); do
echo "=== Measurement $i ($(date)) ===" | tee -a $LOG_FILE
get_pm2_stats | tee -a $LOG_FILE
get_system_stats | tee -a $LOG_FILE
echo "" | tee -a $LOG_FILE
sleep 10
done
echo "Performance monitoring completed. Results saved to $LOG_FILE"
graph TB
  A[Start Load Test] --> B[Single Instance Test]
  B --> C[Record Metrics]
  C --> D[Cluster Mode Test]
  D --> E[Record Metrics]
  E --> F[Compare Results]
  F --> G{Performance Improved?}
  G -->|Yes| H[Deploy Cluster Config]
  G -->|No| I[Analyze Bottlenecks]
  I --> J[CPU Bound?]
  I --> K[Memory Bound?]
  I --> L[I/O Bound?]
  J --> M[Increase Instances]
  K --> N[Optimize Memory Usage]
  L --> O[Optimize Database/Network]
  M --> P[Re-test]
  N --> P
  O --> P
  P --> F
Resource Limits and Health Monitoring
Setting appropriate resource limits and implementing health checks ensures your applications remain stable under load:
Comprehensive Health Monitoring
// ecosystem.config.js - Health monitoring
module.exports = {
apps: [{
name: 'health-monitored-app',
script: 'app.js',
instances: 'max',
exec_mode: 'cluster',
// Resource limits
max_memory_restart: '1G',
max_restarts: 5,
min_uptime: '30s',
restart_delay: 4000,
// Health monitoring: open-source PM2 has no built-in URL health checker,
// but it can gate "online" status on an explicit ready signal
wait_ready: true, // wait for process.send('ready') from the app
listen_timeout: 10000, // ms to wait for the ready signal
kill_timeout: 5000, // ms before SIGKILL during shutdown
// Advanced monitoring
pmx: true,
automation: false,
env_production: {
NODE_ENV: 'production',
HEALTH_CHECK_ENABLED: 'true',
MONITORING_INTERVAL: '10000'
}
}]
};
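URL-based health checking therefore lives outside PM2. A minimal watchdog sketch that polls the /health endpoint implemented below and asks PM2 to restart the app after repeated failures — the app name, interval, and failure threshold are assumptions:
// health-watchdog.js — external URL health checker (sketch)
const http = require('http');
const { execFile } = require('child_process');

const APP = 'health-monitored-app';
let failures = 0;

setInterval(() => {
  http.get('http://localhost:3000/health', (res) => {
    res.resume(); // drain the response so the socket is freed
    failures = res.statusCode === 200 ? 0 : failures + 1;
    maybeRestart();
  }).on('error', () => {
    failures++;
    maybeRestart();
  });
}, 30000);

function maybeRestart() {
  if (failures >= 3) {
    failures = 0;
    execFile('pm2', ['restart', APP], (err) => {
      if (err) console.error('pm2 restart failed:', err.message);
    });
  }
}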
Custom Health Check Implementation
// health-monitor.js
const express = require('express');
const app = express();
// Health check state
let healthStatus = {
status: 'healthy',
lastCheck: new Date(),
checks: {
database: 'unknown',
redis: 'unknown',
external_api: 'unknown'
},
metrics: {
uptime: 0,
memory: {},
requestCount: 0,
errorCount: 0
}
};
// Middleware to count requests and errors
app.use((req, res, next) => {
healthStatus.metrics.requestCount++;
res.on('finish', () => {
if (res.statusCode >= 400) {
healthStatus.metrics.errorCount++;
}
});
next();
});
// Comprehensive health check
app.get('/health', async (req, res) => {
const startTime = Date.now();
try {
// Update metrics
healthStatus.metrics.uptime = process.uptime();
healthStatus.metrics.memory = process.memoryUsage();
healthStatus.lastCheck = new Date();
// Check database connection
healthStatus.checks.database = await checkDatabase();
// Check Redis connection
healthStatus.checks.redis = await checkRedis();
// Check external API
healthStatus.checks.external_api = await checkExternalAPI();
// Determine overall status
const allHealthy = Object.values(healthStatus.checks).every(status => status === 'healthy');
healthStatus.status = allHealthy ? 'healthy' : 'degraded';
const responseTime = Date.now() - startTime;
res.status(healthStatus.status === 'healthy' ? 200 : 503).json({
...healthStatus,
responseTime: responseTime + 'ms',
processId: process.pid,
nodeVersion: process.version
});
} catch (error) {
healthStatus.status = 'unhealthy';
res.status(503).json({
status: 'unhealthy',
error: error.message,
processId: process.pid,
timestamp: new Date()
});
}
});
// Individual health check functions
async function checkDatabase() {
try {
// Implement your database check here
// Example: await db.ping();
return 'healthy';
} catch (error) {
console.error('Database health check failed:', error);
return 'unhealthy';
}
}
async function checkRedis() {
try {
// Implement your Redis check here
// Example: await redis.ping();
return 'healthy';
} catch (error) {
console.error('Redis health check failed:', error);
return 'unhealthy';
}
}
async function checkExternalAPI() {
try {
// Implement external API check here
// Example: await fetch('https://api.example.com/health');
return 'healthy';
} catch (error) {
console.error('External API health check failed:', error);
return 'unhealthy';
}
}
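When filling in these stubs, bound each probe with a deadline so a hung dependency can’t stall the whole /health response. One way, using Promise.race (a sketch):
// Bound any health probe to a deadline so /health itself stays responsive
function withTimeout(probe, ms = 2000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`probe timed out after ${ms}ms`)), ms);
  });
  return Promise.race([probe(), timeout]).finally(() => clearTimeout(timer));
}

// Example usage inside checkDatabase():
// await withTimeout(() => db.ping()); // counts as 'unhealthy' if slower than 2s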
// Graceful shutdown handling
process.on('SIGTERM', () => {
console.log('Received SIGTERM, shutting down gracefully');
healthStatus.status = 'shutting_down';
server.close(() => {
console.log('Process terminated');
process.exit(0);
});
});
const port = process.env.PORT || 3000;
const server = app.listen(port, () => {
console.log(`Health monitored server running on port ${port} (PID: ${process.pid})`);
});
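If you adopt the wait_ready option shown in the ecosystem file above, the app also needs to announce readiness. Appended to the server setup, that looks like:
// Pair with wait_ready: true in ecosystem.config.js — tell PM2 the worker
// is actually accepting connections before it is marked online
server.on('listening', () => {
  if (process.send) process.send('ready');
});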
What’s Next?
You now have comprehensive knowledge of PM2 clustering and performance optimization. You can leverage multiple CPU cores, implement load balancing, manage memory effectively, and monitor application health. In the next part of this series, we’ll explore production-ready deployment with systemd integration:
- Deep dive into systemd service management
- PM2 startup and auto-restart configuration
- Security hardening and user management
- Production deployment best practices
- Service monitoring and maintenance
The performance optimization techniques you’ve learned will be essential for the production deployment strategies we’ll cover next.
Series Navigation:
← Part 2: Configuration Mastery
→ Part 3: Clustering and Performance (You are here)
→ Part 4: Production Systemd Integration (Coming next)