Node.js executes your JavaScript on a single thread, but modern servers have multiple CPU cores. PM2’s clustering feature bridges this gap, letting you use all available computing power while providing built-in load balancing and fault tolerance. This guide digs into PM2 clustering, load balancing, memory management, and performance tuning.
Understanding PM2 Clustering Fundamentals
PM2 clustering creates multiple instances of your application, each running in its own process. The master process manages these worker processes and distributes incoming requests among them using a round-robin load balancer.
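You can see this from inside a worker: in cluster mode PM2 exposes the instance index through the NODE_APP_INSTANCE environment variable (renameable via the instance_var option). A minimal sketch:
// instance-demo.js — minimal cluster-aware worker (sketch)
// NODE_APP_INSTANCE is set by PM2 in cluster mode (0, 1, 2, ...)
const http = require('http');

const instance = process.env.NODE_APP_INSTANCE || 'n/a';

http.createServer((req, res) => {
  res.end(`Handled by instance ${instance} (PID ${process.pid})\n`);
}).listen(process.env.PORT || 3000);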
graph TD
  A[Incoming Requests] --> B[PM2 Master Process]
  B --> C[Load Balancer]
  C --> D[Worker 1<br/>Port 3000]
  C --> E[Worker 2<br/>Port 3000]
  C --> F[Worker 3<br/>Port 3000]
  C --> G[Worker 4<br/>Port 3000]
  D --> H[CPU Core 1]
  E --> I[CPU Core 2]
  F --> J[CPU Core 3]
  G --> K[CPU Core 4]
  L[Process Monitor] --> B
  L --> D
  L --> E
  L --> F
  L --> G
Single vs Cluster Mode Comparison
# Single instance (fork mode)
pm2 start app.js --name "single-app"
# Cluster mode with 4 instances
pm2 start app.js --name "cluster-app" -i 4
# Cluster mode using all CPU cores
pm2 start app.js --name "max-cluster" -i max
# Check the difference
pm2 list
graph TD
  A[Fork Mode] --> B[Single Process]
  A --> C[One CPU Core]
  A --> D[No Load Balancing]
  A --> E[Single Point of Failure]

graph TD
  F[Cluster Mode] --> G[Multiple Processes]
  F --> H[All CPU Cores]
  F --> I[Built-in Load Balancer]
  F --> J[High Availability]
CPU Optimization Strategies
Optimizing CPU utilization is crucial for maximizing your application’s performance. Here’s how to determine the optimal number of instances:
Determining Optimal Instance Count
# Check your server's CPU information
lscpu
nproc
cat /proc/cpuinfo | grep processor | wc -l
# Monitor CPU usage during different loads
htop
top
iostat -c 1
# Test different instance configurations
pm2 start app.js -i 1 --name "test-1"
pm2 start app.js -i 2 --name "test-2"
pm2 start app.js -i 4 --name "test-4"
pm2 start app.js -i max --name "test-max"
# Monitor performance
pm2 monit
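Because ecosystem files are ordinary JavaScript, the instance count can also be computed from the host at load time instead of hard-coded. A small sketch — leaving one core free for the OS is an assumption, not a PM2 requirement:
// ecosystem.config.js — derive instance count from the host (sketch)
const os = require('os');

module.exports = {
  apps: [{
    name: 'sized-app',
    script: 'app.js',
    exec_mode: 'cluster',
    // Leave one core for the OS and PM2 itself; never go below 1
    instances: Math.max(os.cpus().length - 1, 1)
  }]
};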
Dynamic Instance Scaling
# Scale up instances dynamically
pm2 scale myapp +3
# Scale down by giving an absolute target count
pm2 scale myapp 2
# Set specific number of instances
pm2 scale myapp 6
# Match the instance count to the number of CPU cores
pm2 scale myapp $(nproc)
# Restart policy and resource limits (ecosystem.config.js) — note that PM2 does
# not autoscale on CPU by itself; a scripted loop is sketched after the flowchart below
module.exports = {
apps: [{
name: 'auto-scale-app',
script: 'app.js',
instances: 'max',
exec_mode: 'cluster',
// Restart policy and resource limits
min_uptime: '10s',
max_restarts: 10,
max_memory_restart: '1G',
// V8 flags are an app-level option (node_args), not an environment variable
node_args: '--max-old-space-size=2048 --optimize-for-size',
// Performance monitoring
pmx: true,
automation: false,
env_production: {
NODE_ENV: 'production'
}
}]
};
flowchart TD
  A[Monitor CPU Usage] --> B{CPU > 80%?}
  B -->|Yes| C[Scale Up Instances]
  B -->|No| D{CPU < 30%?}
  D -->|Yes| E[Scale Down Instances]
  D -->|No| F[Maintain Current Scale]
  C --> G[pm2 scale app +2]
  E --> H[pm2 scale app 2]
  F --> I[Continue Monitoring]
  G --> I
  H --> I
  I --> A
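PM2 itself doesn’t implement this loop, but it is easy to script against the pm2 CLI. A naive sketch — the app name, thresholds, polling interval, and the use of load average as a CPU proxy are all assumptions to adapt:
// autoscale.js — naive CPU-based scaling loop (a sketch, not production code)
const { execFile } = require('child_process');
const os = require('os');

const APP = 'myapp';
const MAX = os.cpus().length;
let desired = MAX; // assume the app was started with -i max

function applyScale(count) {
  // Shell out to the pm2 CLI with an absolute instance count
  execFile('pm2', ['scale', APP, String(count)], (err) => {
    if (err) console.error('pm2 scale failed:', err.message);
  });
}

setInterval(() => {
  // 1-minute load average normalized by core count, as a rough CPU proxy
  const load = os.loadavg()[0] / MAX;
  if (load > 0.8 && desired < MAX) desired++;
  else if (load < 0.3 && desired > 1) desired--;
  else return;
  applyScale(desired);
}, 30000);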
Load Balancing Deep Dive
PM2’s built-in load balancer distributes requests across your application instances. Understanding how it works helps you optimize for your specific use case:
Load Balancing Algorithms
// ecosystem.config.js - Load balancing configuration
module.exports = {
apps: [{
name: 'load-balanced-app',
script: 'server.js',
instances: 4,
exec_mode: 'cluster',
// Load balancing settings
instance_var: 'INSTANCE_ID', // Pass instance ID to app
env_production: {
NODE_ENV: 'production',
PORT: 3000,
// Note: these are ordinary env vars read by your own code. PM2's cluster
// balancer is round-robin and is not configurable from here; true sticky
// sessions require an external proxy (e.g. nginx ip_hash)
SESSION_AFFINITY: 'true',
LB_METHOD: 'round_robin'
}
}]
};
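One consequence of round-robin distribution worth internalizing: workers share nothing. Any in-process state, like the counter in this quick sketch, diverges across instances — which is why the session store later in this guide lives in Redis:
// state-pitfall.js — demonstrates that workers don't share memory (sketch)
const express = require('express');
const app = express();

let hits = 0; // private to THIS worker process

app.get('/count', (req, res) => {
  hits++;
  // Under `pm2 start state-pitfall.js -i 4`, repeated requests will show
  // several independent counters, one per worker PID
  res.json({ pid: process.pid, hits });
});

app.listen(process.env.PORT || 3000);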
Testing Load Distribution
# Create a test application that shows which instance handles requests
cat > load-test-app.js << 'EOF'
const express = require('express');
const app = express();
const port = process.env.PORT || 3000;
// Add instance information to responses
app.use((req, res, next) => {
res.instanceId = process.env.INSTANCE_ID || process.pid;
res.instanceUptime = process.uptime();
next();
});
app.get('/', (req, res) => {
res.json({
message: 'Load balancer test',
instanceId: res.instanceId,
processId: process.pid,
uptime: res.instanceUptime,
timestamp: new Date().toISOString(),
memoryUsage: process.memoryUsage()
});
});
app.get('/cpu-intensive', (req, res) => {
const start = Date.now();
// Simulate CPU-intensive task
let result = 0;
for (let i = 0; i < 1000000; i++) {
result += Math.random();
}
res.json({
instanceId: res.instanceId,
processId: process.pid,
result: result,
processingTime: Date.now() - start,
timestamp: new Date().toISOString()
});
});
app.listen(port, () => {
console.log(`Instance ${process.pid} listening on port ${port}`);
});
EOF
# Start with clustering
pm2 start load-test-app.js -i 4 --name "load-test"
# Test load distribution
for i in {1..10}; do
curl -s http://localhost:3000/ | jq '.instanceId, .processId'
done
sequenceDiagram
  participant C as Client
  participant LB as PM2 Load Balancer
  participant W1 as Worker 1 (PID 1234)
  participant W2 as Worker 2 (PID 1235)
  participant W3 as Worker 3 (PID 1236)
  participant W4 as Worker 4 (PID 1237)
  C->>LB: Request 1
  LB->>W1: Forward to Worker 1
  W1-->>C: Response (PID 1234)
  C->>LB: Request 2
  LB->>W2: Forward to Worker 2
  W2-->>C: Response (PID 1235)
  C->>LB: Request 3
  LB->>W3: Forward to Worker 3
  W3-->>C: Response (PID 1236)
  C->>LB: Request 4
  LB->>W4: Forward to Worker 4
  W4-->>C: Response (PID 1237)
  C->>LB: Request 5
  LB->>W1: Forward to Worker 1 (Round Robin)
  W1-->>C: Response (PID 1234)
Memory Management and Leak Prevention
Memory leaks can cripple your application’s performance. PM2 provides several mechanisms to detect and handle memory issues:
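For context, the most common Node.js leak is simply an unbounded in-process cache or listener list. A sketch of the pattern to watch for:
// leaky.js — the classic unbounded in-process cache (illustrative sketch)
const express = require('express');
const app = express();

const cache = new Map(); // no eviction — grows forever with unique URLs

app.use((req, res) => {
  // Every distinct URL pins ~1 MB that is never released
  if (!cache.has(req.url)) cache.set(req.url, Buffer.alloc(1024 * 1024));
  res.json({
    cachedEntries: cache.size,
    heapUsedMB: Math.round(process.memoryUsage().heapUsed / 1024 / 1024)
  });
});

app.listen(process.env.PORT || 3000);
The max_memory_restart setting below is the safety net for exactly this failure mode; a bounded (LRU) cache is the actual fix.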
Memory Monitoring Configuration
// ecosystem.config.js - Memory management
module.exports = {
apps: [{
name: 'memory-managed-app',
script: 'app.js',
instances: 'max',
exec_mode: 'cluster',
// Memory management settings
max_memory_restart: '1G', // Restart if memory exceeds 1GB
min_uptime: '30s', // Min uptime before considering restart
max_restarts: 5, // Max restarts within min_uptime
restart_delay: 4000, // Delay between restarts
// V8 memory optimization
node_args: [
'--max-old-space-size=2048', // Increase heap size to 2GB
'--optimize-for-size', // Optimize for memory usage
'--gc-interval=100', // Garbage collection interval
'--expose-gc' // Expose garbage collection
].join(' '),
env_production: {
NODE_ENV: 'production',
// Memory profiling — opens an inspector port per worker; enable only while actively profiling
NODE_OPTIONS: '--inspect',
// Enable heap snapshots
HEAP_SNAPSHOTS: 'true'
}
}]
};
Memory Monitoring Commands
# Monitor memory usage in real-time
pm2 monit
# Get detailed memory information
pm2 show myapp
# List processes with memory usage
pm2 list
# Memory usage over time
watch -n 1 'pm2 list'
# Enable memory profiling
pm2 start app.js --node-args="--inspect --expose-gc" --name "profiled-app"
# Trigger a manual GC and compare before/after (requires --expose-gc).
# PM2 has no "exec inside a running app" command, so expose an HTTP
# endpoint for this instead — see the /gc route in the monitor app below:
curl http://localhost:3000/gc
# Check for memory leaks using custom endpoint
cat > memory-monitor.js << 'EOF'
const express = require('express');
const app = express();
app.get('/memory', (req, res) => {
const usage = process.memoryUsage();
const formatBytes = (bytes) => (bytes / 1024 / 1024).toFixed(2) + ' MB';
res.json({
rss: formatBytes(usage.rss), // Resident Set Size
heapTotal: formatBytes(usage.heapTotal), // Total heap size
heapUsed: formatBytes(usage.heapUsed), // Used heap size
external: formatBytes(usage.external), // External memory
arrayBuffers: formatBytes(usage.arrayBuffers),
uptime: process.uptime()
});
});
app.get('/gc', (req, res) => {
if (global.gc) {
const before = process.memoryUsage();
global.gc();
const after = process.memoryUsage();
res.json({
before: {
heapUsed: (before.heapUsed / 1024 / 1024).toFixed(2) + ' MB'
},
after: {
heapUsed: (after.heapUsed / 1024 / 1024).toFixed(2) + ' MB'
},
freed: ((before.heapUsed - after.heapUsed) / 1024 / 1024).toFixed(2) + ' MB'
});
} else {
res.json({ error: 'GC not exposed. Start with --expose-gc' });
}
});
app.listen(3000, () => {
console.log('Memory monitor running on port 3000');
});
EOF
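Start the monitor under PM2 and poll /memory while the app is under load: a heapUsed figure that climbs steadily and barely drops after hitting /gc is the classic signature of a leak, as opposed to normal garbage-collector laziness.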
graph TD
  A[Application Start] --> B[Monitor Memory Usage]
  B --> C{Memory > Threshold?}
  C -->|No| D[Continue Normal Operation]
  C -->|Yes| E[Log Memory Warning]
  E --> F{Memory > Max Limit?}
  F -->|No| G[Trigger Garbage Collection]
  F -->|Yes| H[PM2 Restart Process]
  G --> I[Check Memory After GC]
  I --> J{Memory Still High?}
  J -->|Yes| H
  J -->|No| D
  H --> K[New Process Instance]
  K --> B
  D --> B
Performance Tuning Techniques
Beyond clustering and memory management, several advanced techniques can significantly boost your application’s performance:
V8 Engine Optimization
// ecosystem.config.js - V8 optimization
module.exports = {
apps: [{
name: 'optimized-app',
script: 'app.js',
instances: 'max',
exec_mode: 'cluster',
// V8 optimization flags
node_args: [
// Memory sizing (values in MB)
'--max-old-space-size=4096', // Old-generation heap limit (4GB)
'--max-semi-space-size=512', // Young-generation semi-space size
// (--max-new-space-size was removed from V8; modern Node refuses to start with it)
// Performance optimization
'--optimize-for-size', // Optimize for memory over speed
'--always-compact', // Always perform compaction
'--expose-gc', // Enable manual GC
// Advanced V8 flags
'--turbo-inline-api-calls', // Inline API calls
'--use-osr', // On-stack replacement
'--trace-opt', // Trace optimizations (debug only)
'--trace-deopt' // Trace deoptimizations (debug only)
].join(' '),
env_production: {
NODE_ENV: 'production',
UV_THREADPOOL_SIZE: 16, // Increase thread pool size
NODE_OPTIONS: '--enable-source-maps'
}
}]
};
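Whether the sizing flags actually took effect can be verified from inside a worker with Node’s built-in v8 module. A quick sketch:
// heap-check.js — verify V8 sizing flags from inside the process (sketch)
const v8 = require('v8');

const stats = v8.getHeapStatistics();
console.log({
  // heap_size_limit should roughly reflect --max-old-space-size
  heapLimitMB: Math.round(stats.heap_size_limit / 1024 / 1024),
  totalHeapMB: Math.round(stats.total_heap_size / 1024 / 1024),
  usedHeapMB: Math.round(stats.used_heap_size / 1024 / 1024)
});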
Connection and Request Optimization
// High-performance server configuration
const express = require('express');
const compression = require('compression');
const helmet = require('helmet');
const app = express();
// Enable compression
app.use(compression({
level: 6, // Compression level (0-9)
threshold: 1024, // Only compress responses > 1KB
filter: (req, res) => {
if (req.headers['x-no-compression']) {
return false;
}
return compression.filter(req, res);
}
}));
// Security headers
app.use(helmet());
// Keep-alive configuration
app.use((req, res, next) => {
res.setHeader('Connection', 'keep-alive');
res.setHeader('Keep-Alive', 'timeout=65');
next();
});
// Performance monitoring middleware
app.use((req, res, next) => {
const start = process.hrtime.bigint();
res.on('finish', () => {
const end = process.hrtime.bigint();
const duration = Number(end - start) / 1000000; // Convert to milliseconds
console.log(`${req.method} ${req.path} - ${res.statusCode} - ${duration.toFixed(2)}ms`);
// Log slow requests
if (duration > 1000) {
console.warn(`Slow request detected: ${req.method} ${req.path} took ${duration.toFixed(2)}ms`);
}
});
next();
});
// Cluster-aware session handling: in-memory sessions break under round-robin,
// so store them in Redis (connect-redis v4 / node-redis v3 style shown;
// newer versions of both packages use a different initialization API)
if (process.env.NODE_ENV === 'production') {
const session = require('express-session');
const RedisStore = require('connect-redis')(session);
const redis = require('redis');
const client = redis.createClient();
app.use(session({
store: new RedisStore({ client }),
secret: process.env.SESSION_SECRET,
resave: false,
saveUninitialized: false,
cookie: {
secure: false, // Set to true with HTTPS
maxAge: 24 * 60 * 60 * 1000 // 24 hours
}
}));
}
// Health check endpoint
app.get('/health', (req, res) => {
const memUsage = process.memoryUsage();
res.json({
status: 'healthy',
uptime: process.uptime(),
memory: {
rss: Math.round(memUsage.rss / 1024 / 1024) + ' MB',
heapUsed: Math.round(memUsage.heapUsed / 1024 / 1024) + ' MB',
heapTotal: Math.round(memUsage.heapTotal / 1024 / 1024) + ' MB'
},
processId: process.pid,
nodeVersion: process.version
});
});
const port = process.env.PORT || 3000;
const server = app.listen(port, () => {
console.log(`Optimized server running on port ${port} (PID: ${process.pid})`);
});
// Optimize server timeouts: keep keepAliveTimeout above your load balancer's
// idle timeout (typically 60s), and headersTimeout slightly above
// keepAliveTimeout to avoid spurious connection resets
server.keepAliveTimeout = 65000;
server.headersTimeout = 66000;
server.timeout = 120000;
Performance Benchmarking
Measuring performance is crucial for optimization. Here’s how to benchmark your PM2 applications:
Load Testing Setup
# Install load testing tools
npm install -g autocannon
npm install -g loadtest
# Basic load test with autocannon
autocannon -c 100 -d 60 http://localhost:3000
# Advanced load test
autocannon \
--connections 200 \
--duration 120 \
--pipelining 10 \
--method GET \
--headers "User-Agent=LoadTest" \
http://localhost:3000
# Test different instance configurations
echo "Testing single instance..."
pm2 delete all
pm2 start app.js --name "single" -i 1
sleep 5
autocannon -c 100 -d 30 http://localhost:3000 > single-instance.txt
echo "Testing cluster mode..."
pm2 delete all
pm2 start app.js --name "cluster" -i max
sleep 5
autocannon -c 100 -d 30 http://localhost:3000 > cluster-mode.txt
# Compare results
echo "=== Single Instance Results ==="
cat single-instance.txt | grep -E "(requests|latency|throughput)"
echo "=== Cluster Mode Results ==="
cat cluster-mode.txt | grep -E "(requests|latency|throughput)"
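The same comparison can be scripted with autocannon’s programmatic API, which hands back numbers directly instead of text to grep. A sketch, with the URL and result fields as assumptions to check against your autocannon version:
// bench.js — scripted load test via autocannon's programmatic API (sketch)
const autocannon = require('autocannon');

async function bench(label) {
  const result = await autocannon({
    url: 'http://localhost:3000',
    connections: 100,
    duration: 30
  });
  console.log(label, {
    avgReqPerSec: result.requests.average,
    avgLatencyMs: result.latency.average,
    non2xx: result.non2xx
  });
}

// Restart the app in the desired mode between runs, then:
bench('current configuration');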
Performance Monitoring Script
#!/bin/bash
# performance-monitor.sh
APP_NAME="myapp"
DURATION=300 # 5 minutes
LOG_FILE="performance-$(date +%Y%m%d_%H%M%S).log"
echo "Starting performance monitoring for $APP_NAME" | tee $LOG_FILE
echo "Duration: $DURATION seconds" | tee -a $LOG_FILE
echo "Timestamp: $(date)" | tee -a $LOG_FILE
echo "================================" | tee -a $LOG_FILE
# Function to get PM2 stats
get_pm2_stats() {
pm2 jlist | jq -r '
.[] | select(.name == "'$APP_NAME'") |
"Instance: \(.pm_id) | Status: \(.pm2_env.status) | Memory: \(.monit.memory) | CPU: \(.monit.cpu)%"
'
}
# Function to get system stats
get_system_stats() {
echo "CPU: $(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)%"
echo "Memory: $(free -m | awk 'NR==2{printf "%.1f%%", $3*100/$2}')"
echo "Load: $(uptime | awk -F'load average:' '{print $2}')"
}
# Start monitoring
for i in $(seq 1 $((DURATION/10))); do
echo "=== Measurement $i ($(date)) ===" | tee -a $LOG_FILE
get_pm2_stats | tee -a $LOG_FILE
get_system_stats | tee -a $LOG_FILE
echo "" | tee -a $LOG_FILE
sleep 10
done
echo "Performance monitoring completed. Results saved to $LOG_FILE"
graph TB
  A[Start Load Test] --> B[Single Instance Test]
  B --> C[Record Metrics]
  C --> D[Cluster Mode Test]
  D --> E[Record Metrics]
  E --> F[Compare Results]
  F --> G{Performance Improved?}
  G -->|Yes| H[Deploy Cluster Config]
  G -->|No| I[Analyze Bottlenecks]
  I --> J[CPU Bound?]
  I --> K[Memory Bound?]
  I --> L[I/O Bound?]
  J --> M[Increase Instances]
  K --> N[Optimize Memory Usage]
  L --> O[Optimize Database/Network]
  M --> P[Re-test]
  N --> P
  O --> P
  P --> F
Resource Limits and Health Monitoring
Setting appropriate resource limits and implementing health checks ensures your applications remain stable under load:
Comprehensive Health Monitoring
// ecosystem.config.js - Health monitoring
module.exports = {
apps: [{
name: 'health-monitored-app',
script: 'app.js',
instances: 'max',
exec_mode: 'cluster',
// Resource limits
max_memory_restart: '1G',
max_restarts: 5,
min_uptime: '30s',
restart_delay: 4000,
// Health monitoring: open-source PM2 has no built-in URL health checker,
// but it can gate "online" status on an explicit ready signal
wait_ready: true, // wait for process.send('ready') from the app
listen_timeout: 10000, // ms to wait for the ready signal
kill_timeout: 5000, // ms before SIGKILL during shutdown
// Advanced monitoring
pmx: true,
automation: false,
env_production: {
NODE_ENV: 'production',
HEALTH_CHECK_ENABLED: 'true',
MONITORING_INTERVAL: '10000'
}
}]
};
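URL-based health checking therefore lives outside PM2. A minimal watchdog sketch that polls the /health endpoint implemented below and asks PM2 to restart the app after repeated failures — the app name, interval, and failure threshold are assumptions:
// health-watchdog.js — external URL health checker (sketch)
const http = require('http');
const { execFile } = require('child_process');

const APP = 'health-monitored-app';
let failures = 0;

setInterval(() => {
  http.get('http://localhost:3000/health', (res) => {
    res.resume(); // drain the response so the socket is freed
    failures = res.statusCode === 200 ? 0 : failures + 1;
    maybeRestart();
  }).on('error', () => {
    failures++;
    maybeRestart();
  });
}, 30000);

function maybeRestart() {
  if (failures >= 3) {
    failures = 0;
    execFile('pm2', ['restart', APP], (err) => {
      if (err) console.error('pm2 restart failed:', err.message);
    });
  }
}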
Custom Health Check Implementation
// health-monitor.js
const express = require('express');
const app = express();
// Health check state
let healthStatus = {
status: 'healthy',
lastCheck: new Date(),
checks: {
database: 'unknown',
redis: 'unknown',
external_api: 'unknown'
},
metrics: {
uptime: 0,
memory: {},
requestCount: 0,
errorCount: 0
}
};
// Middleware to count requests and errors
app.use((req, res, next) => {
healthStatus.metrics.requestCount++;
res.on('finish', () => {
if (res.statusCode >= 400) {
healthStatus.metrics.errorCount++;
}
});
next();
});
// Comprehensive health check
app.get('/health', async (req, res) => {
const startTime = Date.now();
try {
// Update metrics
healthStatus.metrics.uptime = process.uptime();
healthStatus.metrics.memory = process.memoryUsage();
healthStatus.lastCheck = new Date();
// Check database connection
healthStatus.checks.database = await checkDatabase();
// Check Redis connection
healthStatus.checks.redis = await checkRedis();
// Check external API
healthStatus.checks.external_api = await checkExternalAPI();
// Determine overall status
const allHealthy = Object.values(healthStatus.checks).every(status => status === 'healthy');
healthStatus.status = allHealthy ? 'healthy' : 'degraded';
const responseTime = Date.now() - startTime;
res.status(healthStatus.status === 'healthy' ? 200 : 503).json({
...healthStatus,
responseTime: responseTime + 'ms',
processId: process.pid,
nodeVersion: process.version
});
} catch (error) {
healthStatus.status = 'unhealthy';
res.status(503).json({
status: 'unhealthy',
error: error.message,
processId: process.pid,
timestamp: new Date()
});
}
});
// Individual health check functions
async function checkDatabase() {
try {
// Implement your database check here
// Example: await db.ping();
return 'healthy';
} catch (error) {
console.error('Database health check failed:', error);
return 'unhealthy';
}
}
async function checkRedis() {
try {
// Implement your Redis check here
// Example: await redis.ping();
return 'healthy';
} catch (error) {
console.error('Redis health check failed:', error);
return 'unhealthy';
}
}
async function checkExternalAPI() {
try {
// Implement external API check here
// Example: await fetch('https://api.example.com/health');
return 'healthy';
} catch (error) {
console.error('External API health check failed:', error);
return 'unhealthy';
}
}
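When filling in these stubs, bound each probe with a deadline so a hung dependency can’t stall the whole /health response. One way, using Promise.race (a sketch):
// Bound any health probe to a deadline so /health itself stays responsive
function withTimeout(probe, ms = 2000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`probe timed out after ${ms}ms`)), ms);
  });
  return Promise.race([probe(), timeout]).finally(() => clearTimeout(timer));
}

// Example usage inside checkDatabase():
// await withTimeout(() => db.ping()); // counts as 'unhealthy' if slower than 2s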
// Graceful shutdown handling
process.on('SIGTERM', () => {
console.log('Received SIGTERM, shutting down gracefully');
healthStatus.status = 'shutting_down';
server.close(() => {
console.log('Process terminated');
process.exit(0);
});
});
const port = process.env.PORT || 3000;
const server = app.listen(port, () => {
console.log(`Health monitored server running on port ${port} (PID: ${process.pid})`);
});
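If you adopt the wait_ready option shown in the ecosystem file above, the app also needs to announce readiness. Appended to the server setup, that looks like:
// Pair with wait_ready: true in ecosystem.config.js — tell PM2 the worker
// is actually accepting connections before it is marked online
server.on('listening', () => {
  if (process.send) process.send('ready');
});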
What’s Next?
You now have comprehensive knowledge of PM2 clustering and performance optimization. You can leverage multiple CPU cores, implement load balancing, manage memory effectively, and monitor application health. In the next part of this series, we’ll explore production-ready deployment with systemd integration:
- Deep dive into systemd service management
- PM2 startup and auto-restart configuration
- Security hardening and user management
- Production deployment best practices
- Service monitoring and maintenance
The performance optimization techniques you’ve learned will be essential for the production deployment strategies we’ll cover next.
Series Navigation:
← Part 2: Configuration Mastery
→ Part 3: Clustering and Performance (You are here)
→ Part 4: Production Systemd Integration (Coming next)