The web has become the ultimate platform for reaching users across all devices, and computer vision is no exception. MediaPipe’s JavaScript implementation brings powerful AI capabilities directly to browsers, eliminating the need for app downloads or installations. From interactive websites that respond to gestures to progressive web apps with advanced camera features, browser-based computer vision opens up incredible possibilities for developers who want to reach the widest possible audience with cutting-edge technology.
Why MediaPipe on the Web Matters
Web-based computer vision offers unique advantages over native applications: instant accessibility, cross-platform compatibility, easy updates, and the ability to leverage the web’s vast ecosystem. MediaPipe’s JavaScript implementation makes these benefits available without sacrificing performance or functionality.
flowchart TD
    A[User Browser] --> B[MediaPipe JavaScript]
    B --> C[WebGL Acceleration]
    C --> D[Computer Vision Processing]
    D --> E[Hand Tracking]
    D --> F[Face Detection]
    D --> G[Pose Estimation]
    D --> H[Selfie Segmentation]
    I[Web Technologies] --> J[HTML5 Canvas]
    I --> K[WebRTC Camera Access]
    I --> L[Web Workers]
    I --> M[Progressive Web Apps]
    E --> N[Interactive Web Apps]
    F --> N
    G --> N
    H --> N
    J --> O[Cross-Platform Deployment]
    K --> O
    L --> O
    M --> O
    N --> P[No Installation Required]
    O --> Q[Instant Updates]
    P --> R[Maximum Reach]
    Q --> R
    style A fill:#e3f2fd
    style R fill:#e8f5e8
    style D fill:#fff3e0
    style I fill:#f3e5f5
Setting Up MediaPipe for Web Development
Let’s start by creating a robust foundation for web-based computer vision applications with proper setup and optimization techniques.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>MediaPipe Web Computer Vision</title>
<style>
body {
margin: 0;
padding: 20px;
font-family: Arial, sans-serif;
background: #1a1a1a;
color: white;
}
.container {
max-width: 1200px;
margin: 0 auto;
text-align: center;
}
.video-container {
position: relative;
display: inline-block;
margin: 20px;
}
video, canvas {
max-width: 100%;
height: auto;
border-radius: 10px;
box-shadow: 0 4px 20px rgba(0,0,0,0.3);
}
.controls {
margin: 20px 0;
}
.btn {
background: #4CAF50;
color: white;
border: none;
padding: 12px 24px;
margin: 5px;
border-radius: 5px;
cursor: pointer;
font-size: 16px;
transition: background 0.3s;
}
.btn:hover {
background: #45a049;
}
.btn.active {
background: #FF6B6B;
}
.stats {
background: rgba(255,255,255,0.1);
padding: 15px;
border-radius: 10px;
margin: 20px 0;
text-align: left;
}
</style>
</head>
<body>
<div class="container">
<h1>MediaPipe Web Computer Vision Demo</h1>
<div class="video-container">
<video id="inputVideo" autoplay muted playsinline></video>
<canvas id="outputCanvas"></canvas>
</div>
<div class="controls">
<button class="btn" onclick="toggleSolution('hands')">Hand Tracking</button>
<button class="btn" onclick="toggleSolution('face')">Face Detection</button>
<button class="btn" onclick="toggleSolution('pose')">Pose Estimation</button>
<button class="btn" onclick="toggleSolution('selfie')">Background Effects</button>
</div>
<div class="stats" id="stats">
<div>FPS: <span id="fps">0</span></div>
<div>Active Solution: <span id="activeSolution">None</span></div>
<div>Landmarks Detected: <span id="landmarkCount">0</span></div>
</div>
</div>
<!-- MediaPipe JavaScript Libraries -->
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/camera_utils.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/control_utils/control_utils.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/drawing_utils/drawing_utils.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/hands/hands.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/face_detection/face_detection.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/pose/pose.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/selfie_segmentation/selfie_segmentation.js"></script>
<!-- Application code from the sections below goes here, inline or as a bundled script -->
</body>
</html>
Building a Multi-Solution Web Application
Let’s create a comprehensive web application that can switch between different MediaPipe solutions dynamically, showcasing the full power of browser-based computer vision.
class MediaPipeWebApp {
constructor() {
this.videoElement = document.getElementById('inputVideo');
this.canvasElement = document.getElementById('outputCanvas');
this.canvasCtx = this.canvasElement.getContext('2d');
this.currentSolution = null;
this.solutions = {};
this.camera = null;
// Performance monitoring
this.frameCount = 0;
this.lastTime = performance.now();
this.fps = 0;
this.initializeSolutions();
this.setupCamera();
}
initializeSolutions() {
// Initialize Hands solution
this.solutions.hands = new Hands({
locateFile: (file) => {
return `https://cdn.jsdelivr.net/npm/@mediapipe/hands/${file}`;
}
});
this.solutions.hands.setOptions({
maxNumHands: 2,
modelComplexity: 1,
minDetectionConfidence: 0.7,
minTrackingConfidence: 0.5
});
this.solutions.hands.onResults(this.onHandsResults.bind(this));
// Initialize Face Detection solution
this.solutions.face = new FaceDetection({
locateFile: (file) => {
return `https://cdn.jsdelivr.net/npm/@mediapipe/face_detection/${file}`;
}
});
this.solutions.face.setOptions({
model: 'short',
minDetectionConfidence: 0.7
});
this.solutions.face.onResults(this.onFaceResults.bind(this));
// Initialize Pose solution
this.solutions.pose = new Pose({
locateFile: (file) => {
return `https://cdn.jsdelivr.net/npm/@mediapipe/pose/${file}`;
}
});
this.solutions.pose.setOptions({
modelComplexity: 1,
smoothLandmarks: true,
enableSegmentation: false,
smoothSegmentation: true,
minDetectionConfidence: 0.5,
minTrackingConfidence: 0.5
});
this.solutions.pose.onResults(this.onPoseResults.bind(this));
// Initialize Selfie Segmentation solution
this.solutions.selfie = new SelfieSegmentation({
locateFile: (file) => {
return `https://cdn.jsdelivr.net/npm/@mediapipe/selfie_segmentation/${file}`;
}
});
this.solutions.selfie.setOptions({
modelSelection: 1
});
this.solutions.selfie.onResults(this.onSelfieResults.bind(this));
}
setupCamera() {
this.camera = new Camera(this.videoElement, {
onFrame: async () => {
if (this.currentSolution && this.solutions[this.currentSolution]) {
await this.solutions[this.currentSolution].send({
image: this.videoElement
});
}
this.updateFPS();
},
width: 1280,
height: 720
});
}
onHandsResults(results) {
this.clearCanvas();
this.drawImage(results.image);
let landmarkCount = 0;
if (results.multiHandLandmarks) {
for (const landmarks of results.multiHandLandmarks) {
drawConnectors(this.canvasCtx, landmarks, HAND_CONNECTIONS, {
color: '#00FF00', lineWidth: 2
});
drawLandmarks(this.canvasCtx, landmarks, {
color: '#FF0000', lineWidth: 1, radius: 3
});
landmarkCount += landmarks.length;
}
}
this.updateStats('Hand Tracking', landmarkCount);
}
onFaceResults(results) {
this.clearCanvas();
this.drawImage(results.image);
let landmarkCount = 0;
if (results.detections) {
for (const detection of results.detections) {
drawRectangle(this.canvasCtx, detection.boundingBox, {
color: 'red', lineWidth: 4, fillColor: 'rgba(255,0,0,0.1)'
});
if (detection.landmarks) {
drawLandmarks(this.canvasCtx, detection.landmarks, {
color: '#FFD700', radius: 5
});
landmarkCount += detection.landmarks.length;
}
}
}
this.updateStats('Face Detection', landmarkCount);
}
onPoseResults(results) {
this.clearCanvas();
this.drawImage(results.image);
let landmarkCount = 0;
if (results.poseLandmarks) {
drawConnectors(this.canvasCtx, results.poseLandmarks, POSE_CONNECTIONS, {
color: '#00CCFF', lineWidth: 4
});
drawLandmarks(this.canvasCtx, results.poseLandmarks, {
color: '#FF00FF', lineWidth: 2, radius: 6
});
landmarkCount = results.poseLandmarks.length;
}
this.updateStats('Pose Estimation', landmarkCount);
}
onSelfieResults(results) {
// Composite order matters: draw the mask, keep only the person pixels
// from the frame ('source-in'), then fill everything behind the person
// with a blurred copy of the frame ('destination-over'). Drawing the
// sharp frame last with 'source-over' would undo the blur entirely.
this.canvasElement.width = results.image.width;
this.canvasElement.height = results.image.height;
this.canvasCtx.save();
this.canvasCtx.clearRect(0, 0, this.canvasElement.width, this.canvasElement.height);
// Segmentation mask defines where the person is
this.canvasCtx.drawImage(results.segmentationMask, 0, 0,
this.canvasElement.width, this.canvasElement.height);
// Keep the sharp person pixels
this.canvasCtx.globalCompositeOperation = 'source-in';
this.canvasCtx.drawImage(results.image, 0, 0,
this.canvasElement.width, this.canvasElement.height);
// Fill the background behind the person with a blurred frame
this.canvasCtx.globalCompositeOperation = 'destination-over';
this.canvasCtx.filter = 'blur(10px)';
this.canvasCtx.drawImage(results.image, 0, 0,
this.canvasElement.width, this.canvasElement.height);
this.canvasCtx.restore();
this.updateStats('Background Segmentation', 1);
}
clearCanvas() {
this.canvasCtx.save();
this.canvasCtx.clearRect(0, 0, this.canvasElement.width, this.canvasElement.height);
this.canvasCtx.restore();
}
drawImage(image) {
this.canvasElement.width = image.width;
this.canvasElement.height = image.height;
this.canvasCtx.drawImage(image, 0, 0, image.width, image.height);
}
updateFPS() {
this.frameCount++;
const currentTime = performance.now();
if (currentTime - this.lastTime >= 1000) {
this.fps = Math.round((this.frameCount * 1000) / (currentTime - this.lastTime));
this.frameCount = 0;
this.lastTime = currentTime;
document.getElementById('fps').textContent = this.fps;
}
}
updateStats(solutionName, landmarkCount) {
document.getElementById('activeSolution').textContent = solutionName;
document.getElementById('landmarkCount').textContent = landmarkCount;
}
toggleSolution(solutionName) {
// Update UI
document.querySelectorAll('.btn').forEach(btn => {
btn.classList.remove('active');
});
if (this.currentSolution === solutionName) {
// Turn off current solution
this.currentSolution = null;
this.clearCanvas();
this.updateStats('None', 0);
} else {
// Switch to new solution; find the matching button by its onclick
// attribute instead of relying on the non-standard global `event`
// object, which is not available in all browsers
this.currentSolution = solutionName;
const btn = document.querySelector(`.btn[onclick*="'${solutionName}'"]`);
if (btn) btn.classList.add('active');
}
}
async startCamera() {
try {
await this.camera.start();
console.log('Camera started successfully');
} catch (error) {
console.error('Error starting camera:', error);
alert('Unable to access camera. Please check permissions.');
}
}
}
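Running inference on every camera frame can overwhelm low-end devices. One common mitigation, compatible with the `onFrame` callback above, is to gate how often frames are sent to the active solution. The helper below is an illustrative sketch (the name `makeFrameGate` is ours, not part of the MediaPipe API):

```javascript
// Sketch: decide whether to run inference on a given frame so that
// processing stays near `targetFps`, even when the camera delivers
// frames faster than the model can comfortably process them.
function makeFrameGate(targetFps) {
  let lastRun = -Infinity;                 // timestamp of the last processed frame
  const minInterval = 1000 / targetFps;    // minimum gap between runs, in ms
  return function shouldProcess(nowMs) {
    if (nowMs - lastRun >= minInterval) {
      lastRun = nowMs;
      return true;                         // enough time elapsed: process this frame
    }
    return false;                          // too soon: skip inference for this frame
  };
}
```

Inside `onFrame`, you would wrap the `send` call: `if (gate(performance.now())) { await this.solutions[this.currentSolution].send({ image: this.videoElement }); }`, where `gate = makeFrameGate(15)` caps inference at roughly 15 runs per second while the video keeps rendering at full rate.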
Interactive Web Features and UI Enhancements
Let’s add advanced web-specific features that take advantage of modern browser capabilities.
// Enhanced Web Features
class EnhancedWebFeatures extends MediaPipeWebApp {
constructor() {
super();
this.gestureHistory = [];
this.touchEvents = [];
this.setupAdvancedFeatures();
}
setupAdvancedFeatures() {
// Fullscreen API
this.setupFullscreenControls();
// Screen recording API
this.setupScreenRecording();
// Notification API
this.setupNotifications();
// Touch events for mobile
this.setupTouchControls();
// Keyboard shortcuts
this.setupKeyboardShortcuts();
}
setupFullscreenControls() {
const fullscreenBtn = document.createElement('button');
fullscreenBtn.className = 'btn';
fullscreenBtn.textContent = 'Fullscreen';
fullscreenBtn.onclick = this.toggleFullscreen.bind(this);
document.querySelector('.controls').appendChild(fullscreenBtn);
}
toggleFullscreen() {
if (!document.fullscreenElement) {
document.documentElement.requestFullscreen().catch(err => {
console.log(`Error attempting to enable fullscreen: ${err.message}`);
});
} else {
document.exitFullscreen();
}
}
setupScreenRecording() {
// Keep a direct reference to the button: it is created with an `onclick`
// property (not an attribute), so an attribute selector such as
// button[onclick*="toggleRecording"] would never find it
this.recordBtn = document.createElement('button');
this.recordBtn.className = 'btn';
this.recordBtn.textContent = 'Record';
this.recordBtn.onclick = this.toggleRecording.bind(this);
document.querySelector('.controls').appendChild(this.recordBtn);
this.mediaRecorder = null;
this.recordedChunks = [];
this.isRecording = false;
}
async toggleRecording() {
if (!this.isRecording) {
await this.startRecording();
} else {
this.stopRecording();
}
}
async startRecording() {
try {
const stream = this.canvasElement.captureStream(30);
this.mediaRecorder = new MediaRecorder(stream);
this.mediaRecorder.ondataavailable = (event) => {
if (event.data.size > 0) {
this.recordedChunks.push(event.data);
}
};
this.mediaRecorder.onstop = () => {
this.saveRecording();
};
this.mediaRecorder.start();
this.isRecording = true;
this.recordBtn.textContent = 'Stop Recording';
} catch (error) {
console.error('Error starting recording:', error);
}
}
stopRecording() {
if (this.mediaRecorder && this.isRecording) {
this.mediaRecorder.stop();
this.isRecording = false;
this.recordBtn.textContent = 'Record';
}
}
saveRecording() {
const blob = new Blob(this.recordedChunks, { type: 'video/webm' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = `mediapipe-recording-${Date.now()}.webm`;
a.click();
URL.revokeObjectURL(url);
this.recordedChunks = [];
}
setupNotifications() {
if ('Notification' in window) {
Notification.requestPermission();
}
}
showNotification(title, message) {
if (Notification.permission === 'granted') {
new Notification(title, {
body: message,
icon: '/favicon.ico'
});
}
}
setupTouchControls() {
let touchStartX = 0;
let touchStartY = 0;
this.canvasElement.addEventListener('touchstart', (e) => {
touchStartX = e.touches[0].clientX;
touchStartY = e.touches[0].clientY;
});
this.canvasElement.addEventListener('touchend', (e) => {
const touchEndX = e.changedTouches[0].clientX;
const touchEndY = e.changedTouches[0].clientY;
const deltaX = touchEndX - touchStartX;
const deltaY = touchEndY - touchStartY;
// Detect swipe gestures
if (Math.abs(deltaX) > 50) {
if (deltaX > 0) {
this.switchToNextSolution();
} else {
this.switchToPrevSolution();
}
}
});
}
setupKeyboardShortcuts() {
document.addEventListener('keydown', (e) => {
switch(e.key) {
case '1':
this.toggleSolution('hands');
break;
case '2':
this.toggleSolution('face');
break;
case '3':
this.toggleSolution('pose');
break;
case '4':
this.toggleSolution('selfie');
break;
case 'f':
this.toggleFullscreen();
break;
case 'r':
this.toggleRecording();
break;
case ' ':
e.preventDefault();
this.takeScreenshot();
break;
}
});
}
takeScreenshot() {
const link = document.createElement('a');
link.download = `mediapipe-screenshot-${Date.now()}.png`;
link.href = this.canvasElement.toDataURL();
link.click();
this.showNotification('Screenshot Saved', 'Your MediaPipe screenshot has been downloaded');
}
switchToNextSolution() {
const solutions = ['hands', 'face', 'pose', 'selfie'];
const currentIndex = solutions.indexOf(this.currentSolution);
const nextIndex = (currentIndex + 1) % solutions.length;
this.toggleSolution(solutions[nextIndex]);
}
switchToPrevSolution() {
const solutions = ['hands', 'face', 'pose', 'selfie'];
const currentIndex = solutions.indexOf(this.currentSolution);
const prevIndex = currentIndex <= 0 ? solutions.length - 1 : currentIndex - 1;
this.toggleSolution(solutions[prevIndex]);
}
}
// Initialize application
let app;
document.addEventListener('DOMContentLoaded', async () => {
app = new EnhancedWebFeatures();
await app.startCamera();
});
// Global function for button clicks
function toggleSolution(solutionName) {
if (app) {
app.toggleSolution(solutionName);
}
}
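Before constructing the app it is worth feature-detecting the capabilities the demo depends on: camera access via `getUserMedia` and WebGL for accelerated inference. The sketch below takes the objects it inspects as parameters (so it is easy to unit test); in the page you would call `checkSupport(navigator, document)` and show a fallback message when `supported` is false. The function name and return shape are our own convention:

```javascript
// Sketch: detect the browser capabilities this demo relies on.
// Returns { supported, missing } so the caller can explain exactly
// which feature is unavailable instead of failing silently.
function checkSupport(nav, doc) {
  const missing = [];
  // Camera access requires the mediaDevices.getUserMedia API
  if (!nav.mediaDevices || typeof nav.mediaDevices.getUserMedia !== 'function') {
    missing.push('camera (getUserMedia)');
  }
  // MediaPipe's web solutions lean on WebGL for acceleration
  const canvas = doc.createElement('canvas');
  const gl = canvas.getContext('webgl2') || canvas.getContext('webgl');
  if (!gl) {
    missing.push('WebGL');
  }
  return { supported: missing.length === 0, missing };
}
```

A natural place to use it is at the top of the `DOMContentLoaded` handler: bail out with a visible message listing `missing` features rather than letting camera setup throw.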
Progressive Web App Implementation
Transform your MediaPipe web application into a Progressive Web App for a native app-like experience.
{
"name": "MediaPipe Computer Vision",
"short_name": "MediaPipeCV",
"description": "Advanced computer vision in your browser",
"start_url": "/",
"display": "standalone",
"background_color": "#1a1a1a",
"theme_color": "#4CAF50",
"icons": [
{
"src": "/icon-192.png",
"sizes": "192x192",
"type": "image/png"
},
{
"src": "/icon-512.png",
"sizes": "512x512",
"type": "image/png"
}
],
"categories": ["utilities", "photo"]
}
Cross-Platform Considerations
Ensure your web application works seamlessly across different browsers and devices.
- Browser Compatibility: Test on Chrome, Firefox, Safari, and Edge
- Mobile Optimization: Responsive design and touch-friendly controls
- Performance Scaling: Adjust quality based on device capabilities
- Fallback Options: Graceful degradation for unsupported features
- Loading States: Clear feedback during model loading
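The "Performance Scaling" point above can be made concrete by mapping rough device signals (`navigator.deviceMemory`, `navigator.hardwareConcurrency`) to MediaPipe options and a camera resolution. The thresholds below are illustrative heuristics of ours, not values from the MediaPipe documentation:

```javascript
// Sketch: pick model complexity and camera resolution from rough device
// signals. Thresholds are heuristic assumptions; tune them against real
// devices. Defaults cover browsers that don't expose these properties.
function pickQuality({ deviceMemoryGb = 4, hardwareConcurrency = 4 } = {}) {
  // Low-end: smallest model, SD camera feed
  if (deviceMemoryGb <= 2 || hardwareConcurrency <= 2) {
    return { modelComplexity: 0, width: 640, height: 480 };
  }
  // High-end: full model at 720p
  if (deviceMemoryGb >= 8 && hardwareConcurrency >= 8) {
    return { modelComplexity: 1, width: 1280, height: 720 };
  }
  // Middle tier: full model at a reduced resolution
  return { modelComplexity: 1, width: 960, height: 540 };
}
```

In the page you would call `pickQuality({ deviceMemoryGb: navigator.deviceMemory, hardwareConcurrency: navigator.hardwareConcurrency })` and feed the result into `setOptions` and the `Camera` constructor.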
Deployment and Hosting
Static Hosting Options
- Netlify for automatic deployments
- Vercel for optimized performance
- GitHub Pages for open source projects
- Firebase Hosting for Google integration
Performance Optimization
- CDN integration for MediaPipe libraries
- Service workers for offline functionality
- Resource preloading and caching
- Bundle optimization and tree shaking
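One small but effective step for the CDN and caching points above: the demo loads unpinned `@mediapipe` packages from jsDelivr, so the served files can change between visits. Pinning an exact version gives the browser cache (and any service worker) a stable, immutable URL. A minimal sketch; the helper name is ours and the version you pass should be whichever release you actually tested:

```javascript
// Sketch: build a version-pinned jsDelivr URL for a MediaPipe package
// so cached assets never change out from under a deployed app.
function pinnedCdnUrl(pkg, version, file) {
  return `https://cdn.jsdelivr.net/npm/@mediapipe/${pkg}@${version}/${file}`;
}
```

This drops straight into the `locateFile` callbacks shown earlier, e.g. `new Hands({ locateFile: (f) => pinnedCdnUrl('hands', version, f) })` with `version` set to your tested release.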
“The web platform’s universality combined with MediaPipe’s power creates unprecedented opportunities for developers to reach billions of users with advanced computer vision capabilities.”
Web Platform Developer Survey 2025
What’s Next: Mobile App Development
You’ve now mastered building browser-based computer vision applications with MediaPipe! In our next tutorial, we’ll dive into mobile app development, exploring how to create native Android and iOS applications with MediaPipe for maximum performance and platform integration.
Ready to reach users across all platforms? Download our complete web development toolkit with PWA templates, cross-browser compatibility guides, and deployment automation scripts.
This is Part 7 of our comprehensive MediaPipe series. Coming next: Native mobile app development for Android and iOS with platform-specific optimizations!