Mobile-First Development: Creating Android and iOS Apps with MediaPipe Framework

Mobile devices have become the primary platform for computer vision applications, from social media filters to fitness tracking apps. MediaPipe’s mobile implementations for Android and iOS offer hardware-accelerated performance, battery-conscious processing, and tight integration with platform-specific features. Whether you’re building the next viral camera app or developing enterprise solutions, mastering mobile MediaPipe development is essential for reaching users where they spend most of their digital time.

Mobile-First Computer Vision Strategy

Mobile development with MediaPipe requires a different approach than web or desktop applications. Battery life, thermal management, varied hardware capabilities, and platform-specific user expectations all play crucial roles in creating successful mobile computer vision apps.
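Hardware capability varies widely across Android devices, so it helps to decide up front whether the GPU path is even available before committing to a heavy graph. Below is a minimal Kotlin sketch (the helper name and fallback policy are illustrative, not part of MediaPipe) that checks the reported OpenGL ES version, since MediaPipe’s GPU inference path generally expects ES 3.1 or newer; on older hardware you can fall back to a CPU graph or a lighter model.

// DeviceCapability.kt
import android.app.ActivityManager
import android.content.Context

// Returns true when the device reports OpenGL ES 3.1 or newer, the level
// MediaPipe's GPU inference path generally expects. Callers can fall back
// to a CPU graph or a lighter model when this returns false.
fun supportsGpuGraphs(context: Context): Boolean {
    val activityManager =
        context.getSystemService(Context.ACTIVITY_SERVICE) as ActivityManager
    // reqGlEsVersion packs major/minor into one int, e.g. 0x00030001 == ES 3.1.
    val glVersion = activityManager.deviceConfigurationInfo.reqGlEsVersion
    return glVersion >= 0x00030001
}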

flowchart TD
    A[Mobile MediaPipe Strategy] --> B[Platform Selection]
    
    B --> C[Android Development]
    B --> D[iOS Development]
    B --> E[Cross-Platform Solutions]
    
    C --> F[Java/Kotlin Implementation]
    C --> G[Android Camera2 API]
    C --> H[GPU Acceleration]
    
    D --> I[Swift/Objective-C]
    D --> J[AVFoundation Framework]
    D --> K[Metal Performance]
    
    E --> L[Flutter Integration]
    E --> M[React Native Bridge]
    E --> N[Unity Plugin]
    
    F --> O[Google Play Store]
    I --> P[App Store]
    L --> Q[Multi-Platform Release]
    
    R[Mobile Optimization] --> S[Battery Efficiency]
    R --> T[Thermal Management]
    R --> U[Memory Usage]
    R --> V[Frame Rate Control]
    
    style A fill:#e3f2fd
    style O fill:#e8f5e8
    style P fill:#e8f5e8
    style Q fill:#e8f5e8
    style R fill:#fff3e0

Android Development Setup

Let’s start by setting up MediaPipe for Android using the graph-based FrameProcessor API, with CameraX handling capture and the graph’s output rendered to a SurfaceView.

// MainActivity.kt
class MainActivity : AppCompatActivity() {
    companion object {
        private const val TAG = "MediaPipeApp"
        private const val CAMERA_PERMISSION_CODE = 100
    }
    
    private lateinit var processor: FrameProcessor
    private lateinit var eglManager: EglManager
    private lateinit var converter: ExternalTextureConverter
    private lateinit var cameraHelper: CameraXPreviewHelper
    private lateinit var previewDisplayView: SurfaceView
    
    private var currentSolution = "hand_tracking_mobile_gpu.binarypb"
    
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        
        // Let MediaPipe load graph assets (.binarypb, .tflite) from the APK.
        // Assumes the MediaPipe JNI libraries are loaded at startup
        // (System.loadLibrary in a companion object init block).
        AndroidAssetUtil.initializeNativeAssetManager(this)
        
        // Create the shared GL context and the graph runner before the surface
        // callbacks registered in initializeUI() need them. The stream names
        // below match the standard MediaPipe mobile graphs; adjust them if
        // your graph uses different ones.
        eglManager = EglManager(null)
        processor = FrameProcessor(
            this,
            eglManager.nativeContext,
            currentSolution,
            "input_video",
            "output_video"
        )
        
        initializeUI()
        checkCameraPermission()
    }
    
    private fun initializeUI() {
        previewDisplayView = findViewById(R.id.preview_display_view)
        setupSolutionButtons()
        
        previewDisplayView.holder.addCallback(object : SurfaceHolder.Callback {
            override fun surfaceCreated(holder: SurfaceHolder) {
                processor.videoSurfaceOutput.setSurface(holder.surface)
            }
            override fun surfaceChanged(holder: SurfaceHolder, format: Int, width: Int, height: Int) {
                // Handle surface changes
            }
            override fun surfaceDestroyed(holder: SurfaceHolder) {
                processor.videoSurfaceOutput.setSurface(null)
            }
        })
    }
    
    private fun setupSolutionButtons() {
        // Button IDs and graph asset names are placeholders; wire them to the
        // views and .binarypb files bundled with your own app. Switching graphs
        // at runtime also requires recreating the FrameProcessor.
        findViewById<Button>(R.id.hands_button).setOnClickListener {
            currentSolution = "hand_tracking_mobile_gpu.binarypb"
        }
        findViewById<Button>(R.id.face_button).setOnClickListener {
            currentSolution = "face_mesh_mobile_gpu.binarypb"
        }
        findViewById<Button>(R.id.pose_button).setOnClickListener {
            currentSolution = "pose_tracking_gpu.binarypb"
        }
    }
    
    private fun checkCameraPermission() {
        if (ContextCompat.checkSelfPermission(this, Manifest.permission.CAMERA)
            != PackageManager.PERMISSION_GRANTED
        ) {
            ActivityCompat.requestPermissions(
                this,
                arrayOf(Manifest.permission.CAMERA),
                CAMERA_PERMISSION_CODE
            )
        }
    }
}
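The converter and camera helper declared above still need to be tied into the activity lifecycle. The sketch below follows the pattern used in MediaPipe’s Android example apps; it belongs inside MainActivity, and the preview-size handling is simplified (a production app would compute the optimal display size from the camera’s reported preview size).

    override fun onResume() {
        super.onResume()
        // Route camera frames through the OpenGL converter into the graph.
        converter = ExternalTextureConverter(eglManager.context)
        converter.setConsumer(processor)
        
        cameraHelper = CameraXPreviewHelper()
        cameraHelper.setOnCameraStartedListener { surfaceTexture ->
            surfaceTexture?.let {
                // Simplified: uses the raw view size as the texture size.
                converter.setSurfaceTextureAndAttachToGLContext(
                    it, previewDisplayView.width, previewDisplayView.height
                )
            }
        }
        cameraHelper.startCamera(this, CameraHelper.CameraFacing.FRONT, null)
    }
    
    override fun onPause() {
        super.onPause()
        // Release the GL resources held by the converter while backgrounded.
        converter.close()
    }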

iOS Development Implementation

Now let’s build the iOS version in Swift with the MediaPipe Tasks Vision API, using live-stream detection and a lightweight landmark overlay.

// ViewController.swift
import UIKit
import AVFoundation
import MediaPipeTasksVision

class ViewController: UIViewController {
    
    private var handLandmarker: HandLandmarker?
    private var faceLandmarker: FaceLandmarker?
    private var poseLandmarker: PoseLandmarker?
    
    private var captureSession: AVCaptureSession!
    private var videoDataOutput: AVCaptureVideoDataOutput!
    private var previewLayer: AVCaptureVideoPreviewLayer!
    
    @IBOutlet weak var previewView: UIView!
    @IBOutlet weak var overlayView: UIView!
    @IBOutlet weak var solutionControl: UISegmentedControl!
    
    private var currentSolution: Solution = .hands
    private var frameCount = 0
    private var lastTimestamp = CACurrentMediaTime()
    
    enum Solution: Int {
        case hands = 0
        case face = 1
        case pose = 2
    }
    
    override func viewDidLoad() {
        super.viewDidLoad()
        setupUI()
        initializeMediaPipe()
        setupCamera()
    }
    
    private func setupUI() {
        solutionControl.addTarget(self, action: #selector(solutionChanged), for: .valueChanged)
        overlayView.backgroundColor = UIColor.clear
    }
    
    @objc private func solutionChanged(_ sender: UISegmentedControl) {
        guard let newSolution = Solution(rawValue: sender.selectedSegmentIndex) else { return }
        currentSolution = newSolution
        clearOverlay()
    }
    
    private func initializeMediaPipe() {
        initializeHandLandmarker()
        initializeFaceLandmarker()
        initializePoseLandmarker()
    }
    
    private func initializeHandLandmarker() {
        guard let modelPath = Bundle.main.path(forResource: "hand_landmarker", ofType: "task") else {
            return
        }
        
        let options = HandLandmarkerOptions()
        options.baseOptions.modelAssetPath = modelPath
        options.runningMode = .liveStream
        options.numHands = 2
        options.minHandDetectionConfidence = 0.7
        options.handLandmarkerLiveStreamDelegate = self
        
        do {
            handLandmarker = try HandLandmarker(options: options)
        } catch {
            print("Failed to create HandLandmarker: \(error)")
        }
    }
    
    private func initializeFaceLandmarker() {
        guard let modelPath = Bundle.main.path(forResource: "face_landmarker", ofType: "task") else {
            return
        }
        
        let options = FaceLandmarkerOptions()
        options.baseOptions.modelAssetPath = modelPath
        options.runningMode = .liveStream
        options.faceLandmarkerLiveStreamDelegate = self
        
        do {
            faceLandmarker = try FaceLandmarker(options: options)
        } catch {
            print("Failed to create FaceLandmarker: \(error)")
        }
    }
    
    private func initializePoseLandmarker() {
        guard let modelPath = Bundle.main.path(forResource: "pose_landmarker", ofType: "task") else {
            return
        }
        
        let options = PoseLandmarkerOptions()
        options.baseOptions.modelAssetPath = modelPath
        options.runningMode = .liveStream
        options.poseLandmarkerLiveStreamDelegate = self
        
        do {
            poseLandmarker = try PoseLandmarker(options: options)
        } catch {
            print("Failed to create PoseLandmarker: \(error)")
        }
    }
    
    private func setupCamera() {
        captureSession = AVCaptureSession()
        captureSession.sessionPreset = .high
        
        guard let videoCaptureDevice = AVCaptureDevice.default(.builtInWideAngleCamera, 
                                                               for: .video, position: .front) else { return }
        
        do {
            let videoInput = try AVCaptureDeviceInput(device: videoCaptureDevice)
            if captureSession.canAddInput(videoInput) {
                captureSession.addInput(videoInput)
            }
        } catch {
            print("Error creating video input: \(error)")
            return
        }
        
        videoDataOutput = AVCaptureVideoDataOutput()
        videoDataOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "videoQueue"))
        
        if captureSession.canAddOutput(videoDataOutput) {
            captureSession.addOutput(videoDataOutput)
        }
        
        previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
        previewLayer.frame = previewView.bounds
        previewLayer.videoGravity = .resizeAspectFill
        previewView.layer.addSublayer(previewLayer)
    }
    
    override func viewDidAppear(_ animated: Bool) {
        super.viewDidAppear(animated)
        DispatchQueue.global(qos: .userInitiated).async {
            self.captureSession.startRunning()
        }
    }
    
    private func clearOverlay() {
        DispatchQueue.main.async {
            self.overlayView.layer.sublayers?.removeAll()
        }
    }
    
    private func drawLandmarks(_ landmarks: [NormalizedLandmark]) {
        DispatchQueue.main.async {
            self.clearOverlay()
            
            let path = UIBezierPath()
            for landmark in landmarks {
                let point = CGPoint(
                    x: CGFloat(landmark.x) * self.overlayView.frame.width,
                    y: CGFloat(landmark.y) * self.overlayView.frame.height
                )
                path.move(to: point)
                path.addArc(withCenter: point, radius: 3, startAngle: 0, endAngle: .pi * 2, clockwise: true)
            }
            
            let shapeLayer = CAShapeLayer()
            shapeLayer.path = path.cgPath
            shapeLayer.fillColor = UIColor.red.cgColor
            
            self.overlayView.layer.addSublayer(shapeLayer)
        }
    }
}

extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, 
                      from connection: AVCaptureConnection) {
        
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        
        let mpImage = try? MPImage(pixelBuffer: pixelBuffer)
        guard let image = mpImage else { return }
        
        let timestamp = Int(CMTimeGetSeconds(CMSampleBufferGetPresentationTimeStamp(sampleBuffer)) * 1000)
        
        switch currentSolution {
        case .hands:
            try? handLandmarker?.detectAsync(image: image, timestampInMilliseconds: timestamp)
        case .face:
            try? faceLandmarker?.detectAsync(image: image, timestampInMilliseconds: timestamp)
        case .pose:
            try? poseLandmarker?.detectAsync(image: image, timestampInMilliseconds: timestamp)
        }
    }
}

extension ViewController: HandLandmarkerLiveStreamDelegate {
    func handLandmarker(_ handLandmarker: HandLandmarker, didFinishDetection result: HandLandmarkerResult?, 
                       timestampInMilliseconds: Int, error: Error?) {
        guard let result = result, !result.landmarks.isEmpty else { return }
        
        if let landmarks = result.landmarks.first {
            drawLandmarks(landmarks)
        }
    }
}

extension ViewController: FaceLandmarkerLiveStreamDelegate {
    func faceLandmarker(_ faceLandmarker: FaceLandmarker, didFinishDetection result: FaceLandmarkerResult?, 
                       timestampInMilliseconds: Int, error: Error?) {
        guard let result = result, !result.faceLandmarks.isEmpty else { return }
        
        if let landmarks = result.faceLandmarks.first {
            drawLandmarks(landmarks)
        }
    }
}

extension ViewController: PoseLandmarkerLiveStreamDelegate {
    func poseLandmarker(_ poseLandmarker: PoseLandmarker, didFinishDetection result: PoseLandmarkerResult?, 
                       timestampInMilliseconds: Int, error: Error?) {
        guard let result = result, !result.landmarks.isEmpty else { return }
        
        if let landmarks = result.landmarks.first {
            drawLandmarks(landmarks)
        }
    }
}

Cross-Platform Development with Flutter

For teams targeting both platforms with one codebase, Flutter is a practical option: the camera and UI stay in Dart, and frame processing is delegated to native MediaPipe code over a platform channel, as shown below.

// main.dart
import 'package:flutter/material.dart';
import 'package:camera/camera.dart';
import 'package:flutter/services.dart';

class MediaPipeApp extends StatefulWidget {
  @override
  _MediaPipeAppState createState() => _MediaPipeAppState();
}

class _MediaPipeAppState extends State<MediaPipeApp> {
  static const platform = MethodChannel('mediapipe_channel');
  
  CameraController? _controller;
  List<CameraDescription>? _cameras;
  bool _isInitialized = false;
  String _currentSolution = 'hands';
  
  @override
  void initState() {
    super.initState();
    _initializeApp();
  }
  
  Future<void> _initializeApp() async {
    await _initializeCamera();
    await _initializeMediaPipe();
  }
  
  Future<void> _initializeCamera() async {
    _cameras = await availableCameras();
    if (_cameras != null && _cameras!.isNotEmpty) {
      // Prefer the front camera; fall back to the first available one.
      final frontCamera = _cameras!.firstWhere(
        (camera) => camera.lensDirection == CameraLensDirection.front,
        orElse: () => _cameras!.first,
      );
      _controller = CameraController(
        frontCamera,
        ResolutionPreset.medium,
        enableAudio: false,
      );
      
      await _controller!.initialize();
      setState(() {
        _isInitialized = true;
      });
      
      _controller!.startImageStream(_processCameraImage);
    }
  }
  
  Future<void> _initializeMediaPipe() async {
    try {
      await platform.invokeMethod('initialize');
    } on PlatformException catch (e) {
      print("Failed to initialize: ${e.message}");
    }
  }
  
  void _processCameraImage(CameraImage image) async {
    if (!_isInitialized) return;
    
    try {
      final result = await platform.invokeMethod('processImage', {
        'solution': _currentSolution,
        'width': image.width,
        'height': image.height,
        'format': image.format.group.name,
        // Send the raw plane bytes so the native side can rebuild the frame.
        'planes': image.planes.map((plane) => plane.bytes).toList(),
      });
      
      if (result != null) {
        _handleResults(result);
      }
    } catch (e) {
      print("Processing error: $e");
    }
  }
  
  void _handleResults(dynamic results) {
    setState(() {
      // Update UI with results
    });
  }
  
  Future<void> _switchSolution(String solution) async {
    try {
      await platform.invokeMethod('switchSolution', {'solution': solution});
      setState(() {
        _currentSolution = solution;
      });
    } catch (e) {
      print("Switch error: $e");
    }
  }
  
  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: Text('MediaPipe Mobile'),
        backgroundColor: Colors.blue,
      ),
      body: Column(
        children: [
          Expanded(
            child: _isInitialized
                ? CameraPreview(_controller!)
                : Center(child: CircularProgressIndicator()),
          ),
          _buildControlPanel(),
        ],
      ),
    );
  }
  
  Widget _buildControlPanel() {
    return Container(
      padding: EdgeInsets.all(16),
      child: Row(
        mainAxisAlignment: MainAxisAlignment.spaceEvenly,
        children: [
          _buildButton('Hands', 'hands'),
          _buildButton('Face', 'face'),
          _buildButton('Pose', 'pose'),
        ],
      ),
    );
  }
  
  Widget _buildButton(String title, String solution) {
    final isActive = _currentSolution == solution;
    return ElevatedButton(
      onPressed: () => _switchSolution(solution),
      style: ElevatedButton.styleFrom(
        backgroundColor: isActive ? Colors.blue : Colors.grey,
        foregroundColor: Colors.white,
      ),
      child: Text(title),
    );
  }
  
  @override
  void dispose() {
    _controller?.dispose();
    super.dispose();
  }
}

void main() async {
  WidgetsFlutterBinding.ensureInitialized();
  runApp(MaterialApp(home: MediaPipeApp()));
}
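The Dart code above only works once native handlers exist for the mediapipe_channel methods it invokes (initialize, processImage, switchSolution). Here is a minimal Android-side sketch of that handler; the channel and method names come from main.dart above, while the actual MediaPipe calls are left as placeholder comments.

// MainActivity.kt (Flutter Android host)
import io.flutter.embedding.android.FlutterActivity
import io.flutter.embedding.engine.FlutterEngine
import io.flutter.plugin.common.MethodChannel

class MainActivity : FlutterActivity() {
    private val channelName = "mediapipe_channel" // must match main.dart
    
    override fun configureFlutterEngine(flutterEngine: FlutterEngine) {
        super.configureFlutterEngine(flutterEngine)
        MethodChannel(flutterEngine.dartExecutor.binaryMessenger, channelName)
            .setMethodCallHandler { call, result ->
                when (call.method) {
                    "initialize" -> {
                        // Create the MediaPipe landmarkers here.
                        result.success(null)
                    }
                    "switchSolution" -> {
                        val solution = call.argument<String>("solution")
                        // Swap the active solution (hands / face / pose).
                        result.success(solution)
                    }
                    "processImage" -> {
                        // Rebuild the frame from the plane bytes sent by Dart,
                        // run the active landmarker, and return its landmarks.
                        result.success(emptyList<Map<String, Any>>())
                    }
                    else -> result.notImplemented()
                }
            }
    }
}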

Mobile Performance Optimization

Mobile devices require specific optimization strategies to maintain smooth performance while preserving battery life. The main levers are listed below; a thermal-throttling sketch follows the list.

  • Frame Rate Management: Adaptive frame rate based on device capabilities
  • Resolution Scaling: Dynamic resolution adjustment for performance
  • Model Selection: Choose appropriate model complexity for device
  • Memory Management: Efficient buffer reuse and garbage collection
  • Thermal Throttling: Reduce processing when device overheats
  • Battery Optimization: Smart background processing limitations
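Thermal management and frame-rate control work well together. The sketch below defines a hypothetical ThermalFrameRateGovernor (not a MediaPipe API, and the specific frame-rate values are arbitrary) that listens to Android’s thermal status on API 29+ and tells the camera pipeline how often to run inference.

// ThermalFrameRateGovernor.kt
import android.content.Context
import android.os.Build
import android.os.PowerManager

class ThermalFrameRateGovernor(context: Context) {
    // Target inference rate; the capture pipeline skips frames to honor it.
    @Volatile var targetFps: Int = 30
        private set
    
    init {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.Q) {
            val powerManager =
                context.getSystemService(Context.POWER_SERVICE) as PowerManager
            powerManager.addThermalStatusListener { status ->
                targetFps = when {
                    status >= PowerManager.THERMAL_STATUS_SEVERE -> 10   // throttle hard
                    status >= PowerManager.THERMAL_STATUS_MODERATE -> 20 // back off
                    else -> 30                                           // full speed
                }
            }
        }
    }
    
    // True if the frame arriving at timestampMs should be processed.
    fun shouldProcess(timestampMs: Long, lastProcessedMs: Long): Boolean =
        timestampMs - lastProcessedMs >= 1000L / targetFps
}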

App Store Deployment

Android (Google Play)

  • App Bundle optimization
  • 64-bit architecture support (see the Gradle sketch after this list)
  • Privacy policy requirements
  • Target API level compliance
  • Performance testing on multiple devices
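For the first two items the relevant switches live in the app module’s Gradle configuration. A build.gradle.kts sketch with illustrative values:

// app/build.gradle.kts (excerpt)
android {
    defaultConfig {
        ndk {
            // Google Play requires 64-bit native libraries; keep the 32-bit
            // ABI only if you still support older devices.
            abiFilters += listOf("arm64-v8a", "armeabi-v7a")
        }
    }
    bundle {
        // Let Play deliver only the resources and native code each device needs.
        abi { enableSplit = true }
        density { enableSplit = true }
        language { enableSplit = true }
    }
}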

iOS (App Store)

  • App Review Guidelines adherence
  • Privacy nutrition labels
  • iOS compatibility testing
  • Metal performance optimization
  • TestFlight beta distribution

Monetization Strategies

Consider these proven approaches for monetizing MediaPipe mobile applications:

  • Freemium Model: Basic features free, premium effects require payment
  • In-App Purchases: Additional filters, backgrounds, or advanced features
  • Subscription Plans: Monthly/yearly access to premium capabilities
  • Enterprise Licensing: White-label solutions for business customers
  • Rewarded Advertising: Video ads unlock premium features temporarily

“Mobile-first computer vision is not just about adapting desktop solutions – it’s about reimagining what’s possible when AI meets the intimacy and ubiquity of smartphones.”

Mobile AI Development Report 2025

What’s Next: Advanced Topics and Custom Models

You’ve now mastered mobile MediaPipe development for both Android and iOS! In our next tutorial, we’ll explore advanced topics including custom model training, enterprise deployment strategies, and extending MediaPipe for specialized use cases.

Ready to launch your mobile computer vision app? Download our complete mobile development toolkit with platform-specific optimization guides, deployment checklists, and monetization templates.


This is Part 8 of our comprehensive MediaPipe series. Coming next: Advanced MediaPipe techniques, custom model training, and enterprise-scale deployment strategies!
