FrameFind

On-device glasses detection for modern web apps.

Real-time face attribute detection using landmarks and a lightweight ONNX model. Runs entirely in the browser or Node.js. Zero backend required.

6.2MB Model
Sub-100ms Inference
WASM / WebGPU
import { useRef, useEffect } from 'react';
import { useGlassesDetector } from '@framefind/react';

export function App() {
  const videoRef = useRef<HTMLVideoElement>(null);
  const canvasRef = useRef<HTMLCanvasElement>(null);
  const { result, detect } = useGlassesDetector({
    modelUrl: '/models/glasses.onnx',
  });

  // Detect ~15x per second; the interval is registered once and
  // cleared on unmount (an interval created in onPlay would leak).
  useEffect(() => {
    const id = setInterval(() => {
      if (videoRef.current && canvasRef.current)
        detect(videoRef.current, canvasRef.current);
    }, 66);
    return () => clearInterval(id);
  }, [detect]);

  return (
    <div>
      <video ref={videoRef} autoPlay playsInline muted />
      <canvas ref={canvasRef} hidden />
      <p>{result?.glasses ? '🕶 Glasses' : 'No glasses'}</p>
    </div>
  );
}

The problem with existing APIs

Cloud Latency is Too High

Streaming 30 frames per second to a cloud API adds a network round trip to every frame, causing visible lag. Edge inference keeps the whole loop local, enabling sub-100ms real-time feedback.

Privacy & Compliance

Transmitting user face data to servers creates enormous GDPR/CCPA overhead. On-device inference means data never leaves the browser.

Bloated Dependencies

Traditional ML suites ship massive WASM bundles. FrameFind uses only the MediaPipe landmark indices it needs plus a specialized 6.2MB model.

How it works

A lean, optimized pipeline built on ONNX Runtime Web.

  • Input media: camera stream or image
  • FaceMesh: extract landmarks
  • ROI crop: isolate the eye region
  • ONNX inference: local execution
  • Result: smoothed probability
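The stages above can be sketched as a single closure. Everything below is a hypothetical stand-in: `extractLandmarks`, `cropEyeRegion`, and `runModel` are stubs for the real FaceMesh and ONNX calls, shown only to make the data flow concrete.

```typescript
// Hypothetical pipeline sketch; the stage implementations are stubs, not
// FrameFind internals. Only the data flow mirrors the documented stages.
type Landmarks = { x: number; y: number }[];

interface PipelineDeps {
  extractLandmarks: (rgba: Uint8ClampedArray, w: number, h: number) => Landmarks | null;
  cropEyeRegion: (rgba: Uint8ClampedArray, w: number, h: number, lm: Landmarks) => Float32Array;
  runModel: (roi: Float32Array) => number; // returns a raw logit
}

function makeDetect(deps: PipelineDeps, threshold = 0.35, windowSize = 8) {
  const history: number[] = [];
  return (rgba: Uint8ClampedArray, w: number, h: number) => {
    const lm = deps.extractLandmarks(rgba, w, h);           // FaceMesh
    if (!lm) return { glasses: false, probability: 0, faceDetected: false };
    const roi = deps.cropEyeRegion(rgba, w, h, lm);         // ROI crop
    const logit = deps.runModel(roi);                       // ONNX inference
    history.push(1 / (1 + Math.exp(-logit)));               // sigmoid
    if (history.length > windowSize) history.shift();
    const probability =                                     // temporal smoothing
      history.reduce((a, b) => a + b, 0) / history.length;
    return { glasses: probability >= threshold, probability, faceDetected: true };
  };
}
```

The smoothing here mirrors the documented default of 8 averaged frames; the real detector exposes the same knobs through its config options.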

Quick start

FrameFind is modular. Install the core SDK for raw JS environments, or the React bindings for hook-based state management.

Core Engine

  npm install @framefind/core onnxruntime-web

React Support

  npm install @framefind/react

Node.js Server

  npm install onnxruntime-node

import { useRef, useEffect } from 'react';
import { useGlassesDetector } from '@framefind/react';

export default function App() {
  const videoRef = useRef<HTMLVideoElement>(null);
  const canvasRef = useRef<HTMLCanvasElement>(null);

  const { result, loading, error, detect, reset } = useGlassesDetector({
    modelUrl: '/models/glasses.onnx',
    threshold: 0.35,
    smoothingWindow: 8,
  });

  // Attach the camera stream; stop its tracks on unmount.
  useEffect(() => {
    let stream: MediaStream | undefined;
    navigator.mediaDevices.getUserMedia({ video: true }).then((s) => {
      stream = s;
      if (videoRef.current) videoRef.current.srcObject = s;
    });
    return () => stream?.getTracks().forEach((t) => t.stop());
  }, []);

  // Run detection ~15x per second once frames are available. Creating the
  // interval in an effect (rather than onPlay) registers it exactly once
  // and clears it on unmount.
  useEffect(() => {
    const id = setInterval(() => {
      const video = videoRef.current;
      if (video && canvasRef.current && video.readyState >= 2)
        detect(video, canvasRef.current);
    }, 66);
    return () => clearInterval(id);
  }, [detect]);

  return (
    <div className="p-4">
      <video ref={videoRef} autoPlay playsInline muted />
      <canvas ref={canvasRef} style={{ display: 'none' }} />
      {loading && <p>Loading model…</p>}
      {error && <p>Error: {error.message}</p>}
      {result && (
        <div className="mt-4">
          <p>Status: {result.glasses ? 'Glasses Detected' : 'No Glasses'}</p>
          <p>Confidence: {(result.probability * 100).toFixed(1)}%</p>
        </div>
      )}
      <button onClick={reset}>Reset</button>
    </div>
  );
}

Engineered for the edge

Optimization metrics from production workloads.

  • Model Size: 6.2MB (cached after first load)
  • Inference Time: <100ms (on standard hardware)
  • Memory Footprint: ~45MB (VRAM allocation)
  • Browser Support: 98% (WASM / WebGL / WebGPU)

Architecture

User Media / Image → FrameFind Core → MediaPipe FaceMesh (WASM) → ROI Tensor Extraction → ONNX WebGPU Inference → boolean: hasGlasses

Future Roadmap

  • Glasses detection (shipped)
  • Mask detection
  • Blink / drowsiness detection
  • Head pose estimation
  • Attention tracking
  • Iris gaze estimation

Philosophy

Edge-first AI. The future of computer vision shouldn't be gated by API latency, network conditions, or cloud compute costs. By running targeted, lightweight models directly on the client, we unlock interfaces that were previously impossible.

Privacy by design. When an application requires monitoring user attention, eye contact, or facial context, sending that media feed to a remote server is fundamentally hostile to privacy. On-device inference guarantees that pixels never leave the user's hardware.

Pragmatic ML. Instead of loading a 100MB general-purpose vision model, FrameFind uses a tiered approach: an ultra-optimized landmark detector crops regions of interest and feeds them into tiny, specialized classification models.

API Reference

Complete reference for all public APIs across the three FrameFind packages.

@framefind/core

Glasses detection core: browser and Node.js
Browser · GlassesDetector
new GlassesDetector(options) · class

Options: modelUrl, wasmPaths, threshold (default 0.35), smoothingWindow (default 8).

detector.load() · method

Initialize ONNX model. Await before calling detect methods.

detector.detectFromImageData(pixels, w, h, landmarks?) · method

Run inference on raw RGBA pixel buffer.

detector.detectFromCanvas(canvas, landmarks?) · method

Run inference on an HTMLCanvasElement.

detector.detectFromVideoFrame(video, offscreenCanvas, landmarks?) · method

Run inference on a live video frame.

detector.resetHistory() · method

Clear smoothing window history.

detector.dispose() · method

Release ONNX session and free memory.

Node.js · GlassesDetectorNode
new GlassesDetectorNode(options) · class

Options: modelPath (local path), threshold, smoothingWindow.

detector.load() · method

Load model via onnxruntime-node.

detector.detectFromRgbaBuffer(pixels, faceDetected?) · method

Inference from raw RGBA Buffer.

detector.detectFromImagePath(path) · method

Load image from disk and run inference.

detector.resetHistory() · method

Clear smoothing window history.

detector.dispose() · method

Release model resources.

Return type
DetectionResult
type DetectionResult = {
  glasses: boolean;      // true if glasses detected
  probability: number;   // smoothed confidence 0–1
  faceDetected: boolean; // true if landmarks provided
};
DetectorConfig
type DetectorConfig = {
  modelUrl: string;
  threshold?: number;       // default: 0.35
  smoothingWindow?: number; // default: 8
};
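To see how the two defaults interact, consider one dropout frame inside an otherwise confident smoothing window. The numbers below are illustrative only, not taken from the package:

```typescript
// Illustrative only: seven confident frames plus one dropout frame,
// averaged over the default smoothingWindow of 8.
const perFrame = [0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.1];
const smoothed = perFrame.reduce((a, b) => a + b, 0) / perFrame.length; // ≈ 0.8
const glasses = smoothed >= 0.35; // true: one bad frame doesn't flip the result
```

A larger window suppresses flicker further, at the cost of reacting more slowly when the user actually removes their glasses.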

@framefind/react

React hooks for FrameFind glasses detection
Hooks
useGlassesDetector(options) · hook

Loads detector on mount, disposes on unmount. Returns result, loading, error, detect(), reset().

options.modelUrl · option

Required. URL to .onnx model file.

options.wasmPaths · option

Folder path for onnxruntime .wasm files.

options.threshold · option

Binary cutoff. Default 0.35.

options.smoothingWindow · option

Frames averaged. Default 8.

options.enabled · option

Pause detection without unmounting.

Hook return shape
useGlassesDetector
const {
  result,    // DetectionResult | null
  loading,   // boolean
  error,     // Error | null
  detect,    // (video, canvas, landmarks?) => void
  reset,     // () => void
} = useGlassesDetector({ modelUrl: '/model.onnx' });

@framefind/utils

Shared types, constants, and helpers
Types
DetectionResult · type

{ glasses: boolean; probability: number; faceDetected: boolean }

DetectorConfig · type

{ modelUrl: string; threshold?: number; smoothingWindow?: number }

Constants
GLASSES_MODEL_URL · const

Official CDN URL for the ONNX model.

DEFAULT_THRESHOLD · const

0.35: binary decision cutoff.

DEFAULT_SMOOTH_N · const

8: frames averaged for smoothing.

ROI_SIZE · const

112: crop size the model expects (px).

EYE_REGION_IDX · const

MediaPipe landmark indices covering the eye region.

MEAN / STD · const

[0.485, 0.456, 0.406] / [0.229, 0.224, 0.225]: ImageNet normalization.
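As a sketch of how EYE_REGION_IDX and ROI_SIZE fit together: the landmarks selected by the indices define a padded square crop, which is then resized to 112 px for the model. The helper below is hypothetical; the padding factor, clamping policy, and the name `eyeRegionBox` are assumptions, not FrameFind's actual crop logic.

```typescript
// Hypothetical ROI step: given eye landmarks in normalized [0, 1]
// coordinates, compute a padded square pixel crop clamped to the frame.
type Point = { x: number; y: number };

function eyeRegionBox(
  landmarks: Point[], imgW: number, imgH: number, pad = 0.25,
): { x: number; y: number; size: number } {
  const xs = landmarks.map((p) => p.x * imgW);
  const ys = landmarks.map((p) => p.y * imgH);
  const minX = Math.min(...xs), maxX = Math.max(...xs);
  const minY = Math.min(...ys), maxY = Math.max(...ys);
  // Square side: the larger extent, widened by the padding factor.
  const side = Math.max(maxX - minX, maxY - minY) * (1 + 2 * pad);
  const cx = (minX + maxX) / 2, cy = (minY + maxY) / 2;
  // Clamp the top-left corner so the square stays inside the frame.
  const size = Math.min(side, imgW, imgH);
  const x = Math.min(Math.max(cx - size / 2, 0), imgW - size);
  const y = Math.min(Math.max(cy - size / 2, 0), imgH - size);
  return { x, y, size };
}
```

The resulting box would then be sampled into a ROI_SIZE × ROI_SIZE canvas before tensor conversion.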

Functions
sigmoid(x: number): number · fn

Converts raw model logit to probability.

smoothAverage(history: number[]): number · fn

Averages prediction history for temporal smoothing.

imageDataToChwFloat32(pixels, width, height): Float32Array · fn

RGBA buffer → CHW Float32 tensor normalized with MEAN/STD.
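Plausible implementations consistent with the three signatures above; these are sketches for intuition, not the package's actual source:

```typescript
// Sketches of the documented utility functions (assumed implementations).
const MEAN = [0.485, 0.456, 0.406];
const STD = [0.229, 0.224, 0.225];

function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

function smoothAverage(history: number[]): number {
  if (history.length === 0) return 0;
  return history.reduce((a, b) => a + b, 0) / history.length;
}

// RGBA interleaved bytes -> planar CHW float tensor, ImageNet-normalized.
function imageDataToChwFloat32(
  pixels: Uint8ClampedArray, width: number, height: number,
): Float32Array {
  const plane = width * height;
  const out = new Float32Array(3 * plane);
  for (let i = 0; i < plane; i++) {
    for (let c = 0; c < 3; c++) {
      const v = pixels[i * 4 + c] / 255;           // drop alpha, scale to [0, 1]
      out[c * plane + i] = (v - MEAN[c]) / STD[c]; // per-channel normalization
    }
  }
  return out;
}
```

Note the channel-major layout: ONNX vision models typically expect NCHW input, so each color plane is contiguous.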