On-device glasses detection for modern web apps.
Real-time face attribute detection using landmarks and a lightweight ONNX model. Runs entirely in the browser or Node.js. Zero backend required.
```tsx
import { useRef } from 'react';
import { useGlassesDetector } from '@framefind/react';

export function App() {
  const videoRef = useRef<HTMLVideoElement>(null);
  const canvasRef = useRef<HTMLCanvasElement>(null);
  const intervalRef = useRef<number | null>(null);
  const { result, detect } = useGlassesDetector({
    modelUrl: '/models/glasses.onnx',
  });

  return (
    <div>
      <video
        ref={videoRef}
        autoPlay
        playsInline
        muted
        onPlay={() => {
          // Guard so repeated play events don't stack intervals
          if (intervalRef.current !== null) return;
          intervalRef.current = window.setInterval(() => {
            if (videoRef.current && canvasRef.current)
              detect(videoRef.current, canvasRef.current);
          }, 66); // ~15 fps
        }}
      />
      <canvas ref={canvasRef} hidden />
      <p>{result?.glasses ? '👓 Glasses' : 'No glasses'}</p>
    </div>
  );
}
```

The problem with existing APIs
Cloud Latency Is Too High
Streaming 30 frames per second to a cloud API adds a network round-trip to every frame, causing visible lag. Edge inference enables sub-100 ms real-time feedback.
Privacy & Compliance
Transmitting user face data to servers creates enormous GDPR/CCPA overhead. On-device inference means data never leaves the browser.
Bloated Dependencies
Traditional ML suites ship massive WASM bundles. FrameFind uses targeted MediaPipe landmark indices and a specialized 6.2 MB model.
How it works
A lean, optimized pipeline built on ONNX Runtime Web.
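As a concrete illustration of the middle of that pipeline, here is a sketch of the preprocessing step under the constants listed in `@framefind/utils` (112 px eye-region crop, ImageNet mean/std); the function name is illustrative, not a library export.

```typescript
// Illustrative sketch of FrameFind-style preprocessing: an RGBA eye-region
// crop (e.g. 112x112 px) is converted to a CHW Float32 tensor normalized
// with the ImageNet mean/std, ready to feed a small ONNX classifier.
const MEAN = [0.485, 0.456, 0.406];
const STD = [0.229, 0.224, 0.225];

function rgbaToChwFloat32(
  pixels: Uint8ClampedArray,
  width: number,
  height: number
): Float32Array {
  const plane = width * height;
  const out = new Float32Array(3 * plane);
  for (let i = 0; i < plane; i++) {
    for (let c = 0; c < 3; c++) {
      const v = pixels[i * 4 + c] / 255; // drop alpha, scale to [0, 1]
      out[c * plane + i] = (v - MEAN[c]) / STD[c]; // interleaved RGBA -> CHW
    }
  }
  return out;
}
```

The resulting 3×112×112 tensor is then fed to an ONNX Runtime Web inference session, and the raw logit is squashed through a sigmoid and temporally smoothed.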
Quick start
FrameFind is modular. Install the core SDK for raw JS environments, or the React bindings for hook-based state management.
```bash
npm install @framefind/core onnxruntime-web   # core SDK (browser)
npm install @framefind/react                  # React bindings
npm install onnxruntime-node                  # Node.js
```

```tsx
import { useRef, useEffect } from 'react';
import { useGlassesDetector } from '@framefind/react';

export default function App() {
  const videoRef = useRef<HTMLVideoElement>(null);
  const canvasRef = useRef<HTMLCanvasElement>(null);
  const intervalRef = useRef<number | null>(null);
  const { result, loading, error, detect, reset } = useGlassesDetector({
    modelUrl: '/models/glasses.onnx',
    threshold: 0.35,
    smoothingWindow: 8,
  });

  useEffect(() => {
    navigator.mediaDevices
      .getUserMedia({ video: true })
      .then((stream) => {
        if (videoRef.current) videoRef.current.srcObject = stream;
      })
      .catch((err) => console.error('Camera access failed:', err));

    return () => {
      // Stop the detection loop and release the camera on unmount
      if (intervalRef.current !== null) clearInterval(intervalRef.current);
      const stream = videoRef.current?.srcObject as MediaStream | null;
      stream?.getTracks().forEach((track) => track.stop());
    };
  }, []);

  return (
    <div className="p-4">
      <video
        ref={videoRef}
        autoPlay
        playsInline
        muted
        onPlay={() => {
          // Guard so repeated play events don't stack intervals
          if (intervalRef.current !== null) return;
          intervalRef.current = window.setInterval(() => {
            if (videoRef.current && canvasRef.current)
              detect(videoRef.current, canvasRef.current);
          }, 66); // ~15 fps
        }}
      />
      <canvas ref={canvasRef} style={{ display: 'none' }} />
      {loading && <p>Loading model…</p>}
      {error && <p>Error: {error.message}</p>}
      {result && (
        <div className="mt-4">
          <p>Status: {result.glasses ? 'Glasses Detected' : 'No Glasses'}</p>
          <p>Confidence: {(result.probability * 100).toFixed(1)}%</p>
        </div>
      )}
      <button onClick={reset}>Reset</button>
    </div>
  );
}
```

Engineered for the edge
Optimization metrics from production workloads.
Architecture
Future Roadmap
- Glasses detection
- Mask detection
- Blink / drowsiness detection
- Head pose estimation
- Attention tracking
- Iris gaze estimation
Philosophy
Edge-first AI. The future of computer vision shouldn't be gated by API latency, network conditions, or cloud compute costs. By running targeted, lightweight models directly on the client, we unlock interfaces that were previously impossible.
Privacy by design. When an application requires monitoring user attention, eye contact, or facial context, sending that media feed to a remote server is fundamentally hostile to privacy. On-device inference guarantees that pixels never leave the user's hardware.
Pragmatic ML. Instead of loading a 100 MB general-purpose vision model, FrameFind takes a tiered approach: an ultra-optimized landmark detector crops regions of interest and feeds them into tiny, specialized classification models.
API Reference
Complete reference for all public APIs across the three FrameFind packages.
@framefind/core
Glasses detection core for browser and Node.js.

- `new GlassesDetector(options)` (class). Options: `modelUrl`, `wasmPaths`, `threshold` (default 0.35), `smoothingWindow` (default 8).
- `detector.load()` (method). Initializes the ONNX model. Await this before calling any detect method.
- `detector.detectFromImageData(pixels, w, h, landmarks?)` (method). Runs inference on a raw RGBA pixel buffer.
- `detector.detectFromCanvas(canvas, landmarks?)` (method). Runs inference on an HTMLCanvasElement.
- `detector.detectFromVideoFrame(video, offscreenCanvas, landmarks?)` (method). Runs inference on a live video frame.
- `detector.resetHistory()` (method). Clears the smoothing window history.
- `detector.dispose()` (method). Releases the ONNX session and frees memory.
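Pieced together, the browser lifecycle (load, detect, dispose) might look like the sketch below; the option values are placeholders, and the `await` on the detect call is a defensive assumption rather than confirmed API behavior.

```typescript
import { GlassesDetector } from '@framefind/core';

// Sketch of the browser lifecycle; paths and options are placeholders.
async function run(canvas: HTMLCanvasElement) {
  const detector = new GlassesDetector({
    modelUrl: '/models/glasses.onnx',
    threshold: 0.35,
    smoothingWindow: 8,
  });

  await detector.load(); // initialize the ONNX session before any detect call

  const result = await detector.detectFromCanvas(canvas);
  console.log(result.glasses ? 'glasses' : 'no glasses', result.probability);

  detector.dispose(); // release the session when finished
}
```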
- `new GlassesDetectorNode(options)` (class). Options: `modelPath` (local path), `threshold`, `smoothingWindow`.
- `detector.load()` (method). Loads the model via onnxruntime-node.
- `detector.detectFromRgbaBuffer(pixels, faceDetected?)` (method). Runs inference on a raw RGBA Buffer.
- `detector.detectFromImagePath(path)` (method). Loads an image from disk and runs inference.
- `detector.resetHistory()` (method). Clears the smoothing window history.
- `detector.dispose()` (method). Releases model resources.
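A comparable Node.js sketch, assuming `GlassesDetectorNode` is exported from `@framefind/core` and that `detectFromImagePath` resolves to a `DetectionResult`; the paths are placeholders.

```typescript
import { GlassesDetectorNode } from '@framefind/core';

// Sketch of the Node.js lifecycle; model and image paths are placeholders.
async function main() {
  const detector = new GlassesDetectorNode({
    modelPath: './models/glasses.onnx',
    threshold: 0.35,
  });

  await detector.load(); // loads the model via onnxruntime-node

  const result = await detector.detectFromImagePath('./photo.jpg');
  console.log(result); // { glasses, probability, faceDetected }

  detector.dispose(); // release model resources when finished
}
```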
```ts
type DetectionResult = {
  glasses: boolean;      // true if glasses detected
  probability: number;   // smoothed confidence 0–1
  faceDetected: boolean; // true if landmarks provided
};

type DetectorConfig = {
  modelUrl: string;
  threshold?: number;       // default: 0.35
  smoothingWindow?: number; // default: 8
};
```

@framefind/react
React hooks for FrameFind glasses detection.

- `useGlassesDetector(options)` (hook). Loads the detector on mount and disposes it on unmount. Returns `result`, `loading`, `error`, `detect()`, `reset()`.
- `options.modelUrl` (option). Required. URL of the .onnx model file.
- `options.wasmPaths` (option). Folder path for the onnxruntime .wasm files.
- `options.threshold` (option). Binary decision cutoff. Default 0.35.
- `options.smoothingWindow` (option). Number of frames averaged. Default 8.
- `options.enabled` (option). Pauses detection without unmounting.
```ts
const {
  result,  // DetectionResult | null
  loading, // boolean
  error,   // Error | null
  detect,  // (video, canvas, landmarks?) => void
  reset,   // () => void
} = useGlassesDetector({ modelUrl: '/model.onnx' });
```

@framefind/utils
Shared types, constants, and helpers.

- `DetectionResult` (type). `{ glasses: boolean; probability: number; faceDetected: boolean }`
- `DetectorConfig` (type). `{ modelUrl: string; threshold?: number; smoothingWindow?: number }`
- `GLASSES_MODEL_URL` (const). Official CDN URL for the ONNX model.
- `DEFAULT_THRESHOLD` (const). 0.35, the binary decision cutoff.
- `DEFAULT_SMOOTH_N` (const). 8, the number of frames averaged for smoothing.
- `ROI_SIZE` (const). 112, the crop size the model expects (px).
- `EYE_REGION_IDX` (const). MediaPipe landmark indices covering the eye region.
- `MEAN` / `STD` (const). [0.485, 0.456, 0.406] / [0.229, 0.224, 0.225], the ImageNet normalization constants.
- `sigmoid(x: number): number` (fn). Converts a raw model logit to a probability.
- `smoothAverage(history: number[]): number` (fn). Averages the prediction history for temporal smoothing.
- `imageDataToChwFloat32(pixels, width, height): Float32Array` (fn). Converts an RGBA buffer to a CHW Float32 tensor normalized with MEAN/STD.
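To show how the last two helpers compose, here is a minimal re-implementation of the assumed post-processing behavior (logistic sigmoid plus a rolling mean over the last `DEFAULT_SMOOTH_N` probabilities, compared against `DEFAULT_THRESHOLD`); it mirrors the documented semantics but is not the library source.

```typescript
// Illustrative re-implementations of the post-processing helpers: a logistic
// sigmoid turns the raw logit into a probability, and a rolling mean over
// the most recent frames provides temporal smoothing.
const DEFAULT_THRESHOLD = 0.35;
const DEFAULT_SMOOTH_N = 8;

function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

function smoothAverage(history: number[]): number {
  return history.reduce((sum, p) => sum + p, 0) / history.length;
}

// Per-frame post-processing: push the new probability, trim the window to
// DEFAULT_SMOOTH_N entries, and compare the smoothed value to the threshold.
function postprocess(logit: number, history: number[]): boolean {
  history.push(sigmoid(logit));
  if (history.length > DEFAULT_SMOOTH_N) history.shift();
  return smoothAverage(history) >= DEFAULT_THRESHOLD;
}
```

The smoothing window is what keeps the boolean `glasses` flag stable when individual frames flicker around the threshold.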