Revisiting SIMCAP

Executive Summary

SIMCAP (Sensor-Inferred Motion CAPture) represents a far richer concept than “gesture recognizer but nerdier.” This document reframes the project’s theoretical potential and contrasts it with the current implementation to identify the path forward.


Theoretical Foundation: What SIMCAP Could Be

The Complete Vision

The full SIMCAP concept envisions a sophisticated hand motion capture system comprising:

Hardware Components:
- One palm-mounted “smart” sensor node (accelerometer, gyroscope, magnetometer)
- Five passive “dumb” magnets, one per finger, acting as trackable markers

Intended Capabilities:
- Static finger pose (chord) classification
- Dynamic gesture recognition
- Approximate skeletal hand pose estimation

Information Theory Analysis

The theoretical information available from the sensor suite includes:

  1. Palm Orientation
    • Complete 3D orientation (quaternion/rotation matrix) over time; a minimal integration sketch follows this list
    • Reconstructable palm pointing direction
    • Rotation, swing, and motion dynamics
  2. Relative Finger Configuration
    • Magnetic field distortion patterns encoding:
      • Which fingers are flexed/extended
      • Proximity to palm
      • Relative pose configurations
    • Non-linear field combination requiring learned models rather than closed-form solutions
  3. Temporal Structure
    • Time-series data enabling:
      • Dynamic gesture recognition (pinch → swipe → release)
      • Motion dynamics (fast flick vs. slow sweep)
      • Sequence pattern classification
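To make item 1 concrete: palm orientation can be dead-reckoned by integrating the gyroscope into a quaternion. The sketch below is a minimal illustration assuming a 50 Hz sample stream; the sample values and timestep are placeholders, and a production filter would also blend in the accelerometer’s gravity vector to bound drift.

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def integrate_gyro(q, gyro_rad_s, dt):
    """One Euler step of dq/dt = 0.5 * q ⊗ (0, wx, wy, wz)."""
    q = q + 0.5 * dt * quat_mul(q, np.array([0.0, *gyro_rad_s]))
    return q / np.linalg.norm(q)   # keep the quaternion unit-length

q = np.array([1.0, 0.0, 0.0, 0.0])      # identity orientation
for gyro in [(0.01, 0.0, 0.02)] * 50:   # placeholder: 1 s of 50 Hz data
    q = integrate_gyro(q, gyro, dt=1 / 50)
```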

The elegance lies in the architecture: one “smart” node (palm sensor) + five “dumb” nodes (passive magnets).


Current Implementation: Where SIMCAP Stands Today

Hardware Reality

Current Platform: Puck.js (Espruino-based wearable)

Missing from Vision:
- Finger-mounted passive magnets
- Any on-device processing, calibration, or ML capability (see the gap table below)

Software Reality

GAMBIT Firmware (Current Device Code):

Location: src/device/GAMBIT/app.js
Functionality:
- Raw telemetry collection at 50 Hz in 30-second bursts
- Button-triggered data capture
- BLE advertising and console output
- NFC-triggered web UI launch
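For illustration, a host-side script could capture one burst by subscribing to the device’s BLE console and logging the notifications to disk. The sketch below uses the `bleak` Python library and the standard Nordic UART service UUID that Espruino’s BLE console exposes; the device address and output file are placeholders.

```python
import asyncio
from bleak import BleakClient

# Standard Nordic UART TX characteristic (device -> host notifications).
NUS_TX = "6e400003-b5a3-f393-e0a9-e50e24dcca9e"
DEVICE_ADDRESS = "AA:BB:CC:DD:EE:FF"   # placeholder MAC address

async def capture_burst(seconds: float = 30.0) -> None:
    chunks: list[bytes] = []

    def on_notify(_sender, data: bytearray) -> None:
        chunks.append(bytes(data))        # raw console output chunks

    async with BleakClient(DEVICE_ADDRESS) as client:
        await client.start_notify(NUS_TX, on_notify)
        await asyncio.sleep(seconds)      # one button-triggered 50 Hz burst
        await client.stop_notify(NUS_TX)

    with open("burst.raw", "wb") as f:    # parse and label offline afterwards
        f.write(b"".join(chunks))

asyncio.run(capture_burst())
```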

Current Capabilities: Raw sensor capture and BLE streaming only.

Missing Capabilities: On-device filtering, feature extraction, and ML inference.

Web UI (GAMBIT):

Location: src/web/GAMBIT/
Purpose: Baseline data collection and visualization
Features:
- WebBLE connectivity
- Kalman filtering (kalman.js present; a minimal filter sketch follows below)
- Real-time data display
- NFC-triggered launch

Status: Data collection infrastructure exists, but no processing pipeline.
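The exact model inside kalman.js isn’t documented here; as a reference point, a minimal scalar Kalman filter for smoothing one sensor axis looks like the following (the noise variances q and r are illustrative guesses, not project values):

```python
class ScalarKalman:
    """Minimal 1-D Kalman filter with a random-walk state model."""

    def __init__(self, q: float = 1e-3, r: float = 1e-1):
        self.q, self.r = q, r   # process / measurement noise variances
        self.x = 0.0            # state estimate
        self.p = 1.0            # estimate variance

    def update(self, z: float) -> float:
        self.p += self.q                 # predict: uncertainty grows
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x += k * (z - self.x)       # correct toward the measurement
        self.p *= 1.0 - k                # shrink uncertainty
        return self.x

mx = ScalarKalman()
smoothed = [mx.update(z) for z in (12.0, 11.5, 13.2, 12.8)]
```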

Other Components:


The Gap: Vision vs. Reality

Architectural Gaps

| Theoretical Vision | Current Implementation | Gap Analysis |
|---|---|---|
| Finger-mounted magnets as passive markers | No magnets mentioned | Critical hardware gap |
| Magnetic field distortion analysis | Raw magnetometer data only | No analysis pipeline |
| Multi-tier ML pipeline (poses → gestures → full tracking) | No ML implementation | Complete ML gap |
| On-device TinyML inference | Raw data streaming only | No embedded ML |
| User-specific calibration | No calibration system | Missing calibration |
| Environment baseline adaptation | Static data collection | No adaptation |
| Palm-centric coordinate transformation | Raw sensor coordinates | No coordinate normalization |
| Chorded input / gesture vocabulary | No gesture system | No high-level interface |

What Works

The current implementation provides a solid foundation:

- Reliable 50 Hz burst telemetry from the GAMBIT firmware
- WebBLE connectivity and real-time data display
- NFC-triggered launch workflow

What Doesn’t Exist Yet

The vision’s core differentiators remain unbuilt:

- Finger magnets and magnetic field distortion analysis
- The ML pipeline (pose classification → gesture recognition → pose tracking)
- Calibration and palm-centric coordinate normalization


Proposed Roadmap: Closing the Gap

Tier 1 – Static Finger Pose Classifier

Goal: Treat the system as a chorded keyboard made of fingers.

Prerequisites: Finger magnets fitted, palm-centric coordinate normalization, and a labeled static-pose dataset.

Implementation: Collect labeled chord windows, then train a simple classifier offline (a minimal sketch follows this subsection).

Success Criteria:

Current Project Status: Data collection infrastructure exists; magnet hardware and the ML pipeline are still needed.
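As a sketch of what the Tier 1 offline step could look like once labeled data exists: train a stock classifier on per-window features. The file names, feature choice (per-window channel means), and model below are assumptions, not project decisions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical pre-extracted data: X holds one feature vector per labeled
# window (e.g., per-window mean of the 9 sensor channels), y holds chord IDs.
X = np.load("pose_features.npy")
y = np.load("pose_labels.npy")

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2%}")
```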

Tier 2 – Dynamic Gesture Recognition

Goal: Add temporal dynamics to pose recognition.

Implementation: Sliding-window classification over the sensor stream with temporal smoothing (sketched below).

Success Criteria:

Current Project Status: No gesture infrastructure; requires Tier 1 foundation.
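One plausible shape for the streaming side of Tier 2: classify overlapping windows and debounce the predictions with a majority vote so brief misclassifications don’t fire spurious gestures. The window, stride, and vote sizes below are assumptions.

```python
from collections import Counter, deque

WINDOW = 32   # timesteps per window (~0.64 s at 50 Hz)
STRIDE = 8    # hop between classified windows
VOTES  = 3    # consecutive windows that must agree

def stream_gestures(samples, predict):
    """samples: iterable of per-timestep feature vectors.
    predict: any model mapping a WINDOW-length list to a gesture label."""
    buf = deque(maxlen=WINDOW)
    votes = deque(maxlen=VOTES)
    for i, sample in enumerate(samples):
        buf.append(sample)
        if len(buf) == WINDOW and i % STRIDE == 0:
            votes.append(predict(list(buf)))
            label, count = Counter(votes).most_common(1)[0]
            if count == VOTES:        # unanimous over recent windows
                yield label
```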

Tier 3 – Approximate Hand Pose Estimation

Goal: Probabilistic skeletal hand tracking.

Implementation: Regression from sensor windows to a continuous pose vector (see the regression architecture in the ML pipeline sketch below).

Success Criteria:

Current Project Status: Requires Tiers 1-2; represents long-term vision.


Design Considerations & Known Challenges

Challenges Identified in Vision

Magnetic Interference & Drift:

User-Specific Calibration:

Coordinate Frame Consistency:

Sampling & Windowing:

Development Workflow:

Current Implementation Considerations

What GAMBIT Does Well:

What Needs Refinement:


Architecture Sketch: Concrete ML Pipeline

Data Format (Per Timestep t)

acc[t]  – 3 floats (ax, ay, az)
gyro[t] – 3 floats (gx, gy, gz)
mag[t]  – 3 floats (mx, my, mz)
Optional: derived features (magnitude, filtered variants)

Window Structure

Shape: (T, F) where T = timesteps (e.g., 32), F = features (9+ dims)
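A minimal sketch of producing these (T, F) windows from a continuous recording, assuming T = 32 with 50% overlap (both illustrative values):

```python
import numpy as np

def make_windows(stream: np.ndarray, T: int = 32, stride: int = 16) -> np.ndarray:
    """stream: (n_samples, F) array of concatenated acc/gyro/mag features.
    Returns an array of shape (n_windows, T, F), 50% overlap by default."""
    n = (len(stream) - T) // stride + 1
    return np.stack([stream[i * stride : i * stride + T] for i in range(n)])

recording = np.random.randn(1500, 9)   # placeholder: one 30 s burst at 50 Hz
windows = make_windows(recording)      # shape: (92, 32, 9)
```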

Model Architecture (Classification)

Input: T × F tensor
  ↓
1D Conv (temporal) + ReLU
  ↓
1D Conv + ReLU
  ↓
Global average pooling over time
  ↓
Dense → Dense → Softmax over N gestures
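Transcribing the diagram above into Keras might look like the following; the layer widths, kernel sizes, and example dimensions are illustrative, not tuned project values:

```python
import tensorflow as tf

T, F, N_GESTURES = 32, 9, 8   # example dimensions

model = tf.keras.Sequential([
    tf.keras.Input(shape=(T, F)),              # window of T steps, F features
    tf.keras.layers.Conv1D(32, 5, padding="same", activation="relu"),
    tf.keras.layers.Conv1D(64, 5, padding="same", activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),  # average over the time axis
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(N_GESTURES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Global average pooling keeps the parameter count independent of the window length T, which helps with the size constraints noted below.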

Model Architecture (Regression to Hand Pose)

Same backbone
  ↓
Linear output layer → pose vector (e.g., 10 dims for 5 fingers)
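The regression variant reuses the same backbone and swaps the softmax head for a linear layer; the 10-dimensional output (e.g., two joint parameters per finger) is an assumption:

```python
import tensorflow as tf

T, F = 32, 9   # same example window shape as above

pose_model = tf.keras.Sequential([
    tf.keras.Input(shape=(T, F)),
    tf.keras.layers.Conv1D(32, 5, padding="same", activation="relu"),
    tf.keras.layers.Conv1D(64, 5, padding="same", activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(10),   # linear activation: continuous pose vector
])
pose_model.compile(optimizer="adam", loss="mse")
```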

Model Size: Small enough to:
- Run on-device (TinyML) within the Puck.js memory constraints
- Execute in real time on a laptop or in the browser

Current Implementation: No model exists; GAMBIT provides raw data streams only.


Why SIMCAP Matters: Conceptual Significance

The Core Question

“How much structural information about a complex body (the hand) can be inferred from a tiny, noisy, indirect sensor + some magnets?”

This represents a fascinating exploration of minimal sensing: how much latent structure one noisy, indirect sensor plus a handful of passive markers can recover.

The Broader Vision

SIMCAP embodies a “state machine with latent context” paradigm—a physically-embedded version of inferring complex state from minimal observations. The glove becomes a one-sensor state machine for hand dynamics.

Potential Applications

  1. Chorded Input Device
    • IDE control via finger combinations
    • Wearable macro keyboard
    • Accessibility interface
  2. Spatial Computing Controller
    • Memory palace navigation
    • 3D environment manipulation
    • XR interaction paradigm
  3. Cognitive Tool
    • Physical “algorithm sketching”
    • Gesture-based structure manipulation
    • Embodied computation interface

Current Status Assessment

What Exists Today (Strengths)

The SIMCAP project has established:

- Working firmware (GAMBIT) for 50 Hz burst telemetry
- A WebBLE UI for data collection and visualization
- A wearable hardware platform (Puck.js)

What’s Missing (Gaps)

The vision requires:

- Finger-mounted magnets
- A processing and ML pipeline (poses → gestures → tracking)
- Calibration and palm-centric coordinate normalization

The Path Forward

Immediate Next Steps (Minimum Viable Extension):

  1. Add finger magnets (hardware modification)
  2. Implement palm-centric coordinate transformation (sketched after this list)
  3. Build data collection + labeling UI for poses
  4. Train simple static pose classifier offline
  5. Deploy to device or laptop for real-time demo
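A sketch of step 2, assuming the orientation filter already yields a palm quaternion: rotate each magnetometer sample into a palm-fixed frame so finger-magnet signatures stay stable as the whole hand rotates. This uses scipy’s Rotation; the sample values are placeholders.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def to_palm_frame(mag_world: np.ndarray, q_wxyz: np.ndarray) -> np.ndarray:
    """Rotate a magnetometer sample from the world/sensor frame into the
    palm frame, given the palm orientation quaternion in (w, x, y, z)."""
    w, x, y, z = q_wxyz
    r = Rotation.from_quat([x, y, z, w])     # scipy expects (x, y, z, w)
    return r.apply(mag_world, inverse=True)  # inverse: world -> palm

mag_palm = to_palm_frame(np.array([5.0, -12.0, 30.0]),    # placeholder µT
                         np.array([1.0, 0.0, 0.0, 0.0]))  # identity pose
```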

Medium-Term Goals: Tier 2 dynamic gesture recognition on a streaming pipeline.

Long-Term Vision: Tier 3 approximate hand pose estimation and the spatial computing applications described above.


Conclusion: The Project Isn’t Dead

SIMCAP has been in “a really long compile cycle,” not abandoned. The foundation exists; the vision remains compelling; the path is clear.

The current implementation (GAMBIT) provides reliable sensor data streaming—a necessary but insufficient component. The gap between vision and reality is significant but bridgeable through systematic execution of the proposed tier-based roadmap.

The project sits at a critical juncture: the infrastructure works; the intelligence layer awaits construction.

With finger magnets, coordinate normalization, and a basic ML pipeline, SIMCAP could rapidly evolve from “interesting data collector” to “functional motion inference system.”

