Find AI models you can run in the browser, on mobile, or at the edge — no server required. Focused on sustainability, privacy, and simplicity.
- Edge First: Recommend models that run on the user's device — no data center needed
- Privacy First: Runs locally in browser, no external data transmission
- Environmental Consciousness: "Smaller is better" — on-device inference eliminates server-side compute entirely
- Simplicity First: Prove concept before adding complexity
- Open Source: Community-driven development
- ✅ SvelteKit project structure with static site generation
- ✅ Browser-based embedding classification (MiniLM, 98.3% accuracy)
- ✅ Tiered recommendation engine (Lightweight/Standard/Advanced)
- ✅ Environmental impact scoring (1-3 scale)
- ✅ Model accuracy filtering (50-95% threshold)
- ✅ Lightweight model (~23MB) enables mobile support
- ✅ Automated model dataset updates (daily via GitHub Actions)
- ✅ PWA with offline capability (installable on desktop & mobile)
- ✅ Full accessibility support
- ✅ 152 curated models across 4 categories, 10 subcategories
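The tiering, environmental scoring, and accuracy filtering above can be sketched as a few small functions. This is a minimal illustration, not the project's actual code: the field names (`sizeMB`, `accuracy`) and the tier size cutoffs are assumptions.

```javascript
// Hypothetical sketch of tiering + scoring. Field names and cutoffs are
// illustrative assumptions, not the real schema.
const TIERS = {
  Lightweight: { maxSizeMB: 100 },
  Standard: { maxSizeMB: 1000 },
  Advanced: { maxSizeMB: Infinity },
};

function assignTier(model) {
  if (model.sizeMB <= TIERS.Lightweight.maxSizeMB) return 'Lightweight';
  if (model.sizeMB <= TIERS.Standard.maxSizeMB) return 'Standard';
  return 'Advanced';
}

// Environmental impact on a 1-3 scale: smaller on-device models score better.
function envScore(model) {
  const tier = assignTier(model);
  return tier === 'Lightweight' ? 1 : tier === 'Standard' ? 2 : 3;
}

// Keep only models inside the 50-95% accuracy window.
function filterByAccuracy(models, min = 0.5, max = 0.95) {
  return models.filter((m) => m.accuracy >= min && m.accuracy <= max);
}
```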
Live: https://ismaelmartinez.github.io/ai-model-advisor
Goals: PWA with tiered recommendations and environmental scoring
Achieved:
- <1 second to recommendations (target was <30s)
- 4 AI task categories, 10 subcategories, 98.3% classification accuracy
- ~40KB bundle size (target was <2MB)
- Full accessibility support
- Mobile & desktop PWA support
- Daily automated model updates from Hugging Face
Goal: Sharpen the project's identity around edge-deployable models. Reposition from "general AI model advisor" to "find AI you can run without a server."
Why now: The edge AI ecosystem has matured — Transformers.js v4 supports ~200 architectures, WebGPU is broadly available, and small models (1-4B) are closing the quality gap. No tool answers: "I want to do [task X] without a cloud API — what are my options?" See edge-browser-ai-direction.md for full analysis.
Phase 2 is organized into three parallel workstreams plus one independent optimization:
Can run independently — no UI or build changes needed
- Tag models with verified edge runtime support — Add runtime metadata (framework, backend, tested/untested) to the top 40-50 edge-compatible models already in the database
- Add task maturity labels — Classify each subcategory as "excellent", "good", "limited", or "emerging" for edge deployment (see research doc Appendix tables)
- Curate for deployability — Surface edge-compatible models first; de-emphasize models requiring dedicated GPU servers (keep them accessible, just deprioritized)
- Expand edge model coverage — Add verified browser-runnable models from Transformers.js, TensorFlow.js, MediaPipe, and WebLLM ecosystems
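The runtime metadata above might look like the following. This is a sketch of a plausible shape, not the project's actual schema; the field names and allowed values are assumptions drawn from the workstream description.

```javascript
// Illustrative per-model edge runtime metadata (hypothetical schema).
const exampleEntry = {
  id: 'Xenova/all-MiniLM-L6-v2',
  runtimes: [
    { framework: 'transformers.js', backend: 'wasm', status: 'tested' },
    { framework: 'transformers.js', backend: 'webgpu', status: 'untested' },
  ],
  taskMaturity: 'excellent', // "excellent" | "good" | "limited" | "emerging"
};

const FRAMEWORKS = ['transformers.js', 'tensorflow.js', 'mediapipe', 'webllm'];
const MATURITY = ['excellent', 'good', 'limited', 'emerging'];

// Validate an entry before it lands in the curated dataset.
function isValidEntry(entry) {
  return (
    typeof entry.id === 'string' &&
    MATURITY.includes(entry.taskMaturity) &&
    Array.isArray(entry.runtimes) &&
    entry.runtimes.every(
      (r) =>
        FRAMEWORKS.includes(r.framework) &&
        ['tested', 'untested'].includes(r.status)
    )
  );
}
```

A validator like this could run in CI so malformed entries are rejected before the daily dataset update publishes them.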
Can run in parallel with Workstream A
- Update positioning and copy — Hero text, descriptions, and environmental messaging to reflect edge/browser focus
- "Edge-Ready" badge — Visual indicator on models with verified browser/mobile support
- Structural environmental story — Frame on-device AI as an architecture choice: no data center, no network round-trip, amortized hardware, privacy bonus
- Honest task maturity display — Show users which tasks are well-supported at the edge vs. experimental
Fully independent — can run in parallel with A and B
- Pre-compute reference embeddings — Distill the ~320 reference example embeddings at build time into a static JSON file, so the browser skips embedding them on first load. Cuts initial classification latency significantly.
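A build-time script for this could look roughly as follows. The `embed` function is a stand-in: in the real build it would call the MiniLM pipeline (e.g. via Transformers.js in a Node script), and the output filename would be something like a static JSON asset — both are assumptions here.

```javascript
// Build-time sketch: convert reference-example embeddings into a static JSON
// artifact so the browser skips embedding ~320 examples on first load.
// `embed` is a placeholder for the real MiniLM embedding call.
function precomputeEmbeddings(examples, embed) {
  return examples.map(({ text, category }) => ({
    category,
    // Round to 6 decimals: well within Float32 precision, smaller JSON.
    vector: Array.from(embed(text), (v) => Number(v.toFixed(6))),
  }));
}

// The artifact round-trips through JSON, so the browser can fetch it directly.
function serialize(entries) {
  return JSON.stringify(entries);
}
```

At build time the script would write `serialize(precomputeEmbeddings(...))` to a static file served alongside the app; at runtime the classifier only embeds the user's query, not the reference set.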
Can start in parallel, but benefits from A completing first
- Cross-browser testing — Validate Firefox, Safari, Edge support
- Fix any edge cases — Address test failures found during broader browser testing
- Improve test coverage — Add tests for new runtime metadata and edge-ready flags
Parallelism map:
```
          ┌─── A: Data & Metadata ─────────────────┐
          │                                        │
Start ────┼─── B: UX & Messaging ──────────────────┼──→ Phase 2 Complete
          │                                        │
          ├─── C: Build-Time Embeddings ───────────┤
          │                                        │
          └─── D: Cross-Browser (start early, ─────┘
                benefits from A finishing)
```
Goal: Turn recommendations into starting points for implementation.
Depends on Phase 2 data work being complete.
- Code snippets — "Get started" snippets for Transformers.js, ONNX Runtime Web, TensorFlow.js for each recommended model
- Runtime comparison — Side-by-side framework comparison showing which runtimes support each model
- Performance estimates — Cold start time, inference latency, memory footprint from community benchmarks and our own testing
- "Can I Run This?" feature — Input a model name or Hugging Face URL → get an edge deployment feasibility report (estimated browser memory, compatible runtimes, quantization options, WebGPU vs WASM expectations)
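A feasibility report like the one described above could start from a back-of-envelope memory estimate. The bytes-per-weight figures, the ~20% overhead factor, and the 4 GB browser budget below are rough assumptions for illustration, not measured values.

```javascript
// Rough memory cost per parameter for common quantization levels (assumed).
const BYTES_PER_PARAM = { f32: 4, f16: 2, q8: 1, q4: 0.5 };

// paramsMillions * bytes/param yields millions of bytes, i.e. MB.
// The 1.2 factor is a guessed allowance for activations and runtime buffers.
function estimateMemoryMB(paramsMillions, quant = 'q8') {
  return paramsMillions * BYTES_PER_PARAM[quant] * 1.2;
}

function feasibilityReport(paramsMillions, { budgetMB = 4096 } = {}) {
  return Object.keys(BYTES_PER_PARAM).map((quant) => ({
    quant,
    estimatedMB: Math.round(estimateMemoryMB(paramsMillions, quant)),
    fits: estimateMemoryMB(paramsMillions, quant) <= budgetMB,
  }));
}
```

For a 1B-parameter model this flags f32 as infeasible in a browser tab while q8/q4 fit, which matches the intuition that quantization is what makes 1-4B models edge-viable.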
Goal: Build the dataset that makes this tool indispensable.
- Community-reported benchmarks — Users submit real-world performance data for models they've tested
- Automated compatibility testing — CI job that tests top models in headless Chrome to verify browser runnability
- Framework update tracking — Monitor Transformers.js, ONNX Runtime Web releases for new model support
- Expanded model coverage — Systematically test and add models from HF Hub that have ONNX/WASM exports
User Input → Embedding Classification (MiniLM) → Model Selection → Environmental Scoring → Recommendations
Fallback Chain: Embedding Similarity → Semantic Matching → Keyword Matching → Default Category
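The fallback chain above can be sketched as a single function. This collapses the semantic-matching step and uses illustrative thresholds and category names; it is a minimal sketch, not the app's actual classifier.

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Sketch of the fallback chain: embedding similarity, then keyword
// matching, then a default category. Threshold of 0.6 is an assumption.
function classify(queryVec, queryText, references, keywords, threshold = 0.6) {
  // 1. Embedding similarity against pre-computed reference vectors.
  let best = { category: null, score: -1 };
  for (const ref of references) {
    const score = cosine(queryVec, ref.vector);
    if (score > best.score) best = { category: ref.category, score };
  }
  if (best.score >= threshold) return best.category;

  // 2. Keyword matching fallback.
  for (const [category, words] of Object.entries(keywords)) {
    if (words.some((w) => queryText.toLowerCase().includes(w))) return category;
  }

  // 3. Default category.
  return 'general';
}
```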
- Speed: <1 second from query to recommendation ✅
- Accuracy: 98.3% classification accuracy ✅
- Performance: ~40KB bundle, <1s load time ✅
- Mobile Ready: ~23MB model, PWA installable ✅
- Edge models with runtime data: 25+ models with verified runtime metadata ✅
- Code snippet coverage: 0% → 80% of recommended models (Phase 3 target)
- Adoption: Target 2,000+ monthly users within 6 months of Phase 2 launch
- Test with real use cases
- Curate model data (especially edge runtime metadata)
- Improve algorithms
- Report browser compatibility findings
- Promote sustainability focus
See CONTRIBUTING.md for details.
- project-status.md - Detailed current status
- docs/ - Technical documentation and ADRs
- docs/research/edge-browser-ai-direction.md - Edge/browser pivot analysis
- CLAUDE.md - AI assistant configuration
Help developers find and deploy AI models that run on their users' devices — eliminating server-side compute, protecting user privacy, and reducing the environmental footprint of AI inference.