This app requires a physical iOS device for testing MLX Swift functionality.
MLX Swift does not support iOS simulators due to GPU/Metal framework limitations. Simulators cannot emulate the hardware-accelerated GPU features that MLX requires for AI model inference.
For Development:
- ✅ Physical iOS Device - Full MLX Swift functionality works perfectly
- ❌ iOS Simulator - Will crash when loading AI models due to MLX limitations
- 🧪 Testing - Use physical devices with Apple Silicon for MLX-related features
This is a known limitation of the MLX Swift framework, not a bug in this application. All core functionality works flawlessly on real hardware.
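The simulator case can also be handled gracefully at runtime using Swift's compile-time target check. A minimal sketch (the helper name is hypothetical, not part of this app's actual source):

```swift
import Foundation

/// Hypothetical helper: reports whether MLX inference can run on this build target.
func canRunMLXInference() -> Bool {
    #if targetEnvironment(simulator)
    // Simulators cannot provide the hardware-accelerated Metal features MLX needs.
    return false
    #else
    return true
    #endif
}
```

Calling a check like this before loading a model lets the app show a friendly message instead of crashing in the simulator.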
Easily compare, test, and evaluate a wide range of open-source AI models locally on your iPhone to help you choose the best model for your own project.
A production-ready on-device AI playground for iOS that runs real open-source LLMs locally using MLX Swift. Chat with AI models completely offline with zero network dependency after download.
Chat with local AI models on your iPhone - completely offline!
Features shown: Chat interface • Model switching • Real-time responses • Native iOS design
This app leverages Apple's MLX Swift framework for high-performance, on-device machine learning inference. Experience the power of local AI with Apple Silicon optimization.
- ✅ Production-grade AI inference using MLX Swift
- ✅ Streaming text generation - Watch responses appear word-by-word
- ✅ Multiple model support - Llama, Mistral, Code (DeepSeek, StarCoder, CodeLlama), and more
- ✅ Zero network dependency - Chat completely offline
- ✅ Apple Silicon optimized - Blazing fast performance
- ✅ Intelligent file system caching - Models load from disk, not internet
- ✅ Automatic download management - Download once, use forever
- ✅ Storage optimization - Efficient model storage and retrieval
- ✅ Download progress tracking - Real-time download status
- ✅ Model verification - Ensures model integrity and availability
- ✅ MLX-optimized model loading - Fast startup and inference
- ✅ Memory efficient processing - Proper cleanup and optimization
- ✅ Model format support - GGUF, SafeTensors, MLX native formats
- ✅ Dynamic model switching - Change models without restart
- ✅ Comprehensive logging - Track every step of model operations
- ✅ SwiftUI throughout - Modern, responsive interface
- ✅ iOS compatibility - Optimized for iPhone and iPad
- ✅ Real-time UI updates - Smooth streaming text display
- ✅ Native performance - No web views or hybrid solutions
- Apple Silicon acceleration - Native Metal performance
- Memory efficient inference - Smart memory management
- Streaming generation - Real-time text streaming
- Background processing - Non-blocking UI operations
- Automatic cleanup - Prevents memory leaks
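The streaming-generation behavior above can be sketched independently of MLX with a plain `AsyncStream`; `streamTokens`, `collect`, and the sample tokens below are stand-ins for real model output, not this app's actual API:

```swift
import Foundation

// Minimal sketch of word-by-word streaming, assuming the generator
// yields tokens one at a time; `tokens` stands in for real MLX output.
func streamTokens(_ tokens: [String]) -> AsyncStream<String> {
    AsyncStream { continuation in
        for token in tokens {
            continuation.yield(token)
        }
        continuation.finish()
    }
}

// Usage: accumulate tokens as they arrive, updating the UI each step.
func collect() async -> String {
    var text = ""
    for await token in streamTokens(["Hello", " ", "world", "!"]) {
        text += token   // in the app, this would update a @Published property
    }
    return text
}
```

Driving UI updates from an async sequence like this keeps generation off the main thread while the interface repaints after every token.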
- Local-first loading - Check disk before downloading
- Integrity verification - Ensure model file consistency
- Automatic synchronization - Sync download tracking with files
- Efficient storage - Organized model directory structure
- Graceful fallbacks - Download if local files missing
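The local-first pattern above can be sketched as a disk check with a download fallback; `loadOrDownload`, `modelsDirectory`, and the `download` closure are hypothetical names, not this app's actual API:

```swift
import Foundation

// Sketch of local-first loading: check disk before touching the network.
func loadOrDownload(modelId: String,
                    modelsDirectory: URL,
                    download: (String) -> URL) -> URL {
    let local = modelsDirectory.appendingPathComponent(modelId)
    if FileManager.default.fileExists(atPath: local.path) {
        return local            // cached on disk: no network needed
    }
    return download(modelId)    // graceful fallback: fetch once, reuse forever
}
```

The same check, run in the background, is what keeps repeat launches fully offline.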
- iOS 15.0+
- Mac with Apple Silicon recommended for development (Intel supported)
- Xcode 15.0+
- 2GB+ free storage for models
- Clone & Open - Open `Offline AI&ML Playground.xcodeproj`
- Build - Project builds cleanly with all MLX dependencies
- Download Models - Use Download tab to get AI models locally
- Start Chatting - Chat with real AI models completely offline!
The app features a carefully selected collection of chat models optimized for iPhone performance:
- Static Model List: No dynamic loading - all models are pre-configured
- Chat-Focused: Currently supporting conversational AI models only
- iPhone-Optimized: All models tested for iPhone memory and performance
- MLX-Compatible: Leveraging MLX Swift for hardware acceleration
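A static, pre-configured catalog like this can be sketched as a plain Swift array; `CatalogModel` is a hypothetical type, and the SmolLM repository ID is an assumption (the GPT-2 ID appears later in this document):

```swift
// Sketch of a static model catalog mirroring the tables below.
struct CatalogModel {
    let name: String
    let sizeMB: Int
    let repoId: String
}

let modelCatalog: [CatalogModel] = [
    CatalogModel(name: "SmolLM 135M", sizeMB: 135, repoId: "HuggingFaceTB/SmolLM-135M"),
    CatalogModel(name: "GPT-2", sizeMB: 548, repoId: "openai-community/gpt2"),
]

// A static list means no network call is needed to populate the UI.
let tinyModels = modelCatalog.filter { $0.sizeMB < 300 }
```

Because the list is compiled in, the Download tab can render instantly and offline.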
Ultra-Tiny Models:

| Model | Size | Parameters | Description |
|---|---|---|---|
| SmolLM 135M | 135MB | 135M | Smallest! Perfect for quick testing |
| Pythia 160M | 160MB | 160M | EleutherAI's research model |
| OPT 125M | 250MB | 125M | Meta's tiny transformer |
| SmolLM 360M | 290MB | 360M | Better SmolLM variant |
Apple OpenELM Models:

| Model | Size | Parameters | Description |
|---|---|---|---|
| OpenELM 270M | 270MB | 270M | Apple's smallest, optimized for Apple Silicon |
| OpenELM 450M | 450MB | 450M | Balanced size and performance |
| OpenELM 1.1B | 1.1GB | 1.1B | Excellent performance from Apple |
| OpenELM 3B | 3.0GB | 3B | Apple's premium chat model |
Llama Models:

| Model | Size | Parameters | Description |
|---|---|---|---|
| Llama 3.2 1B | 650MB | 1B | Ultra-lightweight, perfect for basic conversations |
| Llama 3.2 3B | 1.8GB | 3B | ⭐ Recommended - Best balance of size and capability |
| TinyLlama 1.1B | 1.1GB | 1.1B | Community favorite, fast and efficient |
Mistral Models:

| Model | Size | Parameters | Description |
|---|---|---|---|
| Mistral 7B Instruct | 3.8GB | 7B | High-quality conversations for newer iPhones |
| Mistral Small | 2.5GB | - | Compact mobile-optimized variant |
Phi Models:

| Model | Size | Parameters | Description |
|---|---|---|---|
| Phi 3.5 Mini | 2.0GB | 3.5B | Latest from Microsoft, 4-bit quantized |
| Phi-2 | 2.7GB | 2.7B | Proven conversational abilities |
Gemma Models:

| Model | Size | Parameters | Description |
|---|---|---|---|
| Gemma 2B | 2.5GB | 2B | Efficient on-device conversations |
Qwen Models:

| Model | Size | Parameters | Description |
|---|---|---|---|
| Qwen 2.5 1.5B | 1.6GB | 1.5B | Strong multilingual support |
| Qwen 2.5 3B | 3.2GB | 3B | Larger variant with better performance |
GPT-2 Models:

| Model | Size | Parameters | Description |
|---|---|---|---|
| GPT-2 Medium | 380MB | 355M | Better quality, still lightweight |
| GPT-2 | 548MB | 124M | Classic lightweight model |
StableLM Models:

| Model | Size | Parameters | Description |
|---|---|---|---|
| StableLM 2 1.6B | 1.7GB | 1.6B | Dedicated chat model |
- ✅ Ultra-Tiny Options - Models from just 135MB for quick testing
- ✅ Apple Silicon Native - Includes Apple's own OpenELM models
- ✅ Memory Efficient - All models under 4GB for iPhone compatibility
- ✅ 4-bit Quantization - Many models support 4-bit for reduced memory
- ✅ MLX Optimized - All tested with MLX Swift for best performance
- ✅ Diverse Selection - 21 models from 135MB to 3.8GB
- ✅ Quality Conversations - Every model chosen for chat capabilities
- ✅ No Authentication - All repositories are publicly accessible
- ✅ MLX Compatible - MLX Swift handles format conversion automatically
- ✅ Single Downloads - No need for complex multi-file repository downloads
- ✅ Consistent Loading - Same ModelConfiguration pattern for all models
- ✅ iPhone Optimized - All models selected for mobile deployment feasibility
Download flow:

```swift
// 1. User taps the download button for the "GPT-2" model
SharedModelManager.downloadModel(gpt2Model)

// 2. System constructs the public repository URL:
//    https://huggingface.co/openai-community/gpt2/resolve/main/model.safetensors

// 3. Downloads a single file to the local directory:
//    /Documents/Models/gpt2 (351MB file)

// 4. Updates the tracking system
downloadedModels.insert("gpt2")
```

Model loading flow:

```swift
// 1. User selects GPT-2 for chat
AIInferenceManager.loadModel(gpt2Model)

// 2. Creates a ModelConfiguration using the repository ID
ModelConfiguration(id: "openai-community/gpt2")

// 3. MLX Swift Hub integration handles conversion:
//    - Checks local file: /Documents/Models/gpt2
//    - Auto-converts to MLX format as needed
//    - Loads the model container for inference

// 4. Real-time text generation
aiInferenceManager.generateText(prompt: "Hello!")
// Result: "Hello! How can I help you today?"
```

Threading fix:

```swift
// PROBLEM: this caused "Publishing changes from within view updates"
FileManager.default.fileExists(atPath: modelPath) // ON MAIN THREAD ❌

// SOLUTION: background file checks with main-thread state updates
DispatchQueue.global(qos: .userInitiated).async { [weak self] in
    let fileExists = FileManager.default.fileExists(atPath: modelPath)
    guard fileExists else { return }
    DispatchQueue.main.async {
        self?.downloadedModels.insert(modelId) // SAFE ✅
    }
}
```

Resolved errors:

```swift
// AUTHENTICATION ERRORS (solved)
// OLD: "mlx-community/gpt2-4bit" → HTTP 401 "Invalid username or password"
// NEW: "openai-community/gpt2" → HTTP 302 (public access) ✅

// MISSING FILE ERRORS (solved)
// OLD: Looking for "config.json" in the MLX community repo structure
// NEW: MLX Swift auto-handles missing config files during conversion ✅

// STATE UPDATE ERRORS (solved)
// OLD: Direct @Published updates during SwiftUI view updates
// NEW: Deferred updates via DispatchQueue.main.async ✅
```
| Platform | Status | Performance |
|---|---|---|
| 📱 iOS | ✅ Full Support | ⚡ Great (A-series chips) |
- 🤖 MLX Swift AI Inference - Production ready
- 💬 Streaming Chat Interface - Smooth word-by-word generation
- 📥 Local Model Caching - Intelligent file system management
- 📊 Model Download System - Progress tracking & verification
- 🧠 Multi-model Support - Llama, Mistral, Code models (DeepSeek, StarCoder, CodeLlama), General models
- 📝 Comprehensive Logging - Track every operation
- 🧪 Testing Framework - Verify MLX functionality
- 🔧 Memory Management - Efficient cleanup & optimization
- ⚡ Fast inference with Apple Silicon optimization
- 💾 Smart caching prevents redundant downloads
- 🌊 Smooth streaming with real-time UI updates
- 🧹 Clean memory usage with proper disposal
- 🚀 Real AI, Not Simulated - Uses actual MLX Swift for inference
- ⚡ Blazing Fast - Apple Silicon optimized performance
- 💾 Smart Caching - Download once, use forever
- 🔒 Privacy First - Everything happens on-device
- 🛠️ Production Ready - Comprehensive error handling & logging
- 🧪 Well Tested - Extensive test coverage for reliability

Built with ❤️ using Apple's MLX Swift framework for the ultimate local AI experience.