shimmy-private/ ← THIS REPO — public-facing CLI/server product (private working copy)
public remote: shimmy.git ← https://github.com/Michael-A-Kuykendall/shimmy.git
airframe = { version = "0.1" } ← PUBLIC crates.io dep — no path dep, no cloning needed
- Shimmy is the product. Airframe is the public GPU engine (crates.io: https://crates.io/crates/airframe).
- Both are MIT-licensed. Both are public. No private dependencies.
- `cargo build` (default features) compiles the full GPU engine; airframe is downloaded from crates.io.
- Two remotes exist:
  - `origin` → https://github.com/Michael-A-Kuykendall/shimmy-private.git (private working copy)
  - `public` → https://github.com/Michael-A-Kuykendall/shimmy.git (public GitHub repo users see)
- In the submodule context (`shimmy_integration/` inside the airframe workspace): push with `git push private <branch>`.
- In the standalone context (`C:/Users/micha/repos/shimmy-private`): push with `git push origin <branch>` (private) or `git push public <branch>` (public).
- Do not push unless explicitly requested by the user.
- To publish to the public shimmy repo, push to the `public` remote.
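
The push rules above can be read as a small decision table. The following is a minimal sketch of that table, not code from this repo: the `Checkout` and `Target` enums and the `remote_for` function are hypothetical names introduced only for illustration. The notes do not describe a public push from the submodule context, so the sketch treats that case as unavailable.

```rust
/// Where the working copy lives (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Checkout {
    /// `shimmy_integration/` inside the airframe workspace.
    Submodule,
    /// The standalone shimmy-private checkout.
    Standalone,
}

/// Which repository the push is meant to reach (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Target {
    Private,
    Public,
}

/// Returns the git remote name to push to, or `None` when no push should happen.
/// A push is never allowed unless the user explicitly asked for it.
fn remote_for(checkout: Checkout, target: Target, user_requested: bool) -> Option<&'static str> {
    if !user_requested {
        return None;
    }
    match (checkout, target) {
        (Checkout::Submodule, Target::Private) => Some("private"),
        // Assumption: the notes do not cover this combination, so refuse it.
        (Checkout::Submodule, Target::Public) => None,
        (Checkout::Standalone, Target::Private) => Some("origin"),
        (Checkout::Standalone, Target::Public) => Some("public"),
    }
}
```

For example, `remote_for(Checkout::Standalone, Target::Public, true)` yields `Some("public")`, while any call with `user_requested` set to `false` yields `None`.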
Zero tolerance. No exceptions.
cargo test must finish with 0 failures before any task is considered done.
There is no such thing as a "pre-existing" failure. Fix it before moving on.
- Engine: wgpu/WebGPU WGSL pipeline via Airframe (crates.io: `airframe = "0.1"`). Replaces llama.cpp entirely.
- Server: OpenAI-compatible (`/v1/chat/completions`, `/v1/completions`), Ollama-compat (`/api/generate`, `/api/tags`), LM Studio discovery.
- No Python in default path. Default build is `airframe` + `huggingface` features.
- WGSL quant coverage: F32, F16, Q4_0, Q8_0, Q4_K(M/S), Q5_K(M/S), Q6_K.
- wgpu 2 GB buffer cap: Known limit for models with tensors >2 GB. Deferred to v2.1.
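
To make the buffer cap concrete, here is a minimal sketch that estimates a tensor's byte size for each listed quant type and checks it against the limit. It rests on two assumptions that the notes do not state: that the quant types use the standard GGML block layouts (elements per block and bytes per block as in the table below), and that "2 GB" means 2 GiB (2^31 bytes). The names `Quant`, `tensor_bytes`, and `exceeds_buffer_cap` are hypothetical and are not part of Airframe's API.

```rust
/// Quant formats from the coverage list (hypothetical enum, for illustration).
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Quant {
    F32,
    F16,
    Q4_0,
    Q8_0,
    Q4_K,
    Q5_K,
    Q6_K,
}

impl Quant {
    /// (elements per block, bytes per block).
    /// Assumption: standard GGML block layouts.
    fn block(self) -> (u64, u64) {
        match self {
            Quant::F32 => (1, 4),
            Quant::F16 => (1, 2),
            Quant::Q4_0 => (32, 18),
            Quant::Q8_0 => (32, 34),
            Quant::Q4_K => (256, 144),
            Quant::Q5_K => (256, 176),
            Quant::Q6_K => (256, 210),
        }
    }
}

/// Assumption: the "2 GB" cap is read as 2 GiB.
const BUFFER_CAP_BYTES: u64 = 2 * 1024 * 1024 * 1024;

/// Storage size in bytes, rounding the element count up to whole blocks.
fn tensor_bytes(elements: u64, quant: Quant) -> u64 {
    let (per_block, bytes_per_block) = quant.block();
    let blocks = (elements + per_block - 1) / per_block;
    blocks * bytes_per_block
}

/// True when a single tensor would not fit under the assumed cap.
fn exceeds_buffer_cap(elements: u64, quant: Quant) -> bool {
    tensor_bytes(elements, quant) > BUFFER_CAP_BYTES
}
```

Under these assumptions, a 600-million-element tensor stored as F32 needs 2.4 billion bytes and exceeds the cap, while the same tensor in Q4_0 needs 337.5 million bytes and fits.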
default = ["airframe", "huggingface"] # Full GPU build; use --no-default-features --features huggingface for CPU-only
airframe = ["dep:airframe"] # Airframe native GPU engine (from crates.io)
gpu = ["airframe", "huggingface"] # GPU-optimized build
full = ["airframe", "huggingface", "mlx"]
fast / coverage = ["huggingface"] # CI-safe, no GPU hardware required
# Deprecated stubs (llama.cpp removed in v2.0):
llama = []
llama-cuda = []
llama-vulkan = []
llama-opencl = []
- Console (`crates/console/`) is scaffolded but unimplemented. Keep isolated from runtime release changes.
- Vision work is deferred. Keep on dedicated branches.
- Launch scope is architecture/runtime path only.
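
The feature flags listed earlier decide at compile time whether the GPU engine is built in. As a minimal sketch of how Rust code can branch on such a flag, the function below uses the standard `cfg!` macro with the `airframe` feature name from the feature table. The function name `active_backend` and the returned strings are hypothetical, and this is not how shimmy itself selects its engine.

```rust
/// Reports which engine path was compiled in (hypothetical helper).
/// `cfg!` is evaluated at compile time from the enabled Cargo features.
fn active_backend() -> &'static str {
    if cfg!(feature = "airframe") {
        "airframe GPU engine"
    } else {
        // Reached with e.g. `--no-default-features --features huggingface`.
        "no GPU engine compiled in"
    }
}
```

Compiled outside a Cargo package, where no features are set, the function takes the second branch. The compiler may warn about an unrecognized `feature` value in that setting.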
- Do NOT add an `airframe/` submodule inside this repo.
- Do NOT use a path dep for airframe; it is on crates.io as `airframe = { version = "0.1", optional = true }`.
- Do NOT push without explicit user request.
- Do NOT mix vision or console feature work into launch-critical runtime changes.