Nice work! At the link below, someone said that just using --mmap, llama.cpp can run a big model directly (I tried but failed):
https://thoughts.jock.pl/p/local-llm-35b-mac-mini-gemma-swap-production-2026
That's why your project seems interesting.
Have you tried gemma 4 - 31b or qwen 3.5 35b a3b? And can I use the latest llama.cpp with this project?
Thanks