llama.cpp is a C++ framework for efficient inference of LLaMA and other large language models (LLMs), featuring model quantization that enables AI applications on resource-constrained edge devices. I want to run llama.cpp on NEMU with the RISC-V 64 instruction set. Since NEMU does not provide OS emulation, how can I run llama.cpp directly (bare-metal) on NEMU? Thanks.