[ROCm] Optimize kgemm_4bit_inference_naive for ROCm, use it for batch sizes other than 1 #255

CUDA (linux-x64, L40S, 12.8.1)  /  test

succeeded Apr 22, 2026 in 6m 24s