Hi! I wonder whether unsloth will support some kind of CPU offload?
For example, I would like to fine-tune a 7-8B model on a 24 GB GPU. Since LoRA usually results in reduced performance compared with full fine-tuning, it would be great if I could do a full fine-tune.
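For context, here is a rough back-of-envelope estimate of why this does not fit on the GPU without offloading. It is only a sketch: the 16 bytes per parameter figure assumes mixed-precision training with Adam (2 bytes for bf16 weights, 2 for bf16 gradients, 4 for an fp32 master copy, and 4 + 4 for the two Adam moments), and it ignores activations entirely. The function name is made up for illustration.

```python
# Assumed per-parameter cost for mixed-precision Adam (activations not counted).
BYTES_PER_PARAM = {
    "bf16_weights": 2,
    "bf16_gradients": 2,
    "fp32_master_weights": 4,
    "adam_first_moment": 4,
    "adam_second_moment": 4,
}


def estimate_training_memory_gb(n_params: int) -> float:
    """Return the estimated memory for weights, gradients and optimizer state, in GB."""
    bytes_per_param = sum(BYTES_PER_PARAM.values())  # 16 under the assumptions above
    return n_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for n_params in (7_000_000_000, 8_000_000_000):
        gb = estimate_training_memory_gb(n_params)
        print(f"{n_params / 1e9:.0f}B params: ~{gb:.0f} GB vs. 24 GB of VRAM")
```

Under these assumptions most of the footprint is optimizer state and the fp32 master copy, which is the part I would hope could live in CPU RAM.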
There seem to be some techniques for CPU offloading during training (e.g. DeepSpeed has some), not to mention the commonly seen CPU offloading for inference. However, I searched Unsloth's docs and found nothing about configuring CPU offloading.
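To be concrete about the kind of thing I mean, this is roughly what optimizer offloading looks like in a DeepSpeed config, written here as a Python dict. The keys are DeepSpeed's, not Unsloth's. I do not know whether Unsloth can be combined with it, and the values are only illustrative.

```python
import json

# DeepSpeed ZeRO config fragment (NOT an Unsloth setting; values are illustrative).
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        # Keep optimizer state in pinned CPU memory instead of on the GPU.
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
    "train_micro_batch_size_per_gpu": 1,
}

if __name__ == "__main__":
    # DeepSpeed normally reads this as a JSON file, so print it in that form.
    print(json.dumps(ds_config, indent=2))
```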
So I wonder: is this because it is impossible or has severe drawbacks (e.g. it would be 100x slower), or is it just not yet implemented / on the roadmap? Thanks!