I have a question about the implementation of the NF4 data type:
In the implementation, I noticed that the quantization levels are built from evenly spaced values between 0.5 and an offset of 0.9677083: eight for the negative part and nine for the positive part. I understand the general approach, but the rationale for this specific offset is unclear to me. Is it chosen arbitrarily, or is there a specific reason for this value? Based on the paper, I initially expected the offset to be 2^k/(2^k+1) = 16/17 ≈ 0.9412 (with k = 4), but that differs from the value used in the code.
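To make the question concrete, here is a minimal sketch of the construction as I understand it. It uses only the Python standard library and is not the actual implementation: `linspace` and `normal_map` are hypothetical helper names, and the point counts (9 and 8, each dropping the 0.5 endpoint), the explicit zero, and the final rescaling to [-1, 1] are my assumptions about the procedure.

```python
from statistics import NormalDist


def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop, inclusive."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def normal_map(offset, n_pos=9, n_neg=8):
    """Hypothetical sketch: 16 levels taken from standard normal quantiles."""
    ppf = NormalDist().inv_cdf  # inverse CDF of N(0, 1)
    # Drop the last point (p = 0.5, quantile 0); zero is added explicitly below.
    pos = [ppf(p) for p in linspace(offset, 0.5, n_pos)[:-1]]   # 8 positive levels
    neg = [-ppf(p) for p in linspace(offset, 0.5, n_neg)[:-1]]  # 7 negative levels
    values = sorted(neg + [0.0] + pos)
    scale = max(abs(v) for v in values)
    return [v / scale for v in values]  # assumed rescaling to [-1, 1]


# Compare the offset used in the code with the one I expected from the paper.
for name, offset in [("code", 0.9677083), ("16/17", 16 / 17)]:
    print(name, [round(v, 4) for v in normal_map(offset)])
```

Under these assumptions both offsets give 16 levels spanning [-1, 1], and only the spacing of the interior levels changes, which is why I would like to understand how 0.9677083 was chosen.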
Could you please clarify this for me? Thank you very much!