I tried working with the probabilities returned by predict_proba. As a sanity check, I took the argmax and used id2label to look up the label name, but got different results. After some trial and error, I confirmed that the argmax values themselves were the same, so the mismatch had to come from the label lookup. On further investigation, I found that the probability columns are ordered by the sorted labels, which may differ from the order in id2label. As a result, the "negative" and "positive" labels from SST2 only work by coincidence: "negative" sorts before "positive" and also happens to be mapped to 0, not 1.
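To make the mismatch concrete, here is a minimal sketch in plain Python. The id2label mapping and the probability row are made up for illustration, and the column order is assumed to follow the sorted label names, as described above. It does not call the library itself.

```python
# Hypothetical mapping whose ids do NOT follow alphabetical label order.
id2label = {0: "positive", 1: "negative"}

# Assumption: the probability columns follow the sorted label names.
proba_columns = sorted(id2label.values())  # ['negative', 'positive']

# Made-up probability row for a single example.
proba_row = [0.9, 0.1]

# Index of the largest probability (argmax).
best = max(range(len(proba_row)), key=proba_row.__getitem__)

# Looking the index up in id2label picks the wrong name ...
label_via_id2label = id2label[best]

# ... while indexing into the sorted column labels picks the right one.
label_via_columns = proba_columns[best]

print(label_via_id2label, label_via_columns)
```

With an SST2-style mapping of {0: "negative", 1: "positive"}, both lookups agree, which is why the problem is easy to miss.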
Another point to consider is having predict_proba also return the predictions. It is already fast, but returning the predictions alongside the probabilities would remove the need to call the prediction function separately.
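In the meantime, the predictions can be derived from the probabilities directly. The helper below is only a sketch of what I have in mind: predictions_from_proba is a hypothetical name, not an existing function, and it assumes the caller knows the label order of the probability columns.

```python
from typing import List, Sequence


def predictions_from_proba(
    proba_rows: Sequence[Sequence[float]],
    column_labels: Sequence[str],
) -> List[str]:
    """Return the most probable label for each row of probabilities.

    column_labels must be given in the same order as the probability
    columns (assumed here to be the sorted label names).
    """
    predictions = []
    for row in proba_rows:
        # argmax over the row, then map the column index to its label
        best = max(range(len(row)), key=row.__getitem__)
        predictions.append(column_labels[best])
    return predictions


# Made-up probabilities for two examples.
rows = [[0.2, 0.8], [0.7, 0.3]]
labels = predictions_from_proba(rows, ["negative", "positive"])
print(labels)
```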
I appreciate your efforts.
Best regards,
Asaf