This repository is a clone of https://github.dev/eric-mitchell/direct-preference-optimization/blob/main/trainers.py and is intended for educational purposes. It contains helper functions for use with a DPO (Direct Preference Optimization) implementation.
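To give a sense of the kind of helper a DPO implementation relies on, here is a minimal sketch of the standard DPO loss for a single preference pair, written with only the Python standard library. It is an illustration, not the code in this repository: the names `log_sigmoid` and `dpo_loss`, the argument names, and the default `beta=0.1` are all assumptions made for the example, and the inputs are assumed to be summed sequence log-probabilities under the policy and a frozen reference model.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # hypothetical default; treat as a tunable assumption
):
    """Return (loss, chosen_reward, rejected_reward) for one preference pair."""
    # How much more the policy favours the chosen response than the rejected one.
    policy_logratio = policy_chosen_logp - policy_rejected_logp
    # The same quantity under the frozen reference model.
    ref_logratio = ref_chosen_logp - ref_rejected_logp
    # The loss is small when the policy prefers the chosen response
    # more strongly than the reference does.
    logits = policy_logratio - ref_logratio
    loss = -log_sigmoid(beta * logits)
    # Implicit rewards: scaled log-ratio of policy to reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    return loss, chosen_reward, rejected_reward


if __name__ == "__main__":
    # When the policy equals the reference, the loss is log(2).
    loss, _, _ = dpo_loss(-12.0, -15.0, -12.0, -15.0)
    print(loss)
```

In a real training loop the same arithmetic would typically be applied to batched tensors and averaged; the scalar version here is only meant to make the formula easy to follow.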