Description
Match Olmo v2 SFT on core evals (MMLU) and instruction following (AlpacaEval).
Hypothesis or Goal
Match the performance of Olmo v2 SFT using this dataset.
Links
- WandB Report: (link)
- Data Browser: (link)
- Experiment JSON: (link)
- (etc.)
Results
(What did you find, including relevant evaluation metrics, etc.)