## Description

This model is trained from scratch with WSD + EMA, but otherwise starts the same as the other tootsies (#859 #600)
Still using DCLM+code+math for now.

## Hypothesis or Goal

Just make a real freaking good model.

### Links

* [WandB Report](https://wandb.ai/marin-community/marin/reports/Big-Tootsies--VmlldzoxMTEyOTQ0MA)
* Experiment JSON: