Scaling laws to predict tootsie performance #654

@dlwh

Description

Use the framework we're creating in #646 to predict the performance of the tootsie run, mostly as a proof of concept (PoC).

Hypothesis or Goal

Verify that we can predict the performance of our 8B model on a variety of metrics from smaller runs using WSD-S.

Metrics of interest:

  • c4_en/bpb
  • lm_eval/*/acc_norm
  • lm_eval/*/bpb (when we add it)
  • lm_eval/*/forced_choice_bpb (when we add it)
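
As a rough illustration of the kind of extrapolation this hypothesis implies, here is a minimal sketch that fits a saturating power law, loss ≈ E + A · N^(-alpha), to a handful of small runs and evaluates it at 8B parameters. Everything in it is an assumption made for illustration: the functional form, the function names `fit_power_law` and `predict`, the model sizes, and the loss values (which are synthetic, generated from a known curve, and not measurements from any real run). It does not reflect the framework in #646.

```python
import numpy as np


def fit_power_law(n_params, losses, num_grid=200):
    """Fit loss ~= E + A * N**(-alpha).

    E is grid-searched; for each candidate E the remaining problem is
    linear in log space and is solved with an ordinary least-squares line.
    """
    n = np.asarray(n_params, dtype=float)
    y = np.asarray(losses, dtype=float)
    best = None
    # E must stay below the smallest observed loss so log(y - E) is defined.
    for e in np.linspace(0.0, y.min() * 0.999, num_grid):
        slope, intercept = np.polyfit(np.log(n), np.log(y - e), 1)
        pred = e + np.exp(intercept) * n ** slope
        err = float(np.sum((pred - y) ** 2))
        if best is None or err < best[0]:
            best = (err, float(e), float(np.exp(intercept)), float(-slope))
    _, e, a, alpha = best
    return e, a, alpha


def predict(n_params, e, a, alpha):
    """Evaluate the fitted law at the given parameter count(s)."""
    return e + a * np.asarray(n_params, dtype=float) ** (-alpha)


# Synthetic stand-in for a metric such as c4_en/bpb on small runs.
# These numbers come from a made-up ground-truth curve, not from real runs.
TRUE_E, TRUE_A, TRUE_ALPHA = 0.7, 20.0, 0.25
small_sizes = np.array([1.5e8, 3e8, 6e8, 1.4e9])
small_losses = predict(small_sizes, TRUE_E, TRUE_A, TRUE_ALPHA)

e_hat, a_hat, alpha_hat = fit_power_law(small_sizes, small_losses)
predicted_8b = float(predict(8e9, e_hat, a_hat, alpha_hat))
true_8b = float(predict(8e9, TRUE_E, TRUE_A, TRUE_ALPHA))
print(f"predicted at 8B: {predicted_8b:.4f} (synthetic truth: {true_8b:.4f})")
```

With real data the same fit would be run once per metric, and accuracy-style metrics such as `lm_eval/*/acc_norm` may need a different functional form than loss-style metrics such as bpb.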

Links

(Delete any that aren't applicable)

Results

(What did you find, including relevant evaluation metrics, etc.)

Metadata

Assignees

Labels

  • experiment
  • needs-discussion: This issue needs to be discussed (scope, priorities, whether to close, etc.).
  • p1: Do right now
