Commit c9ab349
[feat] Model version control using W&B Artifacts (#1137)
Summary:
🚀 I have extended the `WandbLogger` with the ability to log the `current.pt` checkpoint as a W&B Artifact. Note that this PR is based on top of this [PR](#1129).
### What is W&B Artifacts?
> W&B Artifacts was designed to make it effortless to version your datasets and models, regardless of whether you want to store your files with us or whether you already have a bucket you want us to track. Once you've tracked your dataset or model files, W&B will automatically log each and every modification, giving you a complete and auditable history of changes to your files.
With this PR, W&B Artifacts can be used to save and organize machine learning models throughout a project's lifecycle. More details are in the documentation [here](https://docs.wandb.ai/guides/artifacts/model-versioning).
### Modification
This PR adds a `log_model_checkpoint` method to the `WandbLogger` class in the `utils/logger.py` file. This method is called in the `utils/checkpoint.py` file.
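For illustration, here is a minimal sketch of what such a method could look like. It is not the code from this PR: the standalone function form, the `artifact_factory` parameter (added only so the sketch can run without a W&B account), and the explicit `latest` alias are assumptions. The `run_<id>_model` name and the `model` type mirror the artifact linked in the Result section.

```python
def log_model_checkpoint(run, model_path, artifact_factory=None, metadata=None):
    """Log the checkpoint file at `model_path` as a W&B model artifact (sketch)."""
    if artifact_factory is None:
        # Imported lazily so the sketch can be exercised without wandb installed.
        import wandb

        artifact_factory = wandb.Artifact

    # Name and type follow the pattern of the artifact linked in this PR.
    artifact = artifact_factory(
        f"run_{run.id}_model", type="model", metadata=metadata or {}
    )
    # Store the file under a stable name inside the artifact.
    artifact.add_file(model_path, name="current.pt")
    # Assumption: every upload is tagged "latest".
    run.log_artifact(artifact, aliases=["latest"])
    return artifact
```

In real use, `run` would be the object returned by `wandb.init(...)`, and each call would create a new version of the same artifact.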
### Usage
To use this, set `training.wandb.enabled=true` and `training.wandb.log_checkpoint=true` in `configs/defaults.yaml`.
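As a rough illustration of how the two flags combine, the sketch below uses a plain nested dict as a stand-in for the MMF config object. The helper name `checkpoint_logging_enabled` is hypothetical and is not part of this PR; only the two key paths come from the description above.

```python
def checkpoint_logging_enabled(config):
    """Return True only when both W&B flags are switched on (sketch)."""
    wandb_cfg = config.get("training", {}).get("wandb", {})
    return bool(wandb_cfg.get("enabled", False)) and bool(
        wandb_cfg.get("log_checkpoint", False)
    )


# Stand-in for the relevant part of the config after both flags are set.
config = {"training": {"wandb": {"enabled": True, "log_checkpoint": True}}}
print(checkpoint_logging_enabled(config))
```

Checkpoint logging is treated as off whenever either flag is missing or false, so enabling `log_checkpoint` alone has no effect in this sketch.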
### Result
The screenshot shows the `current.pt` checkpoints saved at intervals defined by `training.checkpoint_interval`. You can check out the logged artifacts page [here](https://wandb.ai/ayut/mmf/artifacts/model/run_ey9xextf_model/0dc64164acbdc300fd01/api).

### Superpowers
With this small addition, one can easily track different versions of the model, download a checkpoint of interest using the snippet in the artifact's API tab, share checkpoints with teammates, and so on.
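Downloading a logged checkpoint could look roughly like the sketch below. The helper `download_checkpoint` and its `api` parameter are hypothetical (the parameter exists so the sketch can be tried with a stub). The `entity/project/run_<id>_model` naming is taken from the artifact URL in this PR, and `latest` is the alias assumed for the most recent version.

```python
def download_checkpoint(entity, project, run_id, alias="latest", api=None):
    """Fetch one version of a run's model artifact and return its local dir (sketch)."""
    if api is None:
        # Imported lazily so the sketch can be exercised without wandb installed.
        import wandb

        api = wandb.Api()

    reference = f"{entity}/{project}/run_{run_id}_model:{alias}"
    artifact = api.artifact(reference, type="model")
    # download() returns the directory the files were written to.
    return artifact.download()
```

For the artifact linked above, a call might be `download_checkpoint("ayut", "mmf", "ey9xextf")`, assuming the caller has access to that project.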
### Requests
This is a draft PR as there are a few more things that can be improved here.
* Is there a better way to access the path to the `current.pt` checkpoint? Or rather, is the modification made to `utils/checkpoint.py` an acceptable way of approaching this?
* When logging a file as a W&B Artifact, we can also attach metadata to it. In this case, we could add the current iteration, training metrics, etc. as metadata. I would love suggestions about which data points to log as metadata alongside the checkpoints.
* How can we determine whether a checkpoint is the best one? If a checkpoint is the best, I can add `best` as an alias for that checkpoint's artifact.
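To make the last two questions concrete, here is one possible shape for the metadata and alias logic. This is only a sketch for discussion, not what the PR implements: both helper names are hypothetical, and the idea of comparing a single monitored metric against the best value seen so far is an assumption.

```python
def checkpoint_metadata(iteration, metrics):
    """Flatten the iteration and scalar metrics into an artifact metadata dict."""
    metadata = {"iteration": int(iteration)}
    metadata.update({name: float(value) for name, value in metrics.items()})
    return metadata


def checkpoint_aliases(current, best_so_far, minimize=True):
    """Return the aliases for a checkpoint, adding "best" when it improves."""
    aliases = ["latest"]
    if best_so_far is None:
        is_best = True  # the first checkpoint is the best by definition
    elif minimize:
        is_best = current < best_so_far
    else:
        is_best = current > best_so_far
    if is_best:
        aliases.append("best")
    return aliases
```

The returned values could then be passed as the `metadata` argument of `wandb.Artifact` and the `aliases` argument of `log_artifact`.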
Pull Request resolved: #1137
Test Plan:
Imported from GitHub, without a `Test Plan:` line.
**Static Docs Preview: mmf**
|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D32402090/V6/mmf/)|
|**Modified Pages**|
|[docs/notes/logger](https://our.intern.facebook.com/intern/staticdocs/eph/D32402090/V6/mmf/docs/notes/logger/)|
Reviewed By: apsdehal
Differential Revision: D32402090
Pulled By: ebsmothers
fbshipit-source-id: 94b881ec55c4197301331d571bc926521e2feecc
1 parent b6a5804 commit c9ab349
File tree
5 files changed, +115 -33 lines changed
- mmf
  - configs
  - trainers/callbacks
  - utils
- website/docs/notes