refactor(optimizer): change "filter_bias_and_bn" to "weight_decay_filter" #752
Merged
geniuspatrick merged 3 commits into mindspore-lab:main on Jan 17, 2024
compatible with previous bugs.
…ht decay in optim_factory
geniuspatrick
approved these changes
Jan 17, 2024
SamitHuang
approved these changes
Jan 17, 2024
SamitHuang
reviewed
Jan 17, 2024
weight_decay_filter: filters to filter parameters from weight_decay.
    - "disable": No parameters to filter.
    - "auto": We do not apply weight decay filtering to any parameters. However, MindSpore currently
      automatically filters the parameters of Norm layer from weight decay.
Collaborator
Be specific. For "auto" weight decay, which kinds of Norm layers does this cover: BatchNorm only, or LayerNorm as well?
Collaborator
Author
All the norm layers, both BatchNorm and LayerNorm.
Motivation
Fix the bug with weight decay when filter_bias_and_bn is set:
- filter_bias_and_bn is True: the function init_group_params does not set the weight decay value for no_decay_params, so MindSpore falls back to the optimizer's weight_decay (usually not 0.0).
- filter_bias_and_bn is False: MindSpore automatically filters BatchNorm params from weight_decay.

So the name of the argument does not match what it actually does, and we can never filter the bias and norm-layer params out of weight decay, as the name filter_bias_and_bn suggests.

Due to this, we refactor it to weight_decay_filter:
- "disable": No parameters to filter.
- "auto": We do not apply weight decay filtering to any parameters. However, MindSpore currently automatically filters the parameters of Norm layer from weight decay.
- "norm_and_bias": Filter the parameters of Norm layer and Bias from weight decay.

How do I migrate from an old configuration?
- filter_bias_and_bn set to True becomes weight_decay_filter set to "disable"
- filter_bias_and_bn set to False becomes weight_decay_filter set to "auto"

BTW, we also support getting the no_weight_decay list from the model, and layer_decay.
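The migration mapping and the "norm_and_bias" behavior described above can be sketched in plain Python. This is only an illustration, not the code in optim_factory: the helper names (migrate_filter_flag, is_norm_or_bias, group_params) are hypothetical, and recognizing norm and bias parameters by the name suffixes "gamma", "beta" and "bias" is an assumption. Whatever MindSpore filters automatically in "auto" mode is not modeled. The point it shows is that the no-decay group carries an explicit weight_decay of 0.0, so it cannot fall back to the optimizer default.

```python
from typing import Dict, List

# Mapping from the migration note: old boolean value -> new string value.
_OLD_TO_NEW = {True: "disable", False: "auto"}


def migrate_filter_flag(filter_bias_and_bn: bool) -> str:
    """Translate an old filter_bias_and_bn value to a weight_decay_filter value."""
    return _OLD_TO_NEW[bool(filter_bias_and_bn)]


def is_norm_or_bias(name: str) -> bool:
    # ASSUMPTION: norm-layer parameters end in "gamma"/"beta" and biases end
    # in "bias". The real project may identify them differently.
    return name.endswith(("gamma", "beta", "bias"))


def group_params(
    param_names: List[str], weight_decay: float, weight_decay_filter: str
) -> List[Dict]:
    """Build optimizer-style parameter groups (hypothetical helper)."""
    if weight_decay_filter in ("disable", "auto"):
        # No explicit filtering here; any automatic filtering done by the
        # framework in "auto" mode is outside the scope of this sketch.
        return [{"params": list(param_names), "weight_decay": weight_decay}]
    if weight_decay_filter == "norm_and_bias":
        decay = [n for n in param_names if not is_norm_or_bias(n)]
        no_decay = [n for n in param_names if is_norm_or_bias(n)]
        return [
            {"params": decay, "weight_decay": weight_decay},
            # Set 0.0 explicitly so this group never inherits the
            # optimizer-level weight_decay.
            {"params": no_decay, "weight_decay": 0.0},
        ]
    raise ValueError(f"unknown weight_decay_filter: {weight_decay_filter!r}")
```

For example, calling group_params(["conv1.weight", "bn1.gamma", "fc.bias"], 0.05, "norm_and_bias") puts "conv1.weight" in the decayed group and the other two names in the group whose weight_decay is 0.0.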