Problem with MLM pre-training #15

@rattlesnakey

Description

When a MASK token is being predicted, it does not perform self-attention with the other MASK tokens, so the attention_mask entries for those other MASK tokens should be 0.
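To make the request concrete, here is a minimal sketch of the masking described above. It is not taken from this repository: the function name `build_mlm_attention_mask`, the token ids, and the choice to leave rows for non-MASK query positions unchanged are all assumptions made for illustration. It builds a per-query (2D) mask in which a MASK position can still see itself and every real token, but not the other MASK positions or padding.

```python
from typing import List

# Hypothetical id for the [MASK] token; use the tokenizer's real value.
MASK_TOKEN_ID = 103


def build_mlm_attention_mask(
    input_ids: List[int],
    padding_mask: List[int],
    mask_token_id: int = MASK_TOKEN_ID,
) -> List[List[int]]:
    """Return a [seq_len][seq_len] mask; entry [q][k] is 1 if query q may attend to key k."""
    seq_len = len(input_ids)
    is_mask = [tok == mask_token_id for tok in input_ids]
    mask_2d = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            # Padding keys are never visible.
            visible = padding_mask[k] == 1
            # Assumption: only MASK queries are blocked from *other* MASK keys.
            if is_mask[q] and is_mask[k] and q != k:
                visible = False
            row.append(1 if visible else 0)
        mask_2d.append(row)
    return mask_2d


# Toy sequence: two MASK positions (index 2 and 4) and one padding position (index 6).
input_ids = [101, 7592, 103, 2088, 103, 102, 0]
padding_mask = [1, 1, 1, 1, 1, 1, 0]
mask = build_mlm_attention_mask(input_ids, padding_mask)
for row in mask:
    print(row)
```

In the toy sequence, the row for the MASK at index 2 has a 0 at index 4 and vice versa, while ordinary tokens keep the usual padding-only mask. Whether a given model accepts a per-query mask of this shape instead of the usual 1D padding mask depends on its implementation.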
