Hi, thanks for publishing the paper and sharing the source code.
I noticed that `attn_output` is never used after it is defined.
When training RoBERTa for parameter-efficient learning, the paper version of prefix tuning does not seem to work properly.
Could you please look into it?
In `unify-parameter-efficient-tuning/src/transformers/models/roberta/modeling_roberta.py`, lines 391 to 392 at commit `3222ce2`:

```python
if self.config.attn_mode != "none" and self.config.attn_composition == "gate_add":
    attn_output = context_layer * w_attn + cross_attn_output * w_prefix
```
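To illustrate what I mean, here is a minimal sketch using plain floats in place of tensors (all values are made up for illustration; the "fix" line is only my guess at the intent, not a confirmed patch): since `attn_output` is never read afterwards, the gated prefix contribution is silently dropped, and presumably the composed value should replace `context_layer` so it reaches the output projection.

```python
# Stand-ins for the tensors in the snippet (illustrative scalars only).
context_layer = 1.0      # self-attention output
cross_attn_output = 3.0  # prefix (cross-attention) output
w_attn, w_prefix = 0.75, 0.25  # hypothetical gate weights

# As written in the repo, this result is computed but never used,
# so the forward pass continues with the un-composed context_layer:
attn_output = context_layer * w_attn + cross_attn_output * w_prefix

# Hypothetical fix: assign the composition back so it flows onward.
context_layer = attn_output
print(context_layer)  # 1.0 * 0.75 + 3.0 * 0.25 = 1.5
```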
Thanks!