🐛 Describe the bug
torch.onnx.export raises a ValueError when exporting an ExportedProgram whose module combines a registered buffer (register_buffer) with an in-place tensor assignment.
The root cause is in _handle_call_function_node_with_lowering() in torch/onnx/_internal/exporter/_core.py. When aten.copy.default is translated to ONNX, op.CastLike(src, self) returns the same IR value object as its input (an identity passthrough when the dtypes match), but line ~141 unconditionally renames that object with outputs.name = node.name, overwriting the original placeholder name.
Error message
ValueError: Key 'b_prompt_feat' does not match the name of the value 'copy'.
Please use the value.name as the key.
Reproduction
import torch
import torch.nn as nn


class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer('prompt_feat', torch.zeros(1, 4, 8))
        self.linear = nn.Linear(8, 8)

    def forward(self, x):
        out = torch.zeros(1, 4, 8)
        out[:, :, :] = self.prompt_feat.to(x.device)
        return self.linear(out + x)


model = Model().eval()
x = torch.randn(1, 4, 8)
ep = torch.export.export(model, (x,))
torch.onnx.export(ep, args=(x,), f="test.onnx")  # ValueError
Root cause analysis
1. Export creates buffer placeholder
torch.export converts self.prompt_feat to a graph placeholder named b_prompt_feat:
graph_signature.inputs_to_buffers: 'b_prompt_feat' → 'prompt_feat'
2. ONNX decomposition creates aten.copy.default
The in-place copy_ is decomposed into:
[b_prompt_feat] placeholder [1, 4, 8] float32
[slice] aten.slice(zeros, ...)
[copy] aten.copy.default(slice, b_prompt_feat) ← both float32
[slice_scatter] aten.slice_scatter(zeros, copy, ...)
3. aten_copy ONNX handler returns input by reference
# torchlib: aten_copy
@torch_op("aten::copy", trace_only=True)
def aten_copy(self, src, non_blocking=False):
    return op.CastLike(src, self)
When src and self have the same dtype (both float32), CastLike returns the same IR value object as src — no new node is created.
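The passthrough described above can be illustrated with a toy model. This is only a sketch: `Value` and `cast_like` are hypothetical stand-ins written for this example, not the onnx-ir or onnxscript API, and the same-dtype shortcut is assumed from the behavior reported in this issue.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare by identity, as graph values are distinct objects
class Value:
    """Hypothetical stand-in for an IR value (not the onnx-ir class)."""
    name: str
    dtype: str


def cast_like(src: Value, target: Value) -> Value:
    """Toy CastLike: assumed to skip the cast when the dtypes already match."""
    if src.dtype == target.dtype:
        return src  # no new node: the caller gets the input object back
    return Value(name=f"{src.name}_cast", dtype=target.dtype)


placeholder = Value("b_prompt_feat", "float32")
slice_out = Value("slice", "float32")

same = cast_like(placeholder, slice_out)
other = cast_like(placeholder, Value("half_target", "float16"))

print(same is placeholder)   # True
print(other is placeholder)  # False
```

Only the same-dtype call hands back the placeholder itself, which is why any later mutation of the "output" also mutates the placeholder.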
4. outputs.name = node.name destroys the placeholder name
In _handle_call_function_node_with_lowering():
# line ~139-141
node_name_to_values[node.name] = outputs # values['copy'] = value_A
outputs.name = node.name # value_A.name = 'copy' ← BUG
Since outputs is the same object as values['b_prompt_feat'], this overwrites the placeholder's name from 'b_prompt_feat' to 'copy'.
5. Initializer registration fails
# _exported_program_to_onnx_program, line ~140
model.graph.initializers['b_prompt_feat'] = value
# key='b_prompt_feat' but value.name='copy' → ValueError
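Steps 4 and 5 can be reproduced together without torch. The sketch below uses hypothetical `Value` and `Initializers` classes; the key check is an assumption modeled on the error message above, not the real onnx-ir implementation.

```python
class Value:
    """Hypothetical IR value with a mutable name."""
    def __init__(self, name):
        self.name = name


class Initializers(dict):
    """Toy mapping that assumes each key must equal value.name."""
    def __setitem__(self, key, value):
        if key != value.name:
            raise ValueError(
                f"Key {key!r} does not match the name of the value {value.name!r}."
            )
        super().__setitem__(key, value)


values = {"b_prompt_feat": Value("b_prompt_feat")}

# Lowering of the `copy` node: the handler hands back the placeholder itself.
outputs = values["b_prompt_feat"]
values["copy"] = outputs
outputs.name = "copy"  # renames the placeholder through the alias

initializers = Initializers()
try:
    initializers["b_prompt_feat"] = values["b_prompt_feat"]
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

The rename and the failure happen in different places, which is why the traceback points at initializer registration rather than at the lowering of `copy`.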
Suggested fix
In _handle_call_function_node_with_lowering(), rename the output only when it has a producer node, so that an existing value returned unchanged by the handler (such as a placeholder) keeps its name:
# Before (line ~141):
outputs.name = node.name
# After:
if outputs.producer() is not None:
    # Output was produced by an ONNX node: safe to rename
    outputs.name = node.name
# else: output is an existing value (identity passthrough): do not rename
Alternatively, ensure aten_copy always creates a new ONNX node (e.g. op.Identity(op.CastLike(src, self))).
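The guarded rename can be checked in isolation with a toy model. In this sketch `Value` and `record_output` are hypothetical names; only the `producer()` check is taken from the suggested fix, and the real exporter code is assumed to do more than this.

```python
class Value:
    """Hypothetical IR value; producer() mirrors the call used in the suggested fix."""
    def __init__(self, name, producer=None):
        self.name = name
        self._producer = producer

    def producer(self):
        return self._producer


def record_output(node_name, outputs, node_name_to_values):
    """Register the lowered output, renaming it only if a node produced it."""
    node_name_to_values[node_name] = outputs
    if outputs.producer() is not None:
        outputs.name = node_name


placeholder = Value("b_prompt_feat")       # graph input: no producer
fresh = Value("tmp_0", producer="Add")     # output of a newly created node

table = {}
record_output("copy", placeholder, table)  # passthrough: name is kept
record_output("add", fresh, table)         # new value: renamed to the node name

print(placeholder.name, fresh.name)
```

Note that this guard only protects values without a producer, such as graph inputs and initializers. A passthrough of a value produced by an earlier node would still be renamed, so the op.Identity alternative may be the more general option.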
Versions
- torch: 2.9.0 ~ 2.11.0 (all reproduce the same bug)
- onnx-ir: 0.1.12, 0.2.0 (both reproduce)
- Python: 3.10
cc @justinchuby @titaiwangms
Versions
tested 2.9, 2.10, 2.11