When the location key of an external data field contains a bare file name (no path component), the tensors are not loaded and onnx2trt fails with the following:
$ onnx2trt -o dlrm_s_pytorch.trt dlrm_s_pytorch.onnx
----------------------------------------------------------------
Input filename: dlrm_s_pytorch.onnx
ONNX IR version: 0.0.6
Opset version: 11
Producer name: pytorch
Producer version: 1.8
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
Parsing model
[2020-10-15 05:32:39 WARNING] [TRT]/local/tensorRT/onnx-tensorrt/onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[2020-10-15 05:32:39 ERROR] [TRT]/local/tensorRT/onnx-tensorrt/onnx2trt_utils.cpp:1312: Failed to open file:
ERROR: /local/tensorRT/onnx-tensorrt/ModelImporter.cpp:92 In function parseGraph:
[8] Assertion failed: convertOnnxWeights(initializer, &weights, ctx)
The attached dlrm model uses external tensors whose location is a bare file name such as "bot_l.0.weight"; the expectation is that the weights will be loaded from the same directory as the model file.
The following output from the onnx Python API shows that the location values for these weights contain only a file name and no path:
>>> import onnx
>>> m = onnx.load('dlrm_s_pytorch.onnx')
>>> [(i.name, i.data_type, i.data_location, i.external_data) for i in m.graph.initializer if i.name.endswith('weight')]
[('bot_l.0.weight', 1, 1, [key: "location"
value: "bot_l.0.weight"
]), ('bot_l.2.weight', 1, 1, [key: "location"
value: "bot_l.2.weight"
]), ('emb_l.0.weight', 1, 1, [key: "location"
value: "emb_l.0.weight"
]), ('emb_l.1.weight', 1, 1, [key: "location"
value: "emb_l.1.weight"
]), ('emb_l.2.weight', 1, 1, [key: "location"
value: "emb_l.2.weight"
]), ('emb_l.3.weight', 1, 1, [key: "location"
value: "emb_l.3.weight"
]), ('top_l.0.weight', 1, 1, [key: "location"
value: "top_l.0.weight"
]), ('top_l.2.weight', 1, 1, [key: "location"
value: "top_l.2.weight"
]), ('top_l.4.weight', 1, 0, [])]
Note that this onnx model does appear to have both external_data and raw_data for the weights, but that does not appear to affect this issue.
The zip file contains the top and bottom MLP weights but excludes the embedding tables, which would make it too large to upload here.
dlrm-external-tensors.zip
The following change appears to resolve the issue:
$ git diff
diff --git a/onnx2trt_utils.cpp b/onnx2trt_utils.cpp
index ceff92d..0c905a6 100644
--- a/onnx2trt_utils.cpp
+++ b/onnx2trt_utils.cpp
@@ -1306,6 +1306,10 @@ bool parseExternalWeights(IImporterContext* ctx, std::string file, std::string p
{
path.replace(slash + 1, path.size() - (slash + 1), file);
}
+ else
+ {
+ path = file;
+ }
std::ifstream relPathFile(path, std::ios::binary | std::ios::ate);
if (!relPathFile)
{
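For reference, the resolution behavior after the patch can be mirrored in a few lines of illustrative Python (the function name is mine, not part of onnx2trt): when the model path has a directory component, the model file name is replaced with the external-data location; when it is a bare file name, the location is used as-is, so the weights are found beside the model.

```python
def resolve_external_path(model_path, location):
    # Mirror of the patched parseExternalWeights path logic: splice the
    # external-data location onto the model's directory, falling back to
    # the bare location when the model path has no directory component.
    slash = model_path.rfind('/')
    if slash != -1:
        return model_path[:slash + 1] + location
    return location

assert resolve_external_path('models/dlrm_s_pytorch.onnx', 'bot_l.0.weight') \
    == 'models/bot_l.0.weight'
assert resolve_external_path('dlrm_s_pytorch.onnx', 'bot_l.0.weight') \
    == 'bot_l.0.weight'
```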