Hi, thanks for sharing this code repository! I've run into a KeyError during model training on custom data:
06/22/2020 22:16:45 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/msdfamily/.cache/torch/transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
06/22/2020 22:16:45 - INFO - data_loader - Loading features from cached file ./data/cached_train_snips_bert-base-uncased_50
06/22/2020 22:16:45 - INFO - data_loader - Loading features from cached file ./data/cached_dev_snips_bert-base-uncased_50
06/22/2020 22:16:45 - INFO - data_loader - Loading features from cached file ./data/cached_test_snips_bert-base-uncased_50
06/22/2020 22:16:46 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/msdfamily/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.7156163d5fdc189c3016baca0775ffce230789d7fa2a42ef516483e4ca884517
06/22/2020 22:16:46 - INFO - transformers.configuration_utils - Model config BertConfig {
"architectures": [
"BertForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"finetuning_task": "snips",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 0,
"type_vocab_size": 2,
"vocab_size": 30522
}
06/22/2020 22:16:46 - INFO - transformers.modeling_utils - loading weights file https://cdn.huggingface.co/bert-base-uncased-pytorch_model.bin from cache at /home/msdfamily/.cache/torch/transformers/f2ee78bdd635b758cc0a12352586868bef80e47401abe4c4fcc3832421e7338b.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
06/22/2020 22:16:48 - INFO - transformers.modeling_utils - Weights of JointBERT not initialized from pretrained model: ['intent_classifier.linear.weight', 'intent_classifier.linear.bias', 'slot_classifier.linear.weight', 'slot_classifier.linear.bias']
06/22/2020 22:16:48 - INFO - transformers.modeling_utils - Weights from pretrained model not used in JointBERT: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
06/22/2020 22:16:50 - INFO - trainer - ***** Running training *****
06/22/2020 22:16:50 - INFO - trainer - Num examples = 13084
06/22/2020 22:16:50 - INFO - trainer - Num Epochs = 10
06/22/2020 22:16:50 - INFO - trainer - Total train batch size = 32
06/22/2020 22:16:50 - INFO - trainer - Gradient Accumulation steps = 1
06/22/2020 22:16:50 - INFO - trainer - Total optimization steps = 4090
06/22/2020 22:16:50 - INFO - trainer - Logging steps = 200
06/22/2020 22:16:50 - INFO - trainer - Save steps = 200
Epoch:   0%|          | 0/10 [00:00<?, ?it/s]
Iteration:   0%|          | 0/409 [00:00<?, ?it/s]
/pytorch/torch/csrc/utils/python_arg_parser.cpp:756: UserWarning: This overload of add_ is deprecated:
	add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
	add_(Tensor other, *, Number alpha)
06/22/2020 22:21:18 - INFO - trainer - ***** Running evaluation on dev dataset ***** | 199/409 [04:27<05:33, 1.59s/it]
06/22/2020 22:21:18 - INFO - trainer - Num examples = 700
06/22/2020 22:21:18 - INFO - trainer - Batch size = 64
Evaluating: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:09<00:00, 1.14it/s]
Iteration: 49%|██████████████████████████████████████████████████████████████▊ | 199/409 [04:38<04:53, 1.40s/it]
Epoch: 0%| | 0/10 [04:38<?, ?it/s]
Traceback (most recent call last):
File "main.py", line 72, in <module>
main(args)
File "main.py", line 20, in main
trainer.train()
File "/home/msdfamily/projects/project-bert/JointBERT/trainer.py", line 105, in train
self.evaluate("dev")
File "/home/msdfamily/projects/project-bert/JointBERT/trainer.py", line 209, in evaluate
out_slot_label_list[i].append(slot_label_map[out_slot_labels_ids[i][j]])
KeyError: 32
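My guess is that `slot_label_map` doesn't contain an entry for id 32, i.e. one of the slot label ids produced from my data exceeds the label vocabulary. In case it's useful, here is a small sanity check I can run; the helper and the example labels are hypothetical, not taken from the repo:

```python
# Hypothetical sanity check: find slot label ids that the label vocabulary
# cannot resolve (these would raise KeyError during evaluation).
def check_label_coverage(label_ids, labels):
    """Return the set of ids with no entry in the label vocabulary."""
    slot_label_map = {i: label for i, label in enumerate(labels)}
    return {i for i in label_ids if i not in slot_label_map}

# Example labels standing in for the lines of a slot label file (assumption):
labels = ["PAD", "UNK", "O", "B-artist", "I-artist"]
print(check_label_coverage([0, 2, 4, 32], labels))  # -> {32}
```

If the check reports missing ids, that would suggest the cached features were built against a larger label set than the one currently loaded.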
Would you be able to help?