Hello, I found a performance issue in the definition of main in train_gpt2.py:
train_dataset = ds.map(_parse_function) is called without num_parallel_calls.
Adding this argument lets tf.data apply the map function to several elements in parallel, so I think it would make the input pipeline more efficient.
The same issue also exists in these calls:
train_dataset = ds.map(_parse_function),
train_dataset = train_dataset.map(parse_2),
train_dataset = ds.map(_parse_function)
Here is the TensorFlow documentation that supports this suggestion.
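
For illustration, here is a minimal sketch of the change I have in mind. It assumes TensorFlow 2.4 or later, where tf.data.AUTOTUNE is available (older 2.x releases expose it as tf.data.experimental.AUTOTUNE). The _parse_function body and the range dataset below are hypothetical stand-ins, not the actual code from train_gpt2.py:

```python
import tensorflow as tf


def _parse_function(x):
    # Hypothetical stand-in for the repository's real parsing logic.
    return x * 2


# Hypothetical toy dataset; the real script builds ds from its own input files.
ds = tf.data.Dataset.range(8)

# Suggested change: let tf.data choose the level of parallelism for the map step.
train_dataset = ds.map(_parse_function, num_parallel_calls=tf.data.AUTOTUNE)

# Element order is preserved by default, so results match the sequential version.
result = [int(x) for x in train_dataset.as_numpy_iterator()]
print(result)
```

The same argument could be passed to the train_dataset.map(parse_2) call. How much it helps depends on how expensive the mapped function is, so it would be worth measuring on the real data.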
Looking forward to your reply. By the way, I would be glad to create a PR to fix this if you are too busy.