Hello! I've found a performance issue in your project: batch() should be called before map(), so that the mapped function runs once per batch instead of once per element, which could make your program more efficient. Here is the TensorFlow documentation to support it.
The affected call sites are listed below:
- /train_gpt2.py:
  train_dataset.batch(args.batch_size, drop_remainder=True) (here) should be called before ds.map(_parse_function) (here).
- /train_gpt2_keras.py:
  train_dataset.batch(args.batch_size, drop_remainder=True) (here) should be called before ds.map(_parse_function) (here) and train_dataset.map(parse_2) (here).
- /train_transformer_xl.py:
  train_dataset.batch(args.batch_size, drop_remainder=True) (here) should be called before ds.map(_parse_function) (here).
Also, please check whether the functions passed to map() (e.g., _parse_function in ds.map(_parse_function)) are affected by this change, so that the modified code still works properly. For example, if _parse_function took input of shape (x, y, z) before the fix, it would receive input of shape (batch_size, x, y, z) afterwards.
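To illustrate the shape change, here is a minimal sketch. Note that parse_single and parse_batched are hypothetical stand-ins that I made up for this example (splitting a token sequence into inputs and targets); they are not the project's actual _parse_function, and the toy data and batch size are assumptions too.

```python
import tensorflow as tf

# Hypothetical per-example function: receives one sequence of shape (5,).
def parse_single(tokens):
    return tokens[:-1], tokens[1:]

# Hypothetical batched version: receives shape (batch_size, 5),
# so the slicing has to skip the new leading batch axis.
def parse_batched(tokens):
    return tokens[:, :-1], tokens[:, 1:]

# Toy data: 8 sequences of 5 token ids each.
data = tf.reshape(tf.range(40), (8, 5))

# Current order: map() is invoked once per example, then batched.
map_then_batch = (
    tf.data.Dataset.from_tensor_slices(data)
    .map(parse_single)
    .batch(2, drop_remainder=True)
)

# Suggested order: batch() first, so map() is invoked once per batch.
batch_then_map = (
    tf.data.Dataset.from_tensor_slices(data)
    .batch(2, drop_remainder=True)
    .map(parse_batched)
)

inputs_a, targets_a = next(iter(map_then_batch))
inputs_b, targets_b = next(iter(batch_then_map))
print(inputs_a.shape, inputs_b.shape)
```

Both pipelines should yield the same batches; only the function passed to map() has to be adapted to the extra leading dimension.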
Looking forward to your reply. By the way, I would be glad to create a PR to fix this if you are too busy.