Hi,
I'm training a PPO reinforcement learning model on the donkey-waveshare-v0 track. The model is overfitting to the left turns of the oval track: the car always turns left at full speed, i.e. the policy keeps outputting a (steering, throttle) action of (-1, 1). Since I'm training in the simulator, is there any way to randomize the start point, reverse the track, or use a similar trick to help the model generalize better?
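To make the "reverse the track" idea concrete, here is a rough sketch of what I have in mind: mirroring whole episodes so the policy also sees right turns. It is not anything provided by the simulator. `MirrorWrapper` and `flip_prob` are names I made up, and it assumes the observation is an H x W x C camera image and the action is `[steering, throttle]`.

```python
import numpy as np


class MirrorWrapper:
    """Hypothetical wrapper that mirrors some episodes left-to-right.

    Assumptions (not checked here): observations are H x W x C images and
    actions are [steering, throttle] with steering symmetric around 0.
    """

    def __init__(self, env, flip_prob=0.5, seed=None):
        self.env = env
        self.flip_prob = flip_prob
        self.rng = np.random.default_rng(seed)
        self.flipped = False

    def _mirror_obs(self, obs):
        # Flip along the width axis so left turns look like right turns.
        return np.flip(obs, axis=1).copy() if self.flipped else obs

    def reset(self, **kwargs):
        # Decide once per episode whether this episode is mirrored.
        self.flipped = self.rng.random() < self.flip_prob
        out = self.env.reset(**kwargs)
        if isinstance(out, tuple):  # newer API style: (obs, info)
            return (self._mirror_obs(out[0]),) + tuple(out[1:])
        return self._mirror_obs(out)

    def step(self, action):
        action = np.array(action, dtype=np.float32, copy=True)
        if self.flipped:
            # The agent acts in the mirrored world, so undo the mirroring
            # on the steering command before sending it to the real env.
            action[0] = -action[0]
        out = self.env.step(action)
        # Works for both 4-tuple and 5-tuple step results.
        return (self._mirror_obs(out[0]),) + tuple(out[1:])
```

The idea would be to wrap the environment before handing it to PPO, so roughly half of the episodes look like a right-turning oval. A real version would probably subclass the gym wrapper base class so the observation and action spaces are forwarded.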
BR