Reinforcement Learning (RL), as simply stated by Sutton and Barto (2018), is learning what to do, i.e., how to map situations to actions so as to maximize the cumulative reward over a series of steps. Naturally, trial and error is at the core of Reinforcement…