Q-Learning Grid World

Epsilon: 0.90

Interval: 30 ms

Click on Visualize Policy to see the optimal actions for each state. Then click on Start Learning to see how the Q-values change over time.
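For readers who want to see what the demo is doing behind the buttons, here is a minimal sketch of tabular Q-learning with epsilon-greedy exploration. It is not the demo's actual source. The 4x4 grid, the reward values, the learning rate alpha, the discount gamma, and the names step, train, and greedy_policy are all assumptions made for illustration. Only the epsilon value of 0.90 comes from this page, and it is assumed here to mean the probability of taking a random action.

```python
import random

# Assumed for illustration: a 4x4 grid, start at top-left, goal at bottom-right.
ROWS, COLS = 4, 4
GOAL = (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
ARROWS = ["^", "v", "<", ">"]


def step(state, action):
    """Move within the grid; bumping into a wall leaves the state unchanged."""
    r = min(max(state[0] + ACTIONS[action][0], 0), ROWS - 1)
    c = min(max(state[1] + ACTIONS[action][1], 0), COLS - 1)
    next_state = (r, c)
    reward = 1.0 if next_state == GOAL else -0.01  # assumed reward scheme
    return next_state, reward, next_state == GOAL


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.9, seed=0):
    """Run tabular Q-learning and return the table: state -> list of 4 Q-values."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * 4 for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):  # cap on episode length
            # Epsilon-greedy: explore with probability epsilon, else act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(4)
            else:
                action = max(range(4), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Map each state to the arrow of its highest-valued action."""
    return {s: ARROWS[max(range(4), key=lambda a: v[a])] for s, v in q.items()}
```

Calling greedy_policy(train()) gives one arrow per cell, which is roughly what a policy visualization displays. Because Q-learning is off-policy, the greedy arrows can still point toward the goal even when most actions taken during training were random.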

About This Demo

This page was created as part of the supporting material for the book "Modern Keras: The Comprehensive Guide to Deep Learning with the Keras API and Python" by Mohammad Nauman (recluze). See more about it here.