GaTech Reinforcement Learning

- HW1: Value Iteration
- HW2: TD($$\lambda$$) -- n-step TD
- HW3: SARSA
- HW4: Q-Learning
- HW5: KWIK
- HW6: Game Theory/LP for Rock-Paper-Scissors
- Project1: Reproduce TD($$\lambda$$)
- Project2: Deep Q-Learning for LunarLander
- Project3: Q-Learning, Friend/Foe-Q, CE-Q for the soccer game