# Reinforcement Learning ______ ## Q-Learning - Is an algorithm where the agent attempts to learn what the optimal policy is from its history of interacting with the environment. **Steps** - We first initialize the `Q-Table`, which is the backbone of the Q-Learning Algorithm. - This is what stores all the Q-Values for any given state/action pair the agent will encounter in the environment. - In order to start populating these values with meaningful numbers the agent needs to randomly select an action at any given state and collect the associated reward. - If the action was bad then the Q-Value that state/action pair will decrease. On the other hand, if the action was good then the opposite happens and the `Q-Value` increases. - Eventually at some point the agent needs to stop exploring and start exploiting the values and information in the Q-Table. - This is where the policies such as `epsilon-greedy` come into play.