Reinforcement Learning
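Reinforcement learning (RL) frames a problem as an agent taking actions in an environment to earn rewards. The goal is to learn a policy, a mapping from each situation (state) to an action, that maximizes the total reward collected over time. Learning one means balancing exploration (trying unfamiliar actions to see what they yield) against exploitation (repeating the actions that have paid off so far).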
Example: Training a dog, where treats reinforce good behavior. Similarly, a robot learns which actions work by exploring at first, often at random, and then favoring the actions that led to reward. Historical example: AlphaGo was first trained on human expert games and then improved through self-play, adjusting its strategy based on wins and losses; its successor, AlphaGo Zero, learned from self-play alone.
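One common way to balance exploration and exploitation is an epsilon-greedy rule: with a small probability the agent picks a random action, and otherwise it picks the action it currently rates highest. The sketch below is only an illustration; the function name epsilon_greedy, the example values, and the choice of epsilon are assumptions, not something the text above prescribes.

```python
import random

def epsilon_greedy(action_values, epsilon, rng=random):
    """Pick an action index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        # Explore: choose any action uniformly at random.
        return rng.randrange(len(action_values))
    # Exploit: choose the action with the highest estimated value.
    return max(range(len(action_values)), key=lambda a: action_values[a])

# Hypothetical value estimates for three actions.
estimates = [0.1, 0.9, 0.3]
greedy_choice = epsilon_greedy(estimates, epsilon=0.0)   # never explores
mostly_greedy = epsilon_greedy(estimates, epsilon=0.1)   # explores about 10% of the time
```

With epsilon set to 0 the agent always exploits; raising it makes the agent try untested actions more often, which matters most early in training when the estimates are still unreliable.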
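Q-learning is a trial-and-error approach in which the agent learns the value of each action in each state. In its simplest (tabular) form it maintains a Q-table with one entry per state-action pair, holding the total future reward the agent expects if it takes that action in that state and acts well afterward. After every move, the entry is nudged toward the reward just received plus the discounted value of the best action available in the next state.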
Example: Teaching a dog to navigate a house. At first, its moves are random; when it finds treats, it remembers which moves worked. Over time, these memories play the role of a Q-table, an internal map that lets it choose the best action in each room. Example: A robot in a maze that receives +10 points for reaching the exit and -5 points for hitting a wall.
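The maze example can be sketched as tabular Q-learning on a tiny grid. This is a minimal sketch under stated assumptions: the 3x3 layout, the start and exit cells, the zero reward for ordinary moves, and the values of alpha (learning rate), gamma (discount factor), and epsilon (exploration rate) are all illustrative choices; only the +10 and -5 rewards come from the example above.

```python
import random

SIZE = 3                      # hypothetical 3x3 grid standing in for the maze
START, EXIT = (0, 0), (2, 2)  # assumed start and exit cells
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
ACTIONS = list(MOVES)

def step(state, action):
    """Return (next_state, reward, done) for one move."""
    row, col = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    if not (0 <= row < SIZE and 0 <= col < SIZE):
        return state, -5.0, False      # hit a wall: penalty, stay in place
    if (row, col) == EXIT:
        return (row, col), 10.0, True  # reached the exit: reward, episode ends
    return (row, col), 0.0, False      # ordinary move: assumed zero reward

def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Fill in a Q-table by repeatedly running episodes from START."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0 for r in range(SIZE) for c in range(SIZE) for a in ACTIONS}
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                        # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward, done = step(state, action)
            # Q-learning update: move the entry toward
            # reward + gamma * (best value available from the next state).
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q

def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from START and record the cells visited."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path

q_table = train()
path = greedy_path(q_table)
```

After training, printing path shows the route the robot takes when it always follows its Q-table. Early episodes wander and collect wall penalties, while later ones head for the exit, which mirrors the dog gradually remembering which moves led to treats.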