Markov chains model systems that transition between states based solely on the current state, not the past history. They follow the Markov property: the future depends on the present, but not on the past. Random walks are a special case of Markov chains in which each step adds an independent, identically distributed increment to the current state.
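As a concrete illustration, here is a minimal sketch of a symmetric random walk on the integers (the function name and parameters are chosen for this example, not taken from any library):

```python
import random

def random_walk(steps, seed=0):
    """Symmetric random walk on the integers: from position x, move to
    x + 1 or x - 1 with probability 1/2 each. The next position depends
    only on the current position, so the walk is a Markov chain."""
    rng = random.Random(seed)
    position = 0
    path = [position]
    for _ in range(steps):
        position += rng.choice([-1, 1])
        path.append(position)
    return path

path = random_walk(10)
```

Note that nothing in the update rule consults earlier positions: that is the Markov property in its simplest form.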

Formally, a process {Xₙ} is a Markov chain if P(Xₙ₊₁ = j | Xₙ = i, Xₙ₋₁ = iₙ₋₁, ..., X₁ = i₁) = P(Xₙ₊₁ = j | Xₙ = i) for all states i, j and times n. This conditional probability P(Xₙ₊₁ = j | Xₙ = i) is called the transition probability from state i to state j.

The "memoryless" property of Markov chains makes them computationally tractable while still capturing complex behavior. A Markov chain is fully described by its transition matrix P, where element Pᵢⱼ represents the probability of moving from state i to state j in one step; each row of P sums to 1.
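The transition-matrix description can be sketched with a toy two-state chain (the "weather" states and probabilities below are invented for illustration):

```python
import random

# Hypothetical 2-state chain: state 0 = "sunny", state 1 = "rainy".
# P[i][j] is the probability of moving from state i to state j;
# each row sums to 1.
P = [[0.9, 0.1],
     [0.5, 0.5]]

def simulate(P, start, steps, seed=0):
    """Sample a trajectory: each next state is drawn using only the
    current state's row of P, never the earlier history."""
    rng = random.Random(seed)
    state = start
    states = [state]
    for _ in range(steps):
        state = rng.choices(range(len(P)), weights=P[state])[0]
        states.append(state)
    return states

trajectory = simulate(P, start=0, steps=20)
```

Because only `P[state]` is consulted at each step, the simulation loop is the memoryless property made executable.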

In machine learning, Markov chains are used for sequence modeling, text generation, recommendation systems, and reinforcement learning. Hidden Markov Models (HMMs) extend this concept to situations where the states are not directly observable, which is valuable for speech recognition and bioinformatics.
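The text-generation use case can be sketched with a first-order (bigram) Markov model over words; the function names and the tiny training sentence are made up for this example:

```python
import random
from collections import defaultdict

def build_chain(words):
    """First-order Markov text model: map each word to the list of
    words that follow it in the training text. Repeats in the list
    encode the empirical transition probabilities."""
    chain = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        chain[current].append(nxt)
    return chain

def generate(chain, start, length, seed=0):
    """Generate text by repeatedly sampling a successor of the
    current word only -- a Markov chain over the vocabulary."""
    rng = random.Random(seed)
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: no observed successor
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)

words = "the cat sat on the mat and the cat ran".split()
chain = build_chain(words)
sample = generate(chain, start="the", length=6)
```

Real text generators use far larger corpora and often higher-order contexts, but the mechanics are the same.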

PageRank, the algorithm that powered Google's early search engine success, is fundamentally a Markov chain model where web pages are states and links represent transitions. By analyzing the stationary distribution of this Markov chain, PageRank identifies the most important pages on the web.
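A simplified PageRank can be computed by power iteration on this chain; the tiny link graph and the helper name below are illustrative, and the damping factor 0.85 is the value commonly cited for the original algorithm:

```python
def pagerank(links, damping=0.85, iters=100):
    """Approximate the stationary distribution of the PageRank chain.
    `links` maps each page to the pages it links to. With probability
    `damping` the random surfer follows a random outgoing link;
    otherwise it teleports to a uniformly random page."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # Dangling page: spread its rank over all pages.
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

# Toy web: A links to B and C, B links to C, C links back to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
```

Here C ends up with the highest rank because it receives links from both A and B; the ranks form a probability distribution over pages, i.e. the chain's stationary distribution.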