
DATA606 Presentation

Group 4

Members

  1. Yibo Zhang
  2. Chao Cao
  3. Jiawei Luo
  4. Tapen Katipelli

Sections

2 Markov Chain Monte Carlo (Chao)

2.1 Estimating Expectations with Markov Chains (Chao)

2.2 Ideal Behavior (Tapan)

2.3 Pathological Behavior (Jou L)

2.4 The Metropolis-Hastings Algorithm (Sam Zhang)

2 Markov Chain Monte Carlo

What is MCMC?

Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by recording states from the chain. The more steps are included, the more closely the distribution of the sample matches the actual desired distribution.

Markov chain Monte Carlo methods create samples from a continuous random variable whose probability density is proportional to a known function. These samples can be used to evaluate an integral over that variable, such as its expected value or variance.
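As a minimal sketch of this idea in Python with NumPy (the standard-normal target and the function f(x) = x² are our own illustrative choices, and for simplicity it uses direct independent sampling rather than a Markov chain):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative target: a standard normal. We estimate the expectation
# E[f(X)] for f(x) = x**2 (true value: 1.0) by averaging f over samples.
samples = rng.normal(loc=0.0, scale=1.0, size=100_000)
estimate = np.mean(samples**2)

print(f"estimate of E[X^2]: {estimate:.4f}")  # close to 1.0
```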

Difference between Monte Carlo and Markov Chain Monte Carlo

Markov Chain Monte Carlo sampling provides a class of algorithms for systematic random sampling from high-dimensional probability distributions.

Unlike Monte Carlo sampling methods, which draw independent samples from the distribution, Markov Chain Monte Carlo methods draw samples where each new sample depends on the current one; the resulting sequence of samples is called a Markov chain.
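To make the dependence concrete, here is a small sketch (our own illustrative construction, not from the slides): a Gaussian autoregressive chain in which each sample is generated from the previous one, yet whose stationary distribution is still the standard normal.

```python
import numpy as np

rng = np.random.default_rng(0)

# A simple Gaussian autoregressive chain: each new sample depends on
# the previous one, yet the chain's stationary distribution is still
# the standard normal (rho and the chain itself are illustrative).
rho = 0.9
n_steps = 100_000
chain = np.empty(n_steps)
chain[0] = 0.0
for n in range(1, n_steps):
    chain[n] = rho * chain[n - 1] + np.sqrt(1 - rho**2) * rng.normal()

# The marginal distribution matches the target...
print(f"mean ~ {chain.mean():.3f}, var ~ {chain.var():.3f}")  # ~0, ~1
# ...but consecutive samples are correlated, unlike independent draws.
print(f"lag-1 correlation ~ {np.corrcoef(chain[:-1], chain[1:])[0, 1]:.3f}")
```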

Tips

Markov chain Monte Carlo uses a Markov chain to stochastically explore the typical set, generating a random grid across the region of high probability from which we can construct accurate expectation estimates. Given sufficient computational resources, a properly designed Markov chain will eventually explore the typical set of any distribution.

Monte Carlo (MC) methods are a broad class of methods, of which MCMC is one.

Even "MCMC" does not uniquely define a method, as there are many variations of MCMC.

2.1 Estimating Expectations with Markov Chains

A Markov chain is a progression of points in parameter space generated by sequentially applying a random map known as a Markov transition. Alternatively, we can think of a Markov transition as a conditional probability density, T(q′ | q), defining to which point, q′, we are most likely to jump from the initial point, q.

An arbitrary Markov transition is not enough: the transition must preserve the target distribution, meaning that applying it to samples from the target yields samples that still follow the target,

$$\pi(q) = \int_{\mathcal{Q}} \mathrm{d}q' \, \pi(q') \, T(q \mid q').$$

So long as this condition holds, at every initial point the Markov transition will concentrate towards the typical set. Consequently, no matter where we begin in parameter space the corresponding Markov chain will eventually drift into, and then across, the typical set.
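A small discrete-state sketch (the 3-state transition matrix is a hypothetical example of ours) shows both properties: the transition preserves its target distribution, and repeated transitions from any starting point drift toward that distribution.

```python
import numpy as np

# Hypothetical 3-state example: a transition matrix T (rows sum to 1)
# preserves a target distribution pi when pi @ T == pi.
T = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

# The stationary (target) distribution is the left eigenvector of T
# with eigenvalue 1, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(T.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi /= pi.sum()

print("pi     :", np.round(pi, 4))
print("pi @ T :", np.round(pi @ T, 4))  # identical: T preserves pi

# And from any starting point, repeated transitions drift toward pi.
print("row of T^50:", np.round(np.linalg.matrix_power(T, 50)[0], 4))
```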

Given sufficient time, the history of the Markov chain, {q_0, ..., q_N}, referred to as the samples generated by the chain, becomes a convenient quantification of the typical set.

In particular, we can estimate expectations across the typical set, and hence expectations across the entire parameter space, by averaging the target function over this history.

$$\hat{f}_N = \frac{1}{N+1} \sum_{n=0}^{N} f(q_n)$$
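In code, this estimator is just an average of the target function over the chain history. The sketch below reuses the illustrative Gaussian autoregressive chain from earlier to estimate E[q²]:

```python
import numpy as np

rng = np.random.default_rng(0)

def mcmc_estimate(chain, f):
    # hat{f}_N = (1 / (N + 1)) * sum_{n=0}^{N} f(q_n)
    return np.mean(f(chain))

# Illustrative chain: the Gaussian autoregressive chain from the earlier
# sketch, whose stationary distribution is the standard normal.
rho, n_steps = 0.9, 200_000
chain = np.empty(n_steps)
chain[0] = 0.0
for n in range(1, n_steps):
    chain[n] = rho * chain[n - 1] + np.sqrt(1 - rho**2) * rng.normal()

print(f"estimate of E[q^2]: {mcmc_estimate(chain, np.square):.3f}")  # ~1.0
```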

Limitations of the expectation estimators

As we run the Markov chain for longer and longer, it will better explore the typical set and, up to some technicalities, these Markov chain Monte Carlo estimators will converge to the true expectations.

$$\lim_{N \to \infty} \hat{f}_N = \mathbb{E}_{\pi}[f]$$

Unfortunately, this asymptotic behavior is of limited use in practice because we do not have the infinite computational resources to ensure that we can always run a Markov chain long enough to achieve sufficient exploration. In order to develop a robust tool we need to understand how Markov chains behave after only a finite number of transitions.
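A quick, illustrative way to see this finite-length behavior (again using the autoregressive chain, our own stand-in for a real sampler) is to watch the same estimate at increasing chain lengths; the error shrinks, but only slowly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Watch the estimate of E[q^2] (true value 1.0) improve as the chain
# length N grows: no finite N gives the exact answer.
rho, total = 0.9, 100_000
noise = np.sqrt(1 - rho**2) * rng.normal(size=total)
chain = np.empty(total)
chain[0] = 0.0
for n in range(1, total):
    chain[n] = rho * chain[n - 1] + noise[n]

for N in (100, 1_000, 10_000, 100_000):
    print(f"N = {N:>7,}: estimate = {np.mean(chain[:N]**2):.4f}")
```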


Conclusions

Markov chain Monte Carlo uses a Markov chain to stochastically explore the typical set, generating a random grid across the region of high probability from which we can construct accurate expectation estimates.

MCMC methods are primarily used for calculating numerical approximations of multi-dimensional integrals.

By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by recording states from the chain. The more steps are included, the more closely the distribution of the sample matches the actual desired distribution.

Variations:

MCMC is not a single algorithm: various algorithms exist for constructing chains, including the Metropolis–Hastings algorithm.
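As a sketch of the idea, here is a random-walk Metropolis sampler, the special case of Metropolis–Hastings with a symmetric proposal (the standard-normal target, step size, and burn-in length are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(q):
    # Illustrative unnormalized target: a standard normal.
    return -0.5 * q**2

def random_walk_metropolis(log_target, q0, n_steps, step=1.0):
    """Random-walk Metropolis: propose a jump from the current point,
    then accept or reject it based on the target density ratio."""
    chain = np.empty(n_steps)
    chain[0] = q0
    for n in range(1, n_steps):
        q = chain[n - 1]
        proposal = q + step * rng.normal()
        # Accept with probability min(1, pi(proposal) / pi(q)).
        if np.log(rng.uniform()) < log_target(proposal) - log_target(q):
            chain[n] = proposal
        else:
            chain[n] = q
    return chain

chain = random_walk_metropolis(log_target, q0=10.0, n_steps=50_000)
burn = chain[10_000:]  # discard early samples taken before the chain reaches the typical set
print(f"mean ~ {burn.mean():.3f}, var ~ {burn.var():.3f}")  # ~0, ~1
```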

Problems:

While MCMC methods were created to address multi-dimensional problems better than generic Monte Carlo algorithms, when the number of dimensions rises they too tend to suffer the curse of dimensionality: regions of higher probability tend to stretch and get lost in an increasing volume of space that contributes little to the integral.
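One illustrative symptom of this, in a sketch of our own: with a fixed proposal step size, the acceptance rate of a random-walk Metropolis sampler collapses as the dimension grows, so the chain barely moves.

```python
import numpy as np

rng = np.random.default_rng(0)

def acceptance_rate(dim, step=1.0, n_steps=5_000):
    """Random-walk Metropolis on a dim-dimensional standard normal;
    with a fixed step size, acceptance collapses as dim grows."""
    q = np.zeros(dim)
    accepted = 0
    for _ in range(n_steps):
        proposal = q + step * rng.normal(size=dim)
        # log pi(proposal) - log pi(q) for a standard normal target.
        log_ratio = 0.5 * (q @ q - proposal @ proposal)
        if np.log(rng.uniform()) < log_ratio:
            q = proposal
            accepted += 1
    return accepted / n_steps

for dim in (1, 10, 100):
    print(f"dim = {dim:>3}: acceptance rate ~ {acceptance_rate(dim):.2f}")
```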

A Markov chain uses a random map known as a Markov transition. Nothing useful happens unless the Markov transition preserves the target distribution: if we generated an ensemble of samples from the target distribution and applied the transition, then we would get a new ensemble that was still distributed according to the target distribution.

We can estimate expectations across the typical set, and hence expectations across the entire parameter space, by averaging the target function over the chain's history.

As we run the Markov chain for longer and longer, it will better explore the typical set.