
Statistical Simulation in Python


Python’s numpy random module is a robust and flexible tool that lets us work with random variables.


1. Introduction to random variables

A random variable is a quantity whose value is determined by chance. When the variable can take any value within an interval, so that its possible values cannot be listed one by one, it’s called a continuous random variable. Think about the height of a person. Although heights fall within reasonable limits, the actual value can be any of infinitely many possibilities in that interval. That is why we call it a continuous random variable.


2. Introduction to random variables

Similarly, if the variable can only take a finite (or countably infinite) set of values, it is called a discrete random variable. The roll of a six-sided die has exactly six possible outcomes and is thus considered a discrete random variable. Next, let’s look at probability distributions.
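
To make the distinction concrete, here is a minimal sketch that draws a few samples of each kind with numpy. The seed, the sample size, and the height parameters (mean 170 cm, standard deviation 10 cm) are illustrative assumptions, not values from the text.

```python
import numpy as np

# Seeded generator so the sketch is reproducible; the seed is arbitrary.
rng = np.random.default_rng(42)

# Discrete: a fair six-sided die. integers() excludes the upper bound,
# so high=7 yields whole numbers from 1 through 6.
die_rolls = rng.integers(low=1, high=7, size=10)

# Continuous: heights in cm, assuming (hypothetically) a normal
# distribution with mean 170 and standard deviation 10.
heights = rng.normal(loc=170.0, scale=10.0, size=10)

print(die_rolls)          # only the values 1..6 can appear
print(heights.round(1))   # any real number in a range can appear
```

The die rolls repeat often because only six values exist, while two sampled heights are almost never exactly equal.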


3. Probability distributions

A probability distribution is a mapping from each possible outcome of a random variable to the probability of observing it. It tells you how likely you are to observe a given outcome or a set of outcomes. Just like random variables, probability distributions are either discrete or continuous depending on the type of random variable they represent. For continuous random variables, the distribution is represented by a probability density function, and probability is defined over an interval, because the probability of any single exact value is zero. The normal distribution is an example of a continuous distribution.
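
As an illustration of probability over an interval, the sketch below computes the normal density and an interval probability using only the standard library. The helper names (`normal_pdf`, `normal_cdf`, `normal_interval_prob`) are hypothetical and introduced here for the example; the CDF uses the standard identity with the error function.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution at x (a density, not a probability)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_interval_prob(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b): the area under the density between a and b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

density_at_mean = normal_pdf(0.0)                   # about 0.3989
p_within_one_sd = normal_interval_prob(-1.0, 1.0)   # about 0.6827
p_single_point = normal_interval_prob(1.0, 1.0)     # exactly 0.0

print(density_at_mean, p_within_one_sd, p_single_point)
```

The zero-width interval has probability zero, which is why continuous distributions are described by a density rather than by point probabilities.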


4. Probability distributions

For discrete random variables, the distribution is represented by a probability mass function, and probability can be defined at a single point or over an interval. Among discrete distributions, binomial and Poisson distributions are widely used.
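
A short sketch of a probability mass function, evaluated at a single point and summed over an interval. The helper names and the parameters (10 trials with success probability 0.5, and a Poisson rate of 2) are assumptions chosen for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success prob p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 10, 0.5

# Probability at a single point: exactly 5 successes in 10 trials.
p_exactly_5 = binomial_pmf(5, n, p)                            # 252/1024

# Probability over an interval: between 4 and 6 successes inclusive.
p_4_to_6 = sum(binomial_pmf(k, n, p) for k in range(4, 7))     # 672/1024

# A Poisson example: probability of zero events when the rate is 2.
p_zero_events = poisson_pmf(0, 2.0)                            # exp(-2)

print(p_exactly_5, p_4_to_6, p_zero_events)
```

Unlike a density, each value returned by a mass function is itself a probability, and the values over all outcomes sum to 1.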



a remarkably useful function for simulations







Simulation basics



A simulation typically involves the following steps.

 1) Define the set of outcomes associated with a random variable. 
 2) Assign a probability to each of these outcomes. Together, these probabilities form the probability distribution.
 3) Define the relationship between multiple random variables. Steps 1 to 3 essentially describe our statistical model.
 4) Draw samples from the probability distributions. 
 5) Analyze the sample outcomes. 
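
The five steps above can be sketched with a small example: estimating the probability that two fair dice sum to 7. The choice of dice, the seed, and the number of trials are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility

# Step 1: define the set of outcomes for one die.
outcomes = np.arange(1, 7)

# Step 2: assign a probability to each outcome (a fair die).
probabilities = np.full(6, 1 / 6)

# Step 3: define the relationship between the random variables.
# Here the two dice are independent and we care about their sum.
def total(die1, die2):
    return die1 + die2

# Step 4: draw samples from the probability distributions.
n_trials = 100_000
die1 = rng.choice(outcomes, size=n_trials, p=probabilities)
die2 = rng.choice(outcomes, size=n_trials, p=probabilities)
totals = total(die1, die2)

# Step 5: analyze the sample outcomes.
estimated_p_seven = np.mean(totals == 7)
print(estimated_p_seven)  # should be near the exact value 6/36
```

With more trials the estimate settles closer to the exact value of 1/6.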


Simulation involves repeated random sampling. The first step then is to get one random sample. Once we have that, all we do is repeat the process multiple times.
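
A minimal sketch of that idea: write a function that produces one random sample, then call it many times. The function name `roll_once`, the seed, and the number of repeats are hypothetical choices for this example.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for reproducibility

def roll_once(rng):
    """Draw one random sample: a single roll of a fair six-sided die."""
    return int(rng.integers(1, 7))  # upper bound 7 is excluded

# First, get one random sample.
single_roll = roll_once(rng)

# Then repeat the same process many times.
n_repeats = 10_000
rolls = [roll_once(rng) for _ in range(n_repeats)]

# The average of many rolls should be close to the expected value 3.5.
mean_roll = sum(rolls) / n_repeats
print(single_roll, mean_roll)
```

A Python loop keeps the "one sample, repeated" structure visible; for large simulations, passing `size=n_repeats` to the generator is usually faster.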

The first two steps of running a simulation are defining a random variable and assigning probabilities to its outcomes.


Conditional Probability

Bayes Rule


Independent Events

