
Statistical Simulation in Python


Python’s numpy random module is a robust and flexible tool that lets us work with random variables.

https://datascience103579984.wordpress.com/2019/09/26/statistical-simulation-in-python-from-datacamp/


1. Introduction to random variables

A random variable is a quantity that can take on multiple values based on random chance. When the variable can take any value within an interval, so that its possible values are uncountably infinite, it is called a continuous random variable. Think about a person’s height. Although heights fall within reasonable limits, the actual value can be any of infinitely many possibilities within that range. That is why height is modeled as a continuous random variable.
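To make the height example concrete, here is a minimal sketch that draws a few heights from a normal distribution. The mean of 170 cm and standard deviation of 10 cm are assumed values chosen only for illustration; they do not come from the notes.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed, illustrative parameters: mean 170 cm, standard deviation 10 cm
heights = np.random.normal(loc=170, scale=10, size=5)

# Each draw is a real number; any value in an interval is possible
print(heights)
```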

2. Introduction to random variables

Similarly, if the variable can take only a countable set of values (finite, or countably infinite such as 0, 1, 2, ...), it is called a discrete random variable. A roll of a six-sided die has exactly six possible outcomes and is thus a discrete random variable. Next, let’s look at probability distributions.
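The die roll described above can be sketched with np.random.choice. The sample size of 10 is an arbitrary choice for illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

die = [1, 2, 3, 4, 5, 6]  # the finite set of possible outcomes

# Ten independent rolls; every result is one of the six faces
rolls = np.random.choice(die, size=10)
print(rolls)
```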

3. Probability distributions

A probability distribution is a mapping from the set of possible outcomes of a random variable to the probability of observing each outcome. It tells you how likely you are to observe a given outcome or set of outcomes. Just like random variables, probability distributions are either discrete or continuous, depending on the type of random variable they represent. For continuous random variables, the distribution is represented by a probability density function, and probability is defined over an interval, because the probability of any single exact value is zero. The normal distribution is an example of a continuous distribution.
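As a rough illustration of probability over an interval, the sketch below estimates the chance that a standard normal variable falls between -1 and 1 by sampling. The interval and sample size are assumptions made for the example.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

# Draw many samples from a standard normal distribution
samples = np.random.normal(loc=0, scale=1, size=100_000)

# Fraction of samples inside (-1, 1) approximates P(-1 < X < 1),
# which is roughly 0.68 for a standard normal
p_interval = np.mean((samples > -1) & (samples < 1))
print(p_interval)
```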

4. Probability distributions

For discrete random variables, the distribution is represented by a probability mass function, and probability can be defined at a single point or over an interval. Among discrete distributions, the binomial and Poisson distributions are widely used. Python’s numpy random module is a robust and flexible tool that lets us work with random variables.
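A probability mass function can be approximated from samples. The sketch below assumes a binomial variable with 10 trials and success probability 0.5 (illustrative values) and compares the observed frequency of exactly 5 successes with the exact value 252/1024.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed example: number of heads in 10 fair coin flips, repeated many times
heads = np.random.binomial(n=10, p=0.5, size=100_000)

# Relative frequency of each count 0..10 approximates the PMF
pmf_estimate = np.bincount(heads, minlength=11) / heads.size

exact_p5 = 252 / 1024  # C(10, 5) / 2**10
print(pmf_estimate[5], exact_p5)
```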

np.random.choice()

A remarkably useful function for simulations: it draws random samples from a given array.

https://numpy.org/doc/stable/reference/random/generated/numpy.random.choice.html


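Note: the True and False values passed to replace must be capitalized, because they are Python booleans (replace=True, not replace=true).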
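A short sketch of how the replace and p arguments of np.random.choice behave. The deck of ten numbered cards and the biased coin with weights 0.7 and 0.3 are hypothetical examples.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

deck = np.arange(1, 11)  # hypothetical deck of cards numbered 1..10

# replace=False: sampling without replacement, so no card repeats
hand = np.random.choice(deck, size=5, replace=False)

# replace=True with p: repeated draws from a biased coin (assumed weights)
flips = np.random.choice(["H", "T"], size=10, replace=True, p=[0.7, 0.3])

print(hand)
print(flips)
```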

numpy.random.poisson
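A minimal sketch of numpy.random.poisson, assuming a rate of 4 events per interval (an illustrative value). For a Poisson distribution the mean and variance both equal the rate, which the sample statistics should roughly reflect.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed rate: on average 4 events per interval
counts = np.random.poisson(lam=4, size=100_000)

# Both values should be close to lam
sample_mean = counts.mean()
sample_var = counts.var()
print(sample_mean, sample_var)
```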



2022-01-30

Simulation basics


Steps

Simulations typically involve the following steps.

 1) Define the set of outcomes associated with a random variable. 
 2) Assign a probability to each of these outcomes - the probability distribution. 
 3) Define the relationship between multiple random variables. These three steps essentially describe our statistical model.
 4) Draw samples from the probability distributions. 
 5) Analyze the sample outcomes. 
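The five steps above can be sketched with a toy model. The choice of two fair dice and the question "how often is the total 7?" are assumptions made for this example.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

# Step 1: outcomes of the random variable (one fair die)
outcomes = [1, 2, 3, 4, 5, 6]

# Step 2: probability of each outcome
probabilities = [1 / 6] * 6

# Step 4: draw samples for two independent dice
n_sims = 100_000
die1 = np.random.choice(outcomes, size=n_sims, p=probabilities)
die2 = np.random.choice(outcomes, size=n_sims, p=probabilities)

# Step 3: relationship between the random variables (their sum)
total = die1 + die2

# Step 5: analyze the samples; the exact answer is 6/36 = 1/6
p_seven = np.mean(total == 7)
print(p_seven)
```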


Simulation involves repeated random sampling. The first step then is to get one random sample. Once we have that, all we do is repeat the process multiple times.
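One way to express "get one sample, then repeat" is a function for a single trial plus a loop. The game here (roll two dice and win if they match) and the helper name one_trial are hypothetical, chosen only to illustrate the pattern.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

die = [1, 2, 3, 4, 5, 6]

def one_trial():
    """One random sample: roll two dice and report whether they match."""
    roll = np.random.choice(die, size=2)
    return bool(roll[0] == roll[1])

# Repeat the single trial many times and average the results
n_sims = 20_000
wins = sum(one_trial() for _ in range(n_sims))
win_rate = wins / n_sims  # exact probability of a match is 1/6
print(win_rate)
```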

The first two steps of running a simulation are defining a random variable and assigning probabilities to its outcomes.

https://medium.com/@manilwagle/probability-and-simulation-6a28fc1f1cb0

Conditional Probability
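The conditional probability of B given A is P(B|A) = P(AB) / P(A), defined when P(A) > 0. As a sketch, it can be estimated by keeping only the simulated trials where A happened. The events below (A: first die shows 6, B: total is at least 10) are assumptions for the example; the exact answer is 0.5.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

n_sims = 200_000
faces = [1, 2, 3, 4, 5, 6]
die1 = np.random.choice(faces, size=n_sims)
die2 = np.random.choice(faces, size=n_sims)

a = die1 == 6             # event A: first die shows 6
b = (die1 + die2) >= 10   # event B: total is at least 10

# Estimate P(B|A) by looking only at trials where A occurred
p_b_given_a = np.mean(b[a])

# Same quantity from the definition P(AB) / P(A)
p_from_definition = np.mean(a & b) / np.mean(a)
print(p_b_given_a, p_from_definition)
```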

Bayes Rule
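Bayes' rule states that P(A|B) = P(B|A) P(A) / P(B). The sketch below checks it on a hypothetical two-urn setup: an urn is picked at random (probability 0.5 each), urn 1 yields a red ball with probability 0.75 and urn 2 with probability 0.25. All of these numbers are assumed for illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

n_sims = 200_000

# Hypothetical setup: pick urn 1 or urn 2 with equal probability
urn = np.random.choice([1, 2], size=n_sims)

# Assumed chance of drawing red: 0.75 from urn 1, 0.25 from urn 2
p_red_given_urn = np.where(urn == 1, 0.75, 0.25)
red = np.random.random(n_sims) < p_red_given_urn

# Bayes' rule: P(urn 1 | red) = P(red | urn 1) * P(urn 1) / P(red)
p_red = 0.5 * 0.75 + 0.5 * 0.25
bayes_answer = 0.75 * 0.5 / p_red

# Simulation estimate: among red draws, how often was it urn 1?
simulated_answer = np.mean(urn[red] == 1)
print(bayes_answer, simulated_answer)
```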


Independent Events

Events A and B are independent when P(AB) = P(A)P(B), where P(AB) is the probability that both occur.
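The product rule for independent events can be checked by simulation. In this sketch the events are assumptions: A is "first die is even" and B is "second die is greater than 4", which concern different dice and so should be independent. For contrast, C is "total is at least 10", which depends on the first die.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

n_sims = 200_000
faces = [1, 2, 3, 4, 5, 6]
die1 = np.random.choice(faces, size=n_sims)
die2 = np.random.choice(faces, size=n_sims)

a = die1 % 2 == 0   # event A: first die is even
b = die2 > 4        # event B: second die is 5 or 6

# Independent events: P(AB) should be close to P(A) * P(B) = 1/6
p_ab = np.mean(a & b)
p_a_times_p_b = np.mean(a) * np.mean(b)

# Dependent events for contrast: D = first die is 6, C = total >= 10
d = die1 == 6
c = (die1 + die2) >= 10
p_dc = np.mean(d & c)                    # exact value 1/12
p_d_times_p_c = np.mean(d) * np.mean(c)  # exact value 1/36

print(p_ab, p_a_times_p_b)
print(p_dc, p_d_times_p_c)
```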

Marginal probability?