Probability and Other Preliminaries
- probability distribution
- rules of probability
- review of linear algebra
Preview
- the topics this week mainly concern a brief review of probability and linear algebra
- the lab class introduces the use of Jupyter, Python and Pandas
Probability Distribution
The probability distribution function of a random variable \(X\) is
$$
F(x) = P(X \leq x)
$$
where \(\{X \leq x\}\) denotes the event consisting of all outcomes in which \(X\) takes a value smaller than or equal to \(x\).
The derivative
$$
p(x) = \frac{dF(x)}{dx}
$$
is called the probability density function of \(X\) .
Wikipedia:
Probability distribution /
Probability density function
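For illustration, a minimal Python sketch (assuming NumPy and SciPy, as in the lab environment) checking that the density is the derivative of the distribution function for a standard normal:

```python
import numpy as np
from scipy.stats import norm

x = np.linspace(-3, 3, 601)
F = norm.cdf(x)                   # distribution function F(x) = P(X <= x)
p_numeric = np.gradient(F, x)     # numerical derivative dF(x)/dx
p_exact = norm.pdf(x)             # closed-form density

print(np.max(np.abs(p_numeric - p_exact)))   # small, limited by finite-difference error
```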
Joint Distribution
The joint distribution function of two random variables \(X\) and \(Y\) is the probability of the joint event \(\{X \leq x, Y \leq y\}\), i.e.,
$$
F(x, y) = P(X \leq x, Y \leq y)
$$
The derivative
$$
p(x, y) = \frac{\partial^2 F(x, y)}{\partial x \partial y}
$$
is called the joint density function of \(X\) and \(Y\) .
- \(X\) and \(Y\) are independent if and only if \(p(x,y) = p(x)p(y)\)
Wikipedia: Joint probability distribution
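A minimal Python sketch of these definitions on a discrete example (the joint table below is made up for illustration): the marginals are obtained by summing the joint table, and independence is checked by comparing the table with the product of its marginals.

```python
import numpy as np

# hypothetical joint probability table p(x, y); rows index x, columns index y
p_xy = np.array([[0.10, 0.20],
                 [0.15, 0.30],
                 [0.05, 0.20]])

p_x = p_xy.sum(axis=1)   # marginal p(x)
p_y = p_xy.sum(axis=0)   # marginal p(y)

# X and Y are independent iff p(x, y) = p(x) p(y) for every cell
print(np.allclose(p_xy, np.outer(p_x, p_y)))   # False for this particular table
```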
Conditional Distribution
Given two jointly distributed random variables \(X\) and \(Y\), the conditional probability distribution of \(Y\) given \(X\) is the probability distribution of \(Y\) when \(X\) is known to be a particular value.
The conditional density function of \(y\) given the occurrence of the value \(x\) is
$$
p(y|x) = \frac{p(x,y)}{p(x)}
$$
Wikipedia: Conditional probability distribution
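Continuing the discrete sketch above, the conditional table follows by dividing each row of the (made-up) joint table by the corresponding marginal:

```python
import numpy as np

p_xy = np.array([[0.10, 0.20],           # hypothetical joint table p(x, y)
                 [0.15, 0.30],
                 [0.05, 0.20]])

p_x = p_xy.sum(axis=1, keepdims=True)    # marginal p(x)
p_y_given_x = p_xy / p_x                 # conditional p(y | x) = p(x, y) / p(x)

print(p_y_given_x.sum(axis=1))           # each row sums to 1: [1. 1. 1.]
```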
Gaussian (Normal) Distribution
The probability density of the Gaussian distribution is
$$
p(x\ |\ \mu,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)
$$
where \(\mu\) is the mean and \(\sigma^2\) is the variance of the distribution.
- very common in natural and social sciences
- \(\sigma\) is the standard deviation
Wikipedia: Normal distribution
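A minimal Python sketch of the density formula (the function name is arbitrary; SciPy's norm.pdf is used only as a cross-check):

```python
import numpy as np
from scipy.stats import norm

def gaussian_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2), written directly from the formula above."""
    return np.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

x = np.linspace(-4.0, 4.0, 9)
print(np.allclose(gaussian_pdf(x, mu=0.0, sigma2=1.0), norm.pdf(x)))   # True
```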
The Normal (Gaussian) Distribution
- about 68% of values drawn from a Gaussian distribution lie within one standard deviation \(\sigma\) of the mean \(\mu\)
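The 68% figure can be checked empirically with a quick sampling sketch (sample size chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.0, 1.0
samples = rng.normal(mu, sigma, size=1_000_000)

# fraction of samples within one standard deviation of the mean
print(np.mean(np.abs(samples - mu) <= sigma))   # close to 0.68 (more precisely 0.6827)
```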
Multivariate Gaussian (Normal) Distribution
The probability density of the \(k\)-dimensional Gaussian distribution is
$$
p(\mathbf{x}\ |\ \boldsymbol{\mu},\boldsymbol{\Sigma}) = \frac{1}{\sqrt{(2\pi)^k |\boldsymbol{\Sigma}|}} \exp\left( -\frac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^\top \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu}) \right)
$$
where \(\boldsymbol{\mu}\) is the \(k\times 1\) mean vector and \(\boldsymbol{\Sigma}\) is the \(k\times k\) covariance matrix.
- \(|\boldsymbol{\Sigma}|\) and \(\boldsymbol{\Sigma}^{-1}\) are the determinant and the inverse of the covariance matrix
- the symbol \(^\top\) denotes the transpose
Wikipedia: Multivariate normal distribution
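A minimal NumPy sketch of the multivariate density (mean, covariance and evaluation point are made up; an explicit inverse is used here for clarity, though it is not the numerically preferred approach):

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """k-dimensional Gaussian density, following the formula above."""
    k = len(mu)
    diff = x - mu
    norm_const = np.sqrt((2.0 * np.pi) ** k * np.linalg.det(Sigma))
    quad = diff @ np.linalg.inv(Sigma) @ diff
    return np.exp(-0.5 * quad) / norm_const

mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
print(mvn_pdf(np.array([0.5, -1.0]), mu, Sigma))
# agrees with scipy.stats.multivariate_normal(mean=mu, cov=Sigma).pdf([0.5, -1.0])
```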
Notation
Formally we should write out \(p(X=x, Y=y)\).
In practice we often use \(p(x,y)\).
- this looks very much like how we might write a multivariate function, e.g., \(f(x,y) = \frac{x}{y}\)
- for a multivariate function though, \(f(x,y)\neq f(y,x)\)
- however \(p(x,y) = p(y,x)\) because \(p(X=x,Y=y) = p(Y=y,X=x)\)
We now quickly review the rules of probability.
Normalisation
All probability distributions are normalised, i.e., \(\int_{-\infty}^{\infty} p(x)\, dx = 1\) (or \(\sum_{x\in {\cal X}} p(x) = 1\) in the discrete case).
A similar result can be derived for the marginal and conditional distributions.
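As a quick numerical check (a sketch assuming SciPy), integrating a Gaussian density over the whole real line returns 1:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

total, _ = quad(norm.pdf, -np.inf, np.inf)   # integrate the standard normal density
print(total)                                 # 1.0 up to numerical error
```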
Product Rule and Sum Rule
The product rule of probability:
$$
\underbrace{p(x,y)}_{\text{joint probability}} = \underbrace{p(y|x)}_{\text{conditional probability}}\cdot\ p(x)
$$
The sum rule of probability:
$$
\underbrace{p(y)}_{\text{marginal probability}} = \sum_{x\in {\cal X}} p(x,y) = \sum_{x\in {\cal X}} p(y|x)p(x)
$$
Wikipedia:
Product rule /
Probability axioms
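A minimal discrete Python sketch of both rules (the prior \(p(x)\) and the conditional table \(p(y|x)\) are made up):

```python
import numpy as np

p_x = np.array([0.2, 0.5, 0.3])              # hypothetical p(x) over three states of X
p_y_given_x = np.array([[0.7, 0.3],          # hypothetical p(y | x); rows index x
                        [0.4, 0.6],
                        [0.1, 0.9]])

p_xy = p_y_given_x * p_x[:, None]            # product rule: p(x, y) = p(y | x) p(x)
p_y = p_xy.sum(axis=0)                       # sum rule:     p(y) = sum_x p(y | x) p(x)

print(p_y, p_y.sum())                        # marginal over Y; sums to 1
```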
Bayes' Theorem
Bayes' theorem follows immediately from the product rule:
$$
p(x|y) = \frac{p(x,y)}{p(y)} = \frac{p(y|x)p(x)}{\displaystyle \sum_{x\in {\cal X}} p(y|x)p(x)}
$$
Wikipedia: Bayes' theorem
(Example)
There are two barrels in front of you.
Barrel One (B1) contains 20 apples and 4 oranges.
Barrel Two (B2) contains 4 apples and 8 oranges.
You choose a barrel randomly and select a fruit.
It is an apple.
What is the probability that the barrel was Barrel One?
(Solution) we are given that:
$$
\begin{align*}
p(\text{apple}\ |\ \text{B}_1) = & \frac{20}{24} & \qquad p(\text{B}_1) = 0.5 \\
p(\text{apple}\ |\ \text{B}_2) = & \frac{4}{12} & \qquad p(\text{B}_2) = 0.5
\end{align*}
$$
Use the sum rule to calculate
$$
p(\text{apple}) = p(\text{apple}\ |\ \text{B}_1)p(\text{B}_1) + p(\text{apple}\ |\ \text{B}_2)p(\text{B}_2) = \frac{20}{24}\times 0.5 + \frac{4}{12}\times 0.5 = \frac{7}{12}
$$
and Bayes' theorem tells us that:
$$
p(\text{B}_1\ |\ \text{apple}) = \frac{p(\text{apple}\ |\ \text{B}_1)p(\text{B}_1)}{p(\text{apple})} = \frac{\frac{20}{24}\times 0.5}{\frac{7}{12}} = \frac{5}{7}
$$
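The same numbers reproduced in a few lines of Python:

```python
# barrel example: likelihoods, priors, sum rule, then Bayes' theorem
p_apple_given_b1 = 20 / 24
p_apple_given_b2 = 4 / 12
p_b1 = p_b2 = 0.5

p_apple = p_apple_given_b1 * p_b1 + p_apple_given_b2 * p_b2   # 7/12
p_b1_given_apple = p_apple_given_b1 * p_b1 / p_apple          # 5/7

print(p_apple, p_b1_given_apple)   # 0.5833..., 0.7142...
```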
Expected Value
The expected value (or mean, average) of a random variable \(X\) is
$$
\mathbb{E}[X] = \int_{-\infty}^{\infty} xp(x) dx
$$
- in the discrete case, \(\mathbb{E}[X] = \sum_{x\in {\cal X}} x\, p(x)\), where \({\cal X}\) is the set of all possible values
The expected value of a function \(f(x)\) is
$$
\mathbb{E}[f(x)] = \int_{-\infty}^{\infty} f(x) p(x) dx
$$
Wikipedia: Expected value
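A minimal sketch (assuming SciPy) evaluating these integrals numerically for a Gaussian example with made-up parameters:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

mu, sigma = 1.0, 2.0   # hypothetical N(1, 4) example

# E[X] = integral of x p(x) dx
mean, _ = quad(lambda x: x * norm.pdf(x, loc=mu, scale=sigma), -np.inf, np.inf)

# E[f(X)] for f(x) = x**2
second_moment, _ = quad(lambda x: x ** 2 * norm.pdf(x, loc=mu, scale=sigma), -np.inf, np.inf)

print(mean, second_moment)   # ~1.0 and ~5.0 (= mu^2 + sigma^2)
```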
Variance
The variance is the expected value of \(f(x) = (x - \mathbb{E}[X])^2\), i.e.,
$$
\mathbb{V}ar[X] = \mathbb{E}[(X - \mathbb{E}[X])^2] = \int_{-\infty}^{\infty} (x - \mathbb{E}[X])^2 p(x) dx
$$
- in the discrete case, \(\mathbb{V}ar[X] = \sum_{x\in {\cal X}} (x - \mathbb{E}[X])^2 p(x)\)
(note)
$$
\mathbb{V}ar[X]
= \mathbb{E}[(X - \mathbb{E}[X])^2]
= \mathbb{E}[X^2 - 2X\mathbb{E}[X] + \mathbb{E}[X]^2]
= \mathbb{E}[X^2] - 2\mathbb{E}[X]\mathbb{E}[X] + \mathbb{E}[X]^2
= \mathbb{E}[X^2] - \mathbb{E}[X]^2
$$
Wikipedia: Variance
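The identity in the note can be checked with samples (a sketch; the distribution and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(1.0, 2.0, size=1_000_000)         # samples with true variance 4

var_definition = np.mean((x - x.mean()) ** 2)    # E[(X - E[X])^2]
var_shortcut = np.mean(x ** 2) - x.mean() ** 2   # E[X^2] - E[X]^2

print(var_definition, var_shortcut)              # both close to 4
```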
Derivatives with Vectors
We have scalars \(x\), \(y\), and \(n\)- and \(m\)-dimensional vectors \(\mathbf{x}\), \(\mathbf{y}\), where
$$
\mathbf{x} = \left( \begin{array}{c} x_1 \\ \vdots \\ x_n \end{array} \right)
\qquad
\mathbf{y} = \left( \begin{array}{c} y_1 \\ \vdots \\ y_m \end{array} \right)
$$
Derivatives with vectors using the denominator-layout notation:
$$
\frac{\partial \mathbf{y}}{\partial x} = \left( \frac{\partial y_1}{\partial x} \cdots \frac{\partial y_m}{\partial x} \right)
\qquad
\frac{\partial y}{\partial \mathbf{x}} = \left( \begin{array}{c} \frac{\partial y}{\partial x_1} \\ \vdots \\ \frac{\partial y}{\partial x_n} \end{array} \right)
\qquad
\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \left( \begin{array}{ccc} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_1} \\ \vdots & & \vdots \\ \frac{\partial y_1}{\partial x_n} & \cdots & \frac{\partial y_m}{\partial x_n} \end{array} \right)
$$
Some Scalar-by-Vector Identities
For vectors \(\mathbf{a}\), \(\mathbf{w}\) and a square matrix \(\mathbf{A}\):
$$
\begin{align*}
\frac{\partial \mathbf{a}^\top \mathbf{w}}{\partial \mathbf{w}} = & \mathbf{a} \\
\frac{\partial \mathbf{w}^\top \mathbf{A} \mathbf{w}}{\partial \mathbf{w}} = & (\mathbf{A} + \mathbf{A}^\top)\mathbf{w}
\end{align*}
$$
Wikipedia: Matrix calculus
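Both identities can be verified numerically with a small finite-difference sketch (all names and sizes below are made up):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
a = rng.normal(size=n)
A = rng.normal(size=(n, n))
w = rng.normal(size=n)

def numerical_gradient(f, w, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at w."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (f(w + e) - f(w - e)) / (2 * eps)
    return g

# d(a^T w)/dw = a
print(np.allclose(numerical_gradient(lambda v: a @ v, w), a))
# d(w^T A w)/dw = (A + A^T) w
print(np.allclose(numerical_gradient(lambda v: v @ A @ v, w), (A + A.T) @ w, atol=1e-5))
```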