Part 1. Basic Discrete Probability
…
Conditional Probability Rules
- For any events $A, B$ with $P(B) > 0$: $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
- Multiplication Rule: $P(A \cap B) = P(A \mid B)\,P(B)$
- Law of Total Probability: Suppose events $B_1, \dots, B_n$ form a partition of $\Omega$. Then for any event $A$: $P(A) = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)$
- Bayes' Theorem: Suppose events $B_1, \dots, B_n$ form a partition of $\Omega$. Then for any event $A$ with $P(A) > 0$: $P(B_j \mid A) = \dfrac{P(A \mid B_j)\,P(B_j)}{\sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)}$
Independent Events
- $A, B$ are independent $\iff P(A \cap B) = P(A)\,P(B)$
- Pairwise independence does NOT $\Rightarrow$ $A_1, \dots, A_n$ are (mutually) independent
Conditional Independence
Suppose events $A, B$ are not necessarily independent, but there is another event $C$ such that
$$P(A \cap B \mid C) = P(A \mid C)\,P(B \mid C),$$
then we say that $A$ and $B$ are independent conditional on $C$.
Part 2. Random Variables
…
Expectation
Remark:
- Some distributions do not have an expectation, for example, the Cauchy distribution (fat tails).
- If $X, Y$ are random variables, and $a, b$ are constants, then:
- $E(aX + bY) = a\,E(X) + b\,E(Y)$ if $E(X)$ and $E(Y)$ are finite
- $E(XY) = E(X)\,E(Y)$ if $X$ and $Y$ are independent.
- Jensen's inequality:
- If $g$ is convex on the support of the random variable $X$, then $E[g(X)] \ge g(E[X])$
- If $g$ is concave on the support of the random variable $X$, then $E[g(X)] \le g(E[X])$
Rule for the Lazy Statistician
Discrete version: $E[g(X)] = \sum_x g(x)\,p_X(x)$
(this is a theorem, not the definition, of $E[g(X)]$)
Continuous version: $E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx$
Joint distribution version: $E[g(X, Y)] = \iint g(x, y)\,f_{X,Y}(x, y)\,dx\,dy$
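As a quick numeric check, here is a minimal Python sketch of the discrete version (a fair die is an assumed example, not from the notes), compared against a direct Monte Carlo estimate:

```python
# Lazy Statistician, discrete version: E[g(X)] = sum_x g(x) * p(x),
# checked against a direct simulation for a fair die and g(x) = x^2.
import random

p = {x: 1 / 6 for x in range(1, 7)}   # pmf of a fair die
g = lambda x: x ** 2                  # any function g

exact = sum(g(x) * px for x, px in p.items())             # the theorem
sim = sum(g(random.randint(1, 6)) for _ in range(10**5)) / 10**5

print(exact)   # 91/6 ≈ 15.1667
print(sim)     # close to 15.1667
```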
Bernoulli($p$) Distribution
Binomial($n, p$) Distribution
Geometric($p$) Distribution
Poisson($\lambda$) Distribution
Remark:
- Let $\lambda = np$; as $n$ grows, $\text{Binomial}(n, p)$ converges to Poisson($\lambda$)
- If $X \sim \text{Poisson}(\lambda_1)$ and $Y \sim \text{Poisson}(\lambda_2)$ are independent, then $X + Y \sim \text{Poisson}(\lambda_1 + \lambda_2)$
Uniform($a, b$) Distribution
Pareto($\alpha$) Distribution
Remark:
- $P(X > x) = x^{-\alpha}$ for $x \ge 1$. This is often used for modelling the tail of a distribution
Exponential($\lambda$) Distribution
Remark:
- Memoryless property: If I model the time between “events” as exponential, the probability that the time to the next event is greater than $a$ units is the same, no matter how long it has been since the last event. Suppose $X \sim \text{Exponential}(\lambda)$ and $s, t \ge 0$. Then $P(X > s + t \mid X > s) = P(X > t)$, as the simulation sketch below illustrates.
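A minimal simulation sketch of memorylessness (the rate $\lambda = 1.5$ and the values of $s, t$ are arbitrary assumptions):

```python
# Memoryless property check: P(X > s+t | X > s) should match P(X > t).
import random

lam, s, t, n = 1.5, 0.4, 0.7, 10**6
samples = [random.expovariate(lam) for _ in range(n)]

given_s = [x for x in samples if x > s]
cond = sum(x > s + t for x in given_s) / len(given_s)   # P(X > s+t | X > s)
uncond = sum(x > t for x in samples) / n                # P(X > t)

print(cond, uncond)   # both close to exp(-1.5 * 0.7) ≈ 0.35
```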
Gamma($\alpha, \lambda$) Distribution
For $x > 0$,
$$f(x) = \frac{\lambda^{\alpha} x^{\alpha - 1} e^{-\lambda x}}{\Gamma(\alpha)},$$
where $\Gamma(\alpha) = \int_0^{\infty} t^{\alpha - 1} e^{-t}\,dt$.
Remark: if $X \sim \text{Gamma}(\alpha, \lambda)$:
- $E(X) = \alpha / \lambda$, $\text{Var}(X) = \alpha / \lambda^2$
- If $\alpha = 1$, then $\text{Gamma}(1, \lambda) = \text{Exponential}(\lambda)$
- For any $c > 0$, the random variable $cX \sim \text{Gamma}(\alpha, \lambda / c)$
- If $X \sim \text{Gamma}(\alpha_1, \lambda)$ and $Y \sim \text{Gamma}(\alpha_2, \lambda)$ are independent, then $X + Y \sim \text{Gamma}(\alpha_1 + \alpha_2, \lambda)$
- The sum of $n$ independent $\text{Exponential}(\lambda)$ random variables has the $\text{Gamma}(n, \lambda)$ distribution (Erlang Distribution).
Normal($\mu, \sigma^2$) Distribution
Remark:
- For $X \sim N(\mu, \sigma^2)$ and any constants $a, b$: $aX + b \sim N(a\mu + b,\, a^2\sigma^2)$
- Linear combinations of independent normal random variables are also normally distributed.
The Poisson Process
The Poisson distribution can be derived as a limit of $\text{Binomial}(n, p)$ with $\lambda = np$ held fixed. Imagine dividing an interval into $n$ subintervals, each independently containing an event with a small probability. Extending the interval indefinitely leads to the Poisson Process with rate $\lambda$.
“Events occur as a Poisson Process with rate $\lambda$” means (see the simulation sketch after this list):
- If $N$ equals the number of events during an interval of time of length $t$, then $N$ has the Poisson($\lambda t$) distribution
- The times between events are random variables with the $\text{Exponential}(\lambda)$ distribution
- The times between events are independent random variables
- The numbers of events in disjoint time intervals are independent random variables.
- The waiting time for $k$ events has the Gamma($k, \lambda$) distribution (a sum of $k$ independent $\text{Exponential}(\lambda)$ random variables).
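A minimal sketch of this characterization (the rate and time horizon below are arbitrary assumptions): generating exponential inter-event times and counting events in $[0, t]$ should reproduce the Poisson($\lambda t$) mean and variance.

```python
# Poisson process sketch: Exponential(lam) gaps imply Poisson(lam * t) counts.
import random

lam, t, trials = 2.0, 3.0, 20000
counts = []
for _ in range(trials):
    time, n = 0.0, 0
    while True:
        time += random.expovariate(lam)   # exponential inter-event times
        if time > t:
            break
        n += 1
    counts.append(n)

mean = sum(counts) / trials
var = sum((c - mean) ** 2 for c in counts) / trials
print(mean, var)   # both close to lam * t = 6 (Poisson mean = variance)
```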
Beta($\alpha, \beta$) Distribution
Remark:
- If $\alpha = \beta$, then the distribution is symmetric about 0.5
Summary Table
| Distribution | PMF / PDF | Mean | Variance |
| --- | --- | --- | --- |
| Bernoulli($p$) | $P(X=1)=p$, $P(X=0)=1-p$ | $p$ | $p(1-p)$ |
| Binomial($n,p$) | $\binom{n}{x}p^x(1-p)^{n-x}$ | $np$ | $np(1-p)$ |
| Geometric($p$) | $(1-p)^{x-1}p$, $x=1,2,\dots$ | $1/p$ | $(1-p)/p^2$ |
| Poisson($\lambda$) | $e^{-\lambda}\lambda^x/x!$ | $\lambda$ | $\lambda$ |
| Uniform($a,b$) | $1/(b-a)$ on $[a,b]$ | $(a+b)/2$ | $(b-a)^2/12$ |
| Pareto($\alpha$) | $\alpha x^{-(\alpha+1)}$, $x \ge 1$ | $\frac{\alpha}{\alpha-1}$ for $\alpha>1$ | $\frac{\alpha}{(\alpha-1)^2(\alpha-2)}$ for $\alpha>2$ |
| Exponential($\lambda$) | $\lambda e^{-\lambda x}$, $x>0$ | $1/\lambda$ | $1/\lambda^2$ |
| Gamma($\alpha,\lambda$) | $\frac{\lambda^\alpha x^{\alpha-1}e^{-\lambda x}}{\Gamma(\alpha)}$, $x>0$ | $\alpha/\lambda$ | $\alpha/\lambda^2$ |
| Normal($\mu,\sigma^2$) | $\frac{1}{\sqrt{2\pi}\,\sigma}e^{-(x-\mu)^2/(2\sigma^2)}$ | $\mu$ | $\sigma^2$ |
| Beta($\alpha,\beta$) | $\frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}$, $0<x<1$ | $\frac{\alpha}{\alpha+\beta}$ | $\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$ |
Part 3: Multivariate Distributions
Joint Probability Function
For discrete random variables $X, Y$:
Joint PMF: $p_{X,Y}(x, y) = P(X = x, Y = y)$
Joint CDF
$F_{X,Y}(x, y) = P(X \le x, Y \le y)$
For continuous random variables $X, Y$:
Joint PDF $f_{X,Y}(x, y)$, with $F_{X,Y}(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f_{X,Y}(u, v)\,dv\,du$
- To calculate probabilities, integrate the pdf over the region of interest: $P((X, Y) \in A) = \iint_A f_{X,Y}(x, y)\,dx\,dy$
- $A$ can be any shape (a Monte Carlo sketch follows below)
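A Monte Carlo sketch of such an integral (an assumed example, not from the notes: $X, Y$ independent Uniform(0,1), so the joint pdf is 1 on the unit square, and $A$ is the quarter-disk $x^2 + y^2 \le 1$):

```python
# Estimate P((X, Y) in A) by sampling from the joint density and counting
# the fraction of points landing in the region A.
import random

n = 10**6
hits = 0
for _ in range(n):
    x, y = random.random(), random.random()   # joint pdf = 1 on [0,1]^2
    if x * x + y * y <= 1.0:                  # the region A
        hits += 1

print(hits / n)   # integral of the pdf over A = pi/4 ≈ 0.7854
```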
Marginal Distributions
For the discrete case: $p_X(x) = \sum_y p_{X,Y}(x, y)$
For the continuous case: $f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$
Remark:
- For independent random variables, the joint factorizes into the marginals: $f_{X,Y}(x, y) = f_X(x)\,f_Y(y)$
Conditional Distribution
For the discrete case: $p_{X \mid Y}(x \mid y) = \dfrac{p_{X,Y}(x, y)}{p_Y(y)}$
For the continuous case: $f_{X \mid Y}(x \mid y) = \dfrac{f_{X,Y}(x, y)}{f_Y(y)}$
with $f_{X \mid Y}$ called the conditional density of $X$ given $Y = y$.
Conditional Distribution vs. Marginal Distribution
- Case I: $X, Y$ are independent
- Case II: $X, Y$ are weakly (positively) dependent
- Case III: $X, Y$ are strongly (positively) dependent
The marginal distributions are the same for all three cases, while the conditional distributions are not, as shown, for example, in the second plot.
Covariance & Correlation
$$\text{Cov}(X, Y) = E[(X - EX)(Y - EY)] = E(XY) - E(X)E(Y), \qquad \rho(X, Y) = \frac{\text{Cov}(X, Y)}{\sqrt{\text{Var}(X)\,\text{Var}(Y)}}$$
Properties:
- $\text{Cov}(X, X) = \text{Var}(X)$, $\text{Cov}(X, Y) = \text{Cov}(Y, X)$
- $X, Y$ independent $\Rightarrow \text{Cov}(X, Y) = 0$. $\text{Cov}(X, Y) = 0$ does NOT $\Rightarrow$ $X, Y$ independent (see the sketch below)
- If $Y = aX + b$ with $a \neq 0$, then $|\rho(X, Y)| = 1$
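A quick sketch of the “zero covariance does not imply independence” point, using the standard counterexample $Y = X^2$ with $X \sim N(0, 1)$ (not from the original notes):

```python
# Cov(X, X^2) = E[X^3] = 0 for X ~ N(0,1), yet Y = X^2 is fully determined by X.
import random

n = 10**6
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [x * x for x in xs]

mx, my = sum(xs) / n, sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
print(cov)   # close to 0, even though Y is a deterministic function of X
```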
More than two random variables
Let $\mathbf{a}, \mathbf{b}$ denote two vectors of scalars and $A$ a non-random matrix. With $\Sigma = \text{Cov}(\mathbf{X})$:
$$\text{Cov}(\mathbf{a}^{T}\mathbf{X}, \mathbf{b}^{T}\mathbf{X}) = \mathbf{a}^{T} \Sigma \mathbf{b}, \qquad \text{Cov}(A\mathbf{X}) = A \Sigma A^{T}$$
Bivariate Normal Distribution
$(X, Y)$ has the bivariate normal distribution if it has joint pdf
$$f(\mathbf{x}) = \frac{1}{2\pi \sqrt{\det \Sigma}} \exp\!\left(-\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right),$$
where $\boldsymbol{\mu}$ is the vector of means and $\Sigma$ is the covariance matrix.
Equivalent definition:
$(X, Y)$ has the bivariate normal distribution iff all linear combinations $aX + bY$ are also normally distributed.
Properties:
- $X$ and $Y$ are normally distributed, i.e., the marginals are normal
- The conditional distribution is normal: $X \mid Y = y \sim N\!\left(\mu_X + \rho \frac{\sigma_X}{\sigma_Y}(y - \mu_Y),\; \sigma_X^2 (1 - \rho^2)\right)$
- $\text{Cov}(X, Y) = 0 \Rightarrow X, Y$ are independent.
- $X, Y$ independent normals $\Rightarrow (X, Y)$ is bivariate normal
Multivariate Normal Distribution
$\mathbf{X}$ has the multivariate normal distribution if it has joint pdf
$$f(\mathbf{x}) = \frac{1}{(2\pi)^{d/2} (\det \Sigma)^{1/2}} \exp\!\left(-\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right),$$
where $\Sigma$ is positive definite (otherwise the inverse might not exist).
Properties:
- All marginal and conditional distributions are multivariate normal.
- Any random vector of the form $A\mathbf{X} + \mathbf{b}$ will be multivariate normal if $A \Sigma A^{T}$ is positive definite (see the sampling sketch below).
- If $\text{Cov}(X_i, X_j) = 0$, then $X_i$ and $X_j$ are independent.
- If $\Sigma$ is diagonal, $X_1, \dots, X_d$ are independent
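A sampling sketch of the linear-transform property (the mean vector and the matrix $A$ below are arbitrary assumptions): if $\mathbf{Z}$ has i.i.d. $N(0,1)$ entries, then $\boldsymbol{\mu} + A\mathbf{Z}$ is multivariate normal with covariance $A A^{T}$.

```python
# Sample a multivariate normal via a linear transform of standard normals
# and check the empirical covariance against A @ A.T.
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
A = np.array([[1.0, 0.0],
              [0.8, 0.6]])          # so Sigma = A @ A.T

Z = rng.standard_normal((100000, 2))
X = mu + Z @ A.T                    # each row is one multivariate normal draw

print(A @ A.T)                      # target covariance
print(np.cov(X, rowvar=False))      # empirical covariance, should be close
```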
Part 4: Conditional Expectation
For a discrete random variable $X$, when conditioning on an event $B$:
$$E(X \mid B) = \sum_x x \, P(X = x \mid B)$$
When the event is an event concerning $X$ itself, $P(X = x \mid B)$ is computed directly from the distribution of $X$.
- E.g. for a fair die roll $X$: $E(X \mid X \text{ is even}) = \frac{2 + 4 + 6}{3} = 4$
For the continuous case, $E(X \mid Y = y) = \int_{-\infty}^{\infty} x \, f_{X \mid Y}(x \mid y)\,dx$
Laws of Total Probability
$$P(A) = \sum_y P(A \mid Y = y)\,P(Y = y) \ \ \text{(discrete)}, \qquad P(A) = \int P(A \mid Y = y)\,f_Y(y)\,dy \ \ \text{(continuous)}$$
Prior and Posterior Distributions
In Bayesian analysis, before data is observed, the unknown parameter is modeled as a random variable $\Theta$ having a probability distribution $f_\Theta(\theta)$, called the prior distribution. This distribution represents our prior belief about the value of the parameter. After observing data, we have increased our knowledge about the parameter; the updated distribution $f_{\Theta \mid X}(\theta \mid x)$ is called the posterior distribution. The equation is
$$f_{\Theta \mid X}(\theta \mid x) = \frac{f_{X \mid \Theta}(x \mid \theta)\, f_\Theta(\theta)}{\int f_{X \mid \Theta}(x \mid t)\, f_\Theta(t)\,dt}$$
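A minimal worked example of this update (a Beta prior with a Binomial likelihood; the numbers are assumptions, not from the notes), where conjugacy makes the posterior available in closed form as Beta($a + k,\, b + n - k$):

```python
# Beta-Binomial conjugate update: prior Beta(a, b), data k successes in n trials.
a, b = 2.0, 2.0        # prior Beta(a, b) for an unknown success probability
n, k = 10, 7           # observed data

a_post, b_post = a + k, b + (n - k)          # posterior, done analytically
prior_mean = a / (a + b)
post_mean = a_post / (a_post + b_post)

print(prior_mean)   # 0.5 before seeing data
print(post_mean)    # 9/14 ≈ 0.643 after seeing 7/10 successes
```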
Iterated Conditioning
$$E\big(E(X \mid Y)\big) = E(X),$$
where $E(X \mid Y)$ is itself a random variable: the function $g(Y)$ with $g(y) = E(X \mid Y = y)$.
Useful equation (law of total variance):
$$\text{Var}(X) = E[\text{Var}(X \mid Y)] + \text{Var}(E[X \mid Y])$$
Proof: $E[\text{Var}(X \mid Y)] = E[E(X^2 \mid Y)] - E[(E(X \mid Y))^2] = E(X^2) - E[(E(X \mid Y))^2]$, and $\text{Var}(E[X \mid Y]) = E[(E(X \mid Y))^2] - (E[E(X \mid Y)])^2 = E[(E(X \mid Y))^2] - (EX)^2$; adding the two gives $E(X^2) - (EX)^2 = \text{Var}(X)$.
Measure-Theoretic Notions of Conditional Expectation
“Information” can be captured via a collection of subsets of $\Omega$. Such collections are denoted using $\mathcal{F}, \mathcal{G}, \mathcal{H}$, etc. Information means: for every set in the collection, we can tell whether it occurs or not (instead of being unsure).
When $\mathcal{F}$ satisfies the following properties, it is a $\sigma$-field or $\sigma$-algebra:
- $\emptyset \in \mathcal{F}$: you can always know $\emptyset$ does not occur
- If $A \in \mathcal{F}$, then $A^{c} \in \mathcal{F}$
- If you know the occurrence / non-occurrence status of $A$, then you also know the status of $A^{c}$
- If $A_1, A_2, \dots \in \mathcal{F}$, then $\bigcup_{i} A_i \in \mathcal{F}$
- If you know the occurrence / non-occurrence status of each $A_i$, then you also know the status of $\bigcup_{i} A_i$
Remark:
- $2^{\Omega}$ is the set of all subsets of $\Omega$
- The trivial $\sigma$-field consists of only $\{\emptyset, \Omega\}$. Conditioning on it is like conditioning on no information
- The power set $2^{\Omega}$ is a $\sigma$-field, corresponding to “knowing everything”
- The information content from conditioning on a random variable $X$ is called the $\sigma$-field generated by $X$, denoted $\sigma(X)$.
- $E(Y \mid X)$ written in simple probability theory can be interpreted as $E(Y \mid \sigma(X))$
- A random variable $X$ is said to be $\mathcal{F}$-measurable if $\sigma(X) \subseteq \mathcal{F}$
Properties of Conditional Expectation
Assume $\mathcal{G}, \mathcal{H}$ are $\sigma$-fields and $X, Y$ are random variables.
- If $X$ is $\mathcal{G}$-measurable, then $E(X \mid \mathcal{G}) = X$
- If $X$ is $\mathcal{G}$-measurable, then $E(XY \mid \mathcal{G}) = X \, E(Y \mid \mathcal{G})$ (taking out what is known)
- $E(aX + bY \mid \mathcal{G}) = a\,E(X \mid \mathcal{G}) + b\,E(Y \mid \mathcal{G})$, for scalars $a, b$
- If $\mathcal{H} \subseteq \mathcal{G}$, then $E\big(E(X \mid \mathcal{G}) \mid \mathcal{H}\big) = E(X \mid \mathcal{H})$ (tower property)
- If $g$ is convex, then $E\big(g(X) \mid \mathcal{G}\big) \ge g\big(E(X \mid \mathcal{G})\big)$ (conditional Jensen)
Measure-Theoretic Independence
- ($\sigma$-fields) $\mathcal{G}, \mathcal{H}$ are independent iff for any $A \in \mathcal{G}$ and $B \in \mathcal{H}$, $P(A \cap B) = P(A)\,P(B)$
- (random variables) $X, Y$ are independent iff $\sigma(X)$ and $\sigma(Y)$ are independent
Part 5: Moment Generating Functions
MGF of $X$:
$$M_X(t) = E\big(e^{tX}\big),$$
which is a function of $t$. Can be calculated using the Rule for the Lazy Statistician.
Calculate Moments
$$E(X^k) = M_X^{(k)}(0) = \frac{d^k}{dt^k} M_X(t)\Big|_{t=0}$$
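For instance, a sketch using sympy (with the $N(\mu, \sigma^2)$ MGF $M(t) = e^{\mu t + \sigma^2 t^2 / 2}$ as the example): differentiating at $t = 0$ recovers the moments.

```python
# Extract moments by differentiating an MGF at t = 0.
import sympy as sp

t, mu = sp.symbols('t mu')
sigma = sp.symbols('sigma', positive=True)
M = sp.exp(mu * t + sigma**2 * t**2 / 2)    # MGF of N(mu, sigma^2)

EX = sp.diff(M, t).subs(t, 0)               # first moment
EX2 = sp.diff(M, t, 2).subs(t, 0)           # second moment

print(sp.simplify(EX))            # mu
print(sp.simplify(EX2 - EX**2))   # sigma**2, i.e. the variance
```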
Applications
Uniqueness of MGF
If $M_X(t) = M_Y(t)$ for all $t$ in an open interval around $0$, then $X$ and $Y$ have the same distribution.
- Note that two random variables can have matching moments, i.e., $E(X^k) = E(Y^k)$ for all $k$, but have different distributions.
Sum of Independent Random Variables
Suppose $X_1, \dots, X_n$ are independent random variables, and $S = X_1 + \cdots + X_n$. Then
$$M_S(t) = \prod_{i=1}^{n} M_{X_i}(t) \quad \text{for all } t$$
Establishing Convergence in Distribution
If $M_{X_n}(t) \to M_X(t)$ as $n \to \infty$ for all $t \in (-\delta, \delta)$ for some $\delta > 0$, then $X_n \xrightarrow{d} X$.
e.g.
For $X_n \sim \text{Binomial}(n, \lambda / n)$:
$$M_{X_n}(t) = \left(1 + \frac{\lambda (e^t - 1)}{n}\right)^{n} \to e^{\lambda (e^t - 1)},$$
thus $X_n \xrightarrow{d} \text{Poisson}(\lambda)$.
The Central Limit Theorem
Suppose $X_1, X_2, \dots$ are i.i.d. and $\mu = E(X_i)$ and $\sigma^2 = \text{Var}(X_i)$ both exist and are finite. Then
$$\frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} N(0, 1),$$
where $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $\xrightarrow{d}$ denotes convergence in distribution.
Equivalently, when $n$ is large: $\bar{X}_n \approx N\!\left(\mu, \sigma^2 / n\right)$ and $\sum_{i=1}^{n} X_i \approx N(n\mu, n\sigma^2)$.
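A simulation sketch of the CLT (the sample size, trial count, and the Exponential(1) population are arbitrary assumptions): standardized means of a skewed population should behave like $N(0, 1)$.

```python
# Check P(Z <= 1) for the standardized mean against Phi(1) ≈ 0.8413.
import random, math

n, trials, lam = 200, 20000, 1.0    # Exponential(1): mu = 1, sigma = 1
hits = 0
for _ in range(trials):
    xbar = sum(random.expovariate(lam) for _ in range(n)) / n
    z = math.sqrt(n) * (xbar - 1.0) / 1.0
    hits += (z <= 1.0)

print(hits / trials)   # close to 0.8413
```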
The Delta Method
Assume $Y_1, Y_2, \dots$ are such that
$$\frac{Y_n - \theta}{\sigma_n} \xrightarrow{d} N(0, 1),$$
where $\theta$ is a constant, and $\sigma_n$ satisfies $\sigma_n > 0$ and $\sigma_n \to 0$. Then, assuming $g$ is a function that is differentiable at $\theta$ and $g'(\theta) \neq 0$,
$$\frac{g(Y_n) - g(\theta)}{|g'(\theta)| \, \sigma_n} \xrightarrow{d} N(0, 1).$$
This is often applied with $Y_n = \bar{X}_n$ and $\sigma_n = \sigma / \sqrt{n}$.
Equivalent description:
If $Y_n$ is approximately $N(\theta, \sigma_n^2)$ and $\sigma_n \to 0$ as $n \to \infty$, then $g(Y_n)$ is approximately $N\big(g(\theta),\, [g'(\theta)]^2 \sigma_n^2\big)$,
assuming that $g$ is a function that is differentiable at $\theta$ and $g'(\theta) \neq 0$.
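A simulation sketch of the delta method with $g(x) = x^2$ applied to the mean of Exponential(1) draws, so $\mu = \sigma = 1$ (the sample sizes are arbitrary assumptions): the standard deviation of $g(\bar{X}_n)$ should be close to $|g'(\mu)| \, \sigma / \sqrt{n} = 2 / \sqrt{n}$.

```python
# Compare the empirical sd of g(Xbar) with the delta-method prediction.
import random, math

n, trials = 400, 20000
vals = []
for _ in range(trials):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    vals.append(xbar ** 2)          # g(Xbar) with g(x) = x^2

m = sum(vals) / trials
sd = math.sqrt(sum((v - m) ** 2 for v in vals) / trials)
print(sd, 2 / math.sqrt(n))        # both close to 0.1
```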
Part 6: Classic Results
Chi-squared distribution
The Chi-squared distribution with $k$ degrees of freedom is a special case of the Gamma distribution: $\chi^2_k = \text{Gamma}(k/2,\, 1/2)$. If $X$ follows such a distribution, then $E(X) = k$ and $\text{Var}(X) = 2k$.
t-distribution
The t-distribution with $k$ degrees of freedom is defined as
$$T = \frac{Z}{\sqrt{X / k}},$$
where $Z \sim N(0, 1)$ and $X \sim \chi^2_k$, the Chi-squared distribution with $k$ degrees of freedom, and $Z, X$ are independent.
The t-distribution has a bell-shaped density, but has heavier tails than the normal distribution. As $k$ increases, the distribution converges to $N(0, 1)$.
When $k = 1$, the distribution is the Cauchy distribution, which has very heavy tails. Its mean (expectation) does not exist.
- The CLT does not work for the Cauchy distribution
- $X_1, \dots, X_n$ i.i.d. Cauchy $\Rightarrow$ $\bar{X}_n$ also has the Cauchy distribution, no matter how large $n$ is (see the sketch below)
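A sketch of this failure of averaging (inverse-CDF sampling of the standard Cauchy; the sample sizes are arbitrary assumptions):

```python
# Sample means of Cauchy draws do not settle down as n grows, since the mean
# of n standard Cauchy variables is again standard Cauchy.
import random, math

def cauchy():
    # inverse-CDF sampling: tan(pi * (U - 1/2)) is standard Cauchy
    return math.tan(math.pi * (random.random() - 0.5))

for n in (10, 1000, 100000):
    xbar = sum(cauchy() for _ in range(n)) / n
    print(n, xbar)   # stays wildly variable across runs; no convergence
```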
Classic Results
Assuming $X_1, \dots, X_n$ are i.i.d. $N(\mu, \sigma^2)$, with $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$, then
- $\bar{X}$ and $S^2$ are independent
- The quantity $\dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$
- The quantity $\dfrac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1}$
- The quantity $\dfrac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{n-1}$
Note that
$$E(S^2) = \frac{\sigma^2}{n-1}\, E\!\left(\frac{(n-1) S^2}{\sigma^2}\right) = \frac{\sigma^2}{n-1}\,(n-1) = \sigma^2,$$
so $S^2$ is an unbiased estimator of $\sigma^2$.
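A simulation sketch checking two of these results, the unbiasedness of $S^2$ and the $t_{n-1}$ distribution of the studentized mean (the parameter values are arbitrary assumptions; 2.776 is the two-sided 5% critical value of $t_4$):

```python
# Verify E(S^2) = sigma^2 and the t_{n-1} tail behavior of the t-statistic.
import random, math

mu, sigma, n, trials = 5.0, 2.0, 5, 50000
s2s, tstats = [], []
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)   # sample variance
    s2s.append(s2)
    tstats.append((xbar - mu) / math.sqrt(s2 / n))    # studentized mean

print(sum(s2s) / trials)                              # close to sigma^2 = 4
print(sum(abs(t) > 2.776 for t in tstats) / trials)   # ≈ 0.05 for t_4
```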