Intuitively, a measure assigns a number to sets, generalizing the length of line segments, areas and volumes.
Weakness of Riemann integral idea
The Riemann integral fails for functions that oscillate without limit, such as the indicator function of the rationals.
Lebesgue integration idea
The idea behind the Lebesgue integral is that its value does not change when the function is altered on a set of measure 0.
Probability and measure idea
Probability theory can be seen as a special case of measure theory: a probability function is a measure, and expectation (given a probability function) is an integral with respect to that measure.
σ-algebra
σ-algebra of \(\mathcal{E}\) over \(E\)
Let \(\mathcal{E} \subseteq{} \PowsP{} E\). Then \(\mathcal{E}\) is a σ-algebra if it satisfies:
1) \(\mathcal{E}\) is nonempty.
2) \(\mathcal{E}\) contains the total set \(E\).
3) \(\mathcal{E}\) is closed under countable union.
4) \(\mathcal{E}\) is closed under complement.
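2) can be replaced by \(\emptyset \in{} \mathcal{E}\): by 4), each of \(E\) and \(\emptyset\) gives the other through complement. Either version makes 1) redundant.
3) can be replaced by closure under countable intersection, thanks to complement and de Morgan's law.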
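On a finite set the axioms above can be checked mechanically. The following is a minimal sketch, assuming subsets are given as Python sets; the name `is_sigma_algebra` and the example collections are illustrative, not part of the notes.

```python
from itertools import combinations

def is_sigma_algebra(collection, total):
    """Check the σ-algebra axioms for a collection of subsets of a finite set."""
    sets = {frozenset(s) for s in collection}
    total = frozenset(total)
    if total not in sets:                           # 2) contains the total set
        return False
    if any(total - s not in sets for s in sets):    # 4) closed under complement
        return False
    # 3) on a finite set, closure under pairwise union already gives
    #    closure under countable union.
    return all(a | b in sets for a, b in combinations(sets, 2))

E = {1, 2, 3, 4}
trivial = [set(), E]                        # the trivial σ-algebra
generated_by_F = [set(), {1, 2}, {3, 4}, E] # {∅, F, F^c, E} with F = {1, 2}
not_closed = [set(), {1}, {2}, E]           # misses the complement of {1}
```

Here `is_sigma_algebra(generated_by_F, E)` holds, while `not_closed` fails the complement check.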
Sigma algebra idea
Formalizes the idea that the pieces of a measurable set each contribute their own "volume".
Measurable set
An element of a sigma algebra.
Trivial σ-algebra
\(\Br{\emptyset , E}\); The smallest σ-algebra over \(E\).
Power set
\(\PowsP{E}\); The largest σ-algebra over \(E\).
σ-algebra generated by \(F \subset{} E\)
\(\sigma (F) = \Br{\emptyset , F, F^c, E}\)
Sub-σ-algebra
σ-algebra coarseness idea
The bigger (or finer) a σ-algebra is, the more sets it is possible to measure.
σ-algebra generated by subsets
The smallest sigma algebra containing the subsets. Always exists.
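For a finite total set, the generated σ-algebra can be computed by closing the generators under complement and union until nothing new appears. This is only a sketch under that finiteness assumption; `generate_sigma_algebra` is a hypothetical helper name.

```python
def generate_sigma_algebra(generators, total):
    """Smallest σ-algebra over a finite set `total` containing `generators`."""
    total = frozenset(total)
    sets = {frozenset(), total} | {frozenset(g) for g in generators}
    while True:
        # One closure step: all complements and all pairwise unions.
        new = {total - s for s in sets} | {a | b for a in sets for b in sets}
        if new <= sets:          # nothing new, so the collection is closed
            return sets
        sets |= new

one_generator = generate_sigma_algebra([{1}], {1, 2, 3})
two_generators = generate_sigma_algebra([{1}, {2}], {1, 2, 3})
```

With the single generator \(F = \{1\}\) the result is \(\{\emptyset , F, F^c, E\}\); with two generators that separate all points, the result is the whole power set.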
Intersection of σ-algebras
The intersection of σ-algebras over \(E\) is a σ-algebra.
Borel σ-algebra on \(\mathbb{R}\)
\(\mathcal{B} \mathbb{R}\), Generated by all \([a..b]\).
Borel σ-algebra on \(\mathbb{R}\)-subset
The collection of all intersections of the interval with the Borel sets of \(\mathbb{R}\).
Measurable space
\(E\) with σ-algebra \(\mathcal{E}\) is a measurable space.
Measure
A measure is a function \(\mu : \mathcal{E} \to{} [0..\infty ]\) for which the measure of the empty set is zero and σ-additivity holds.
σ-additivity
The measure of a countable union of pairwise disjoint measurable sets equals the sum of the measures of the pieces.
Finite additivity
Monotonicity
σ-subadditivity
If the sets are not disjoint, the equality weakens to an inequality: the measure of the union is less than or equal to the sum of the measures.
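A quick way to see the difference between additivity and subadditivity is the counting measure on finite sets. The sketch below assumes the counting measure is simply the number of elements; the sets `A`, `B`, `C`, `D` are made up for illustration.

```python
def counting_measure(s):
    """Counting measure of a finite set: its number of elements."""
    return len(s)

A, B = {1, 2}, {3}        # disjoint pieces
C, D = {1, 2}, {2, 3}     # overlapping pieces (they share the point 2)

disjoint_union = counting_measure(A | B)
disjoint_sum = counting_measure(A) + counting_measure(B)
overlap_union = counting_measure(C | D)
overlap_sum = counting_measure(C) + counting_measure(D)
```

For the disjoint pair the two numbers agree; for the overlapping pair the shared point is counted twice in the sum, so the union measures strictly less.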
Proposition
Proof
Continuity of measures
Proof
Characterization of measures
Remark
Finite measure
σ-finite measure
Remark
Trivial measure
Borel measures
Lebesgue measure
Lebesgue-Stieltjes measures
Dirac measure
Counting measures
Angular measure
Probability measures
Measure space
Example
Probability space
\(\mathcal{M}_{\Omega , A, \P}\), the measure space of events over outcomes with a probability measure.
Negligible set
A negligible set is a subset of a measurable set with measure zero.
Complete measure
Example
Complete measure space
Remark
Example
\(\mathcal{L} (\R )\) · Lebesgue σ-algebra
The Cantor set
Product measure
Remark
Remark
Lebesgue measure in \(\mathbb{R}^d\)
Example
Measurable
\(f\) is \(\mathcal{E} \to{} \mathcal{F}\)-measurable if \(\forall B \in{} \mathcal{F} f^{-1}B \in{} \mathcal{E}\).
"The preimage of every measurable set of \(\mathcal{F}\) is measurable in \(\mathcal{E}\)."
Set indicator function
Example
Measurability via generating set
Let \(\mathcal{G}\) generate \(\mathcal{F}\).
Then \(f\) is measurable iff \(\underset{G \in{} \mathcal{G}}\forall{} f^{-1}G \in{} \mathcal{E}\)
"We just need to check that the preimages of the generating sets are measurable."
Example
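One possible finite illustration: check measurability once against the full target σ-algebra and once against a generating set only, and observe that the answers agree. The spaces, the maps `f` and `g`, and the helper names are assumptions made for this sketch.

```python
def preimage(f, domain, B):
    """f^{-1}B as a frozenset."""
    return frozenset(x for x in domain if f(x) in B)

def is_measurable(f, domain, sigma_E, sets_F):
    """True if the preimage of every set in sets_F lies in sigma_E."""
    sigma_E = {frozenset(s) for s in sigma_E}
    return all(preimage(f, domain, B) in sigma_E for B in sets_F)

E = {1, 2, 3, 4}
sigma_E = [set(), {1, 2}, {3, 4}, E]

# Target space {"a", "b"} with its power set, generated by the single set {"a"}.
generators_F = [{"a"}]
sigma_F = [set(), {"a"}, {"b"}, {"a", "b"}]

f = lambda x: "a" if x <= 2 else "b"   # constant on the pieces {1,2} and {3,4}
g = lambda x: "a" if x == 1 else "b"   # splits the piece {1,2}
```

`f` passes both checks, while `g` already fails on the generator because \(g^{-1}\{a\} = \{1\}\) is not in the σ-algebra on \(E\).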
Borel measurable
\(f : \MeasurableSpace{E}{\mathcal{E}} \to{} \R\) is Borel measurable if \(f : \MeasurableSpace{E}{\mathcal{E}} \to{} \MeasurableSpace\R{\mathcal{B} \R}\) is measurable.
Corollary
\(f : E \to{} \R\) is Borel measurable iff the preimage of every \((-\infty ..a]\) is measurable.
σ-algebra generated by a function
Let \(f : E \to{} F\) and \(\mathcal{F}\) a σ-algebra.
\(\sigma (f)\) is the σ-algebra generated by \(f\) on \(E\) which makes \(f\) measurable.
The σ-algebra generated by the preimages of the measurable sets.
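On finite spaces \(\sigma (f)\) can be listed directly, since the preimages of a σ-algebra already form a σ-algebra. A minimal sketch, with the parity map and the name `sigma_generated_by` chosen only for illustration:

```python
def sigma_generated_by(f, domain, sigma_F):
    """σ(f): the collection of preimages f^{-1}B for B in the σ-algebra on the target."""
    return {frozenset(x for x in domain if f(x) in B) for B in sigma_F}

E = {1, 2, 3, 4}
parity = lambda x: x % 2
power_set_of_target = [set(), {0}, {1}, {0, 1}]

sigma_f = sigma_generated_by(parity, E, power_set_of_target)
```

The result contains exactly the sets that can be described by the value of `parity`: nothing, the even numbers, the odd numbers, and everything.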
Measurability preserving operations
Linear combination: \(af + bg\)
Multiplication: \(fg\)
Division: \(\frac{f}{g}\) (\(g \ne{} 0\))
TODO
TODO
Measurable composition is measurable
Let \(E, F, G\) be measurable spaces, and \(f, g\) measurable where \(\operatorname{Img} f \subseteq{} \operatorname{Dmn} g\).
Then \(g \circ{} f\) is measurable.
Proof
Almost everywhere
A property \(P(x)\) holds almost everywhere if it is only false on a negligible set.
Almost surely
Almost everywhere in the context of a probability space.
A property \(P(x)\) holds almost surely if it is only false within an event of probability zero (such an event need not be empty).
Push-forward measure
Let \(E = \MeasureSpace{E}{\mathcal{E}}\mu\), \(F = \MeasurableSpace{F}{\mathcal{F}}\) and \(f : E \to{} F\) a measurable map.
\(\mu^* = \mu{} \circ{} f^{-1}\) is the pushforward measure via \(f\) (Sometimes denoted \(f\hash\mu\)).
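For a measure given by point masses on a finite set, the pushforward just moves each mass to the image point. The sketch below assumes that representation (a dict from points to masses); the fair die and the parity map are illustrative choices.

```python
from fractions import Fraction

def pushforward(mu, f):
    """Point masses of mu ∘ f^{-1}, for mu given as a dict point -> mass."""
    nu = {}
    for x, mass in mu.items():
        nu[f(x)] = nu.get(f(x), 0) + mass   # mass of x lands on f(x)
    return nu

mu = {x: Fraction(1, 6) for x in range(1, 7)}   # uniform measure on a die
parity = lambda x: x % 2
nu = pushforward(mu, parity)
```

The pushed-forward measure lives on \(\{0, 1\}\) and gives each point the total mass of its preimage.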
Simple function
Integral of simple function
Intuitively: each value is multiplied by the "size" (the measure) of the set on which it is taken. Then the products are summed together.
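The "value times size, then sum" recipe can be written out directly. This sketch assumes a simple function given as a list of (value, set) pairs over disjoint intervals and uses interval length as the measure; all names are hypothetical.

```python
def integrate_simple(pieces, measure):
    """Integral of a simple function: sum of value * measure(set) over its pieces."""
    return sum(value * measure(A) for value, A in pieces)

def interval_length(interval):
    """Lebesgue measure of an interval given as a pair (a, b)."""
    a, b = interval
    return b - a

# Takes the value 2 on (0..1], 5 on (1..3] and 0 on (3..10].
pieces = [(2, (0, 1)), (5, (1, 3)), (0, (3, 10))]
integral = integrate_simple(pieces, interval_length)
```

The piece with value 0 contributes nothing, however large its measure.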
Approximation by simple functions
A non-negative measurable function can be written as the pointwise limit of an increasing sequence of simple functions.
Let \(E = \MeasureSpace{E}{\mathcal{E}}\mu\) and \(f : E \to{} \R\) a non-negative measurable function.
\(\exists\) an increasing sequence of simple functions \(f_\underline{n}\) that converges pointwise to \(f\).
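One standard construction (an assumption here, since the notes do not spell one out) rounds \(f\) down to the dyadic grid of step \(2^{-n}\) and caps it at \(n\). The sketch checks numerically that the sequence increases and approaches \(f\); the sample function and points are arbitrary.

```python
import math

def simple_approx(f, n):
    """n-th dyadic approximation: round f down to multiples of 2^-n, capped at n."""
    return lambda x: min(n, math.floor(2**n * f(x)) / 2**n)

f = lambda x: x * x                 # a non-negative function
xs = [0.0, 0.3, 1.0, 1.7, 2.5]      # sample points
# levels[k] holds the values of f_{k+1} at the sample points.
levels = [[simple_approx(f, n)(x) for x in xs] for n in range(1, 12)]
```

Each `simple_approx(f, n)` takes only finitely many values, and once \(n\) exceeds \(f(x)\) the error at \(x\) is below \(2^{-n}\).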
Remark
Contour partition function
A simple function defined on \(n\) contours of \(f\).
"\(\nu\) measures at least the zeroes of \(\mu\)."
Equivalent measures
\(\mu\) and \(\nu\) are equivalent if \(\mu{} \ll{} \nu\) and \(\nu{} \ll{} \mu\).
Example
Radon-Nikodym
Radon-Nikodym derivative
The Radon-Nikodym derivative is the function \(f\) in the Radon-Nikodym theorem.
Notation: \(f = \frac{d\nu}{d\mu}\).
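On a finite set with point masses, the derivative is just the pointwise ratio of masses wherever \(\mu\) is positive. The following sketch assumes \(\nu{} \ll{} \mu\) in the sense that \(\nu\) vanishes wherever \(\mu\) does; the measures and helper names are made up for illustration.

```python
from fractions import Fraction

def radon_nikodym(nu, mu):
    """Density of nu with respect to mu for point masses, assuming nu << mu."""
    if any(mu[x] == 0 and nu[x] != 0 for x in mu):
        raise ValueError("nu is not absolutely continuous with respect to mu")
    # Where mu is zero the density is arbitrary; 0 is chosen here.
    return {x: (nu[x] / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}

def integrate(f, mu, A):
    """Integral of f over A with respect to mu."""
    return sum(f[x] * mu[x] for x in A)

mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
nu = {"a": Fraction(1, 4), "b": Fraction(3, 4), "c": Fraction(0)}
density = radon_nikodym(nu, mu)
```

Integrating `density` against `mu` over any subset reproduces the `nu`-measure of that subset, which is the defining property of \(\frac{d\nu}{d\mu}\).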
Remark
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Axiom
Trial · Experiment
Random trial
Bernoulli trial
Sample space
Example
Power set
Example
Event
Example
Exercise
Exercise
σ-algebra of events
Remark
Trivial σ-algebra
Power set
σ-algebra generated by event
Borel σ-algebra on \(\R\)
Borel σ-algebra on \(\R^d\)
Examples
σ-algebra generated by a set
Remark
Borel σ-algebra
Proof
Remark
Remark
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Probability measure
σ-additivity
Finite additivity
Remark
Remark
Finite additivity
Proof
Corollary
Proof
σ-additivity equivalences
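Given finite additivity, σ-additivity is equivalent to continuity: for an increasing or decreasing sequence of events, the probabilities converge to the probability of the limit event (the union or the intersection, respectively).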
Proof
Exercise
Conditional probability
Independence · Mutual independence
Pairwise independence
Remark
Complement independences
Proof
Dependent & independent examples
Conditional probability
...
Theorem
Conditional probability measure
Proof
Chain rule
Proof
Partition
Finite partition
Countable partition
Law of total probability
Proof
Bayes' theorem
Proof
Event sequences
In a probability space, a sequence of measurable sets corresponds to a sequence of events.
\(\LimSup{} A_n\) can be seen as the collection of outcomes that occur in infinitely many of the \(A_n\). It is also the event that infinitely many of the \(A_n\) occur.
\(\liminf A_n\) can be seen as the collection of outcomes that occur in every \(A_n\) except finitely many. It can also be seen as the event that all but finitely many of the \(A_n\) occur.
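The set-theoretic formulas behind these descriptions are \(\bigcap_N \bigcup_{n \ge N} A_n\) for the limit superior and \(\bigcup_N \bigcap_{n \ge N} A_n\) for the limit inferior. They can be evaluated exactly when the sequence is periodic, because each tail then repeats. The sketch below relies on that periodicity assumption; the sequence `A` is invented for the example.

```python
def lim_sup(A, period):
    """∩_N ∪_{n≥N} A_n for a sequence n -> A(n) that repeats with the given period."""
    tails = [set().union(*(A(n) for n in range(N, N + period))) for N in range(period)]
    return set.intersection(*tails)

def lim_inf(A, period):
    """∪_N ∩_{n≥N} A_n for a sequence n -> A(n) that repeats with the given period."""
    tails = [set.intersection(*(set(A(n)) for n in range(N, N + period))) for N in range(period)]
    return set().union(*tails)

# "always" is in every A_n; "even" and "odd" each occur in every second A_n.
A = lambda n: {"always", "even"} if n % 2 == 0 else {"always", "odd"}

infinitely_often = lim_sup(A, period=2)
all_but_finitely_many = lim_inf(A, period=2)
```

The outcomes "even" and "odd" occur infinitely often but also fail infinitely often, so they belong to the limit superior only.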
Borel-Cantelli idea
Tells us about the probability of the event \(\LimSup{} A_n\), given the probabilities of \(A_n\).
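The first Borel-Cantelli lemma says that if the probabilities \(\P{} A_n\) have a finite sum, then \(\LimSup{} A_n\) has probability zero.
Together, the Borel-Cantelli lemmas relate the probabilities of the individual \(A_n\) to the probability of \(\LimSup{} A_n\).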
First Borel-Cantelli lemma
...
Let \(A_\underline{n}\) be a sequence of events such that \(\sum_{n=1}^{\infty} \P{} A_n\) is finite. Then
\[\P{} \LimSup{} A_n = 0\]
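The usual argument bounds \(\P{} \LimSup{} A_n\) by the tail sums \(\sum_{n \ge N} \P{} A_n\) for every \(N\), and these tails shrink to zero when the full sum is finite. As a numeric illustration, assume \(\P{} A_n = 1/n^2\) (a made-up choice whose total is \(\pi^2/6\)):

```python
import math

def tail_sum(N):
    """Σ_{n≥N} 1/n², computed as π²/6 minus the partial sum up to N-1."""
    return math.pi**2 / 6 - sum(1 / n**2 for n in range(1, N))

# Upper bounds on P(limsup A_n) obtained from later and later tails.
bounds = [tail_sum(N) for N in (1, 10, 100, 1000)]
```

Every entry of `bounds` is an upper bound for the same probability, so that probability can only be zero.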
Proof
Second Borel-Cantelli lemma
Proof
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Random variable motivation
Random variable · Real random variable
A random variable is a map \(X : Ω \to{} \R\) from a probability space to \(\R\) where \(\{ω \in{} Ω : X(ω) \le{} x\}\) is an event for every \(x \in{} \R\). "The collection of outcomes assigned to \(x\) or below is an event."
Alternatively, \(X^{-1}(-\infty ..x]\) is an event for every \(x\). Recall that the intervals \((-\infty ..x]\) generate the Borel sets.
Random variable name
A random variable is in reality a function. However, it is very common to omit any mentions of \(Ω\) and \(ω\) and treat them as implicit. Doing so, \(X = X(ω)\) looks like a variable.
Probability measure pushforward
The fact that \(X^{-1}(-\infty ..x]\) is an event means that \(X\) is a measurable map. This gives us a pushforward measure of \(\P\) on \(\R\).
Random variables
Not random variables
Generalized random variable
A random variable but with a range \(\R{} \cup{} \{-\infty , \infty \}\).
Random vector
A random vector of dimension \(n\) is a function \(X : Ω \to{} \R^n\) with \(X(ω) = [X_1(ω), \dots , X_n(ω)]\).
Stochastic process
A stochastic process is a collection \(X_\underline{n}\) of random variables.
Exercise
Example
Law
The law of a random variable \(X\) is the pushforward measure of \(\P\) that acts on Borel sets.
\(μ_X = \P{} \circ{} X^{-1}\).
Distribution · CDF · Cumulative distribution function
The distribution of a random variable is the function \(\R{} \to{} [0..1]\) of a real number \(x\) that evaluates the law measure for the Borel set \((-\infty ..x]\).
The distribution of a random variable is the function \(F_X\) defined by \(F_X (x) = μ_X (-\infty ..x] = \P [ X \le{} x ]\)
Law & CDF equivalence
The law and the CDF of a random variable are equivalent. Given one, the other can be constructed.
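For a discrete law given by point masses, both directions are short computations: the CDF sums the masses up to \(x\), and differences of the CDF give back the law on half-open intervals. A sketch for a fair die (an illustrative choice), with hypothetical helper names:

```python
from fractions import Fraction

law = {k: Fraction(1, 6) for k in range(1, 7)}   # point masses of μ_X for a fair die

def cdf(law, x):
    """F_X(x) = μ_X(-∞..x] for a law given by point masses."""
    return sum(mass for value, mass in law.items() if value <= x)

def law_of_interval(F, a, b):
    """μ_X(a..b] = F_X(b) - F_X(a): the law on half-open intervals, from the CDF."""
    return F(b) - F(a)

F = lambda x: cdf(law, x)
mass_of_3_and_4 = law_of_interval(F, 2, 4)
```

Half-open intervals generate the Borel sets, which is why knowing the CDF is enough to determine the whole law.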
Distribution properties
The distribution function \(F_X\) of \(X\) satisfies:
\(F_X\) is an increasing function (not necessarily strictly).
\(F_X\) is right-continuous (because of the \(\le\) in its definition).
\(\Lim{x \to{} -\infty} F_X (x) = 0\) and \(\Lim{x \to{} \infty} F_X (x) = 1\).
Proof
Random variable via CDF
A function satisfying the CDF properties can serve as a CDF. This will then induce a law and a probability measure on \(Ω\).
Exercise
Discrete random variable
\(X\) takes values in a countable set. \(F_X\) then increases only through jumps; when the values are isolated points, \(F_X\) is a step function.
Continuous random variable
Mixed random variable
Absolutely continuous random variable
\(X\) is absolutely continuous if \(F_X\) can be written as \(F_X (x) = \int_{-\infty}^x f_X (y) dy\) for a non-negative integrable function \(f_X\).
Absolutely continuous function
Density function
The density function of \(X\) is the function \(f_X\) from the definition of an absolutely continuous random variable above, if it exists.
Integral of density
The integral of the density on \(\R\) must be \(1\). This follows from the fact that \(\Lim{x \to{} \infty} F_X (x) = 1\).
CDF Lebesgue derivative
By Lebesgue's differentiation theorem, \(F'_X (x) = f_X (x)\) for almost every \(x\), and \(f_X\) is unique up to equality almost everywhere.
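Both facts about densities can be checked numerically on a concrete case. The sketch assumes the exponential distribution with rate 1, with \(F_X (x) = 1 - e^{-x}\) and \(f_X (x) = e^{-x}\) for \(x \ge{} 0\); this example is not from the notes, and the step sizes are arbitrary.

```python
import math

def cdf(x):
    """F_X for the exponential distribution with rate 1."""
    return 1 - math.exp(-x) if x >= 0 else 0.0

def density(x):
    """f_X for the exponential distribution with rate 1."""
    return math.exp(-x) if x >= 0 else 0.0

# Central difference approximation of F'_X at x = 1.
h = 1e-5
derivative_at_1 = (cdf(1 + h) - cdf(1 - h)) / (2 * h)

# Midpoint rule for the integral of the density over [0, 50].
step = 0.001
total_mass = sum(density((k + 0.5) * step) * step for k in range(50_000))
```

`derivative_at_1` agrees with `density(1)` up to discretization error, and `total_mass` is close to 1.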
Non-absolutely continuous function
Examples of different distribution functions
Change of variables in Lebesgue integral
Random variable under diffeomorphism
Proof
Corollary
Proof
Remark
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Expectation
Let \(Ω = \MeasureSpace{Ω}{\mathcal{A}}\P\) be a probability space and \(X\) a random variable.
The expectation of \(X\) is \(\mathbb{E} [X] = \int_Ω X d\P\).
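When \(Ω\) is finite, the integral reduces to a weighted sum over outcomes. A minimal sketch, assuming two fair dice as the probability space and their sum as \(X\) (both chosen for illustration):

```python
from fractions import Fraction

# Ω: all ordered pairs of faces, each with probability 1/36.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = {w: Fraction(1, 36) for w in omega}
X = lambda w: w[0] + w[1]          # the random variable "sum of the two dice"

def expectation(X, P):
    """E[X] = ∫ X dP, which is a finite sum when Ω is finite."""
    return sum(X(w) * mass for w, mass in P.items())

mean = expectation(X, P)
```

Here `mean` is computed exactly with fractions rather than floats.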
Remark
Example
Example
Example
Law of the unconscious statistician
\[\E{} [g(X)] = \int_\R{} g dF_X\]
The integral is a Riemann-Stieltjes or Lebesgue-Stieltjes integral with respect to the increasing function \(F_X\), which induces a Lebesgue-Stieltjes measure.
When \(X\) is absolutely continuous with density \(f_X\), then \(\E{} [g(X)] = \int_\R{} g(x) f_X (x) dx\).
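In the discrete case the identity says that integrating \(g(X)\) over \(Ω\) gives the same number as integrating \(g\) against the law of \(X\). The sketch below compares the two for the sum of two fair dice and \(g(x) = x^2\); the space, the variable and `g` are illustrative assumptions.

```python
from fractions import Fraction

omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = {w: Fraction(1, 36) for w in omega}
X = lambda w: w[0] + w[1]
g = lambda x: x * x

# Left side: integrate g(X) over Ω with respect to P.
via_omega = sum(g(X(w)) * mass for w, mass in P.items())

# Right side: build the law μ_X = P ∘ X^{-1}, then integrate g against it.
law = {}
for w, mass in P.items():
    law[X(w)] = law.get(X(w), 0) + mass
via_law = sum(g(x) * mass for x, mass in law.items())
```

The second computation never looks at \(Ω\) again once the law is known, which is the practical point of the formula.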