Disintegration theorem

In mathematics, the disintegration theorem is a result in measure theory and probability theory. It rigorously defines the idea of a non-trivial "restriction" of a measure to a measure-zero subset of the measure space in question. It is related to the existence of conditional probability measures. In a sense, "disintegration" is the opposite process to the construction of a product measure.

Motivation

Consider the unit square $S=[0,1]\times [0,1]$ in the Euclidean plane $\mathbb {R} ^{2}$ . Consider the probability measure $\mu$ defined on $S$ by the restriction of two-dimensional Lebesgue measure $\lambda ^{2}$ to $S$ . That is, the probability of an event $E\subseteq S$ is simply the area of $E$ . We assume $E$ is a measurable subset of $S$ .

Consider a one-dimensional subset of $S$ such as the line segment $L_{x}=\{x\}\times [0,1]$. $L_{x}$ has $\mu$-measure zero and, since the Lebesgue measure space is a complete measure space, every subset of $L_{x}$ is a $\mu$-null set:

E\subseteq L_{x}\implies \mu (E)=0.

While true, this is somewhat unsatisfying. It would be nice to say that $\mu$ "restricted to" $L_{x}$ is the one-dimensional Lebesgue measure $\lambda ^{1}$ , rather than the zero measure. The probability of a "two-dimensional" event $E$ could then be obtained as an integral of the one-dimensional probabilities of the vertical "slices" $E\cap L_{x}$ : more formally, if $\mu _{x}$ denotes one-dimensional Lebesgue measure on $L_{x}$ , then

\mu (E)=\int _{[0,1]}\mu _{x}(E\cap L_{x})\,\mathrm {d} x

for any "nice" $E\subseteq S$. The disintegration theorem makes this argument rigorous in the context of measures on metric spaces.
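The slicing formula can be checked numerically for one concrete set. The following is a minimal sketch, assuming $E$ is the disc of radius $0.5$ centred at $(0.5,0.5)$ (a choice made here for illustration, not taken from the text above); the function names are hypothetical. It approximates the integral of the slice lengths with a midpoint rule and compares the result with the area of the disc.

```python
import math


def slice_length(x: float) -> float:
    """Length of the vertical slice E ∩ L_x, where E is the closed disc
    of radius 0.5 centred at (0.5, 0.5). This E is an illustrative choice."""
    half_chord = math.sqrt(max(0.0, 0.25 - (x - 0.5) ** 2))
    return 2.0 * half_chord


def integrate_slices(n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of slice_length over [0, 1]."""
    h = 1.0 / n
    return h * sum(slice_length((i + 0.5) * h) for i in range(n))


area_from_slices = integrate_slices()
exact_area = math.pi * 0.25  # area of a disc of radius 0.5

print(area_from_slices, exact_area)
```

The two printed values should agree to several decimal places, which is the identity $\mu (E)=\int _{[0,1]}\mu _{x}(E\cap L_{x})\,\mathrm {d} x$ for this particular $E$.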

Statement of the theorem

(Hereafter, ${\mathcal {P}}(X)$ will denote the collection of Borel probability measures on a topological space $(X,T)$ .) The assumptions of the theorem are as follows:

Let $Y$ and $X$ be two Radon spaces (i.e. topological spaces on which every Borel probability measure is inner regular, e.g. Polish spaces, that is, separable completely metrizable spaces; in particular, every Borel probability measure on such a space is outright a Radon measure).
Let $\mu \in {\mathcal {P}}(Y)$ .
Let $\pi :Y\to X$ be a Borel-measurable function. Here one should think of $\pi$ as a function to "disintegrate" $Y$, in the sense of partitioning $Y$ into $\{\pi ^{-1}(x)\ |\ x\in X\}$. For example, for the motivating example above, one can define $\pi ((a,b))=a$ for $(a,b)\in [0,1]\times [0,1]$, which gives $\pi ^{-1}(a)=\{a\}\times [0,1]$, a slice we want to capture.
Let $\nu \in {\mathcal {P}}(X)$ be the pushforward measure $\nu =\pi _{*}(\mu )=\mu \circ \pi ^{-1}$ . This measure provides the distribution of $x$ (which corresponds to the events $\pi ^{-1}(x)$ ).

The conclusion of the theorem: There exists a $\nu$ - almost everywhere uniquely determined family of probability measures $\{\mu _{x}\}_{x\in X}\subseteq {\mathcal {P}}(Y)$ , which provides a "disintegration" of $\mu$ into $\{\mu _{x}\}_{x\in X}$ , such that:

the function $x\mapsto \mu _{x}$ is Borel measurable, in the sense that $x\mapsto \mu _{x}(B)$ is a Borel-measurable function for each Borel-measurable set $B\subseteq Y$ ;
$\mu _{x}$ "lives on" the fiber $\pi ^{-1}(x)$ : for $\nu$ - almost all $x\in X$ , $\mu _{x}\left(Y\setminus \pi ^{-1}(x)\right)=0,$ and so $\mu _{x}(E)=\mu _{x}(E\cap \pi ^{-1}(x))$ ;
for every Borel-measurable function $f:Y\to [0,\infty ]$ , $\int _{Y}f(y)\,\mathrm {d} \mu (y)=\int _{X}\int _{\pi ^{-1}(x)}f(y)\,\mathrm {d} \mu _{x}(y)\,\mathrm {d} \nu (x).$ In particular, for any event $E\subseteq Y$ , taking $f$ to be the indicator function of $E$ ,^[1] $\mu (E)=\int _{X}\mu _{x}(E)\,\mathrm {d} \nu (x).$
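On a finite space the three conclusions can be verified by direct computation. The sketch below is an illustration under simplifying assumptions, not the general construction: the set `Y`, the weights and the test function `f` are all hypothetical, and every fiber has positive mass, so $\mu _{x}$ can be obtained by restricting $\mu$ to the fiber and renormalising. The theorem matters precisely because this renormalisation is unavailable when fibers have measure zero.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical finite example: Y has six points, and pi sends (x, label) to x.
Y = [(0, "a"), (0, "b"), (1, "a"), (1, "b"), (1, "c"), (2, "a")]
weights = [1, 2, 3, 1, 2, 3]  # unnormalised masses, total 12
mu = {y: Fraction(w, 12) for y, w in zip(Y, weights)}


def pi(y):
    """Map a point of Y to the label of the fiber containing it."""
    return y[0]


# Pushforward nu = pi_* mu: add up the mass of each fiber.
nu = defaultdict(Fraction)
for y, mass in mu.items():
    nu[pi(y)] += mass

# mu_x is mu restricted to the fiber over x, renormalised by nu(x) > 0.
mu_x = {
    x: {y: (mass / nu[x] if pi(y) == x else Fraction(0)) for y, mass in mu.items()}
    for x in nu
}


def integrate(f, measure):
    """Integral of f against a finitely supported measure."""
    return sum(f(y) * mass for y, mass in measure.items())


def f(y):
    """An arbitrary non-negative test function on Y."""
    return Fraction(ord(y[1]) - ord("a") + y[0] ** 2)


lhs = integrate(f, mu)                                 # integral of f over Y
rhs = sum(integrate(f, mu_x[x]) * nu[x] for x in nu)   # iterated integral
print(lhs == rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding error.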

Applications

Product spaces

The original example was a special case of the problem of product spaces, to which the disintegration theorem applies.

When $Y$ is written as a Cartesian product $Y=X_{1}\times X_{2}$ and $\pi _{i}:Y\to X_{i}$ is the natural projection, then each fibre $\pi _{1}^{-1}(x_{1})$ can be canonically identified with $X_{2}$ and there exists a Borel family of probability measures $\{\mu _{x_{1}}\}_{x_{1}\in X_{1}}$ in ${\mathcal {P}}(X_{2})$ (which is $(\pi _{1})_{*}(\mu )$ -almost everywhere uniquely determined) such that

\mu =\int _{X_{1}}\mu _{x_{1}}\,\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right)=\int _{X_{1}}\mu _{x_{1}}\,\mathrm {d} (\pi _{1})_{*}(\mu )(x_{1}),

which is in particular

\int _{X_{1}\times X_{2}}f(x_{1},x_{2})\,\mu (\mathrm {d} x_{1},\mathrm {d} x_{2})=\int _{X_{1}}\left(\int _{X_{2}}f(x_{1},x_{2})\mu (\mathrm {d} x_{2}\mid x_{1})\right)\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right)

and

\mu (A\times B)=\int _{A}\mu \left(B\mid x_{1}\right)\,\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right).

The relation to conditional expectation is given by the identities

\operatorname {E} (f\mid \pi _{1})(x_{1})=\int _{X_{2}}f(x_{1},x_{2})\mu (\mathrm {d} x_{2}\mid x_{1}),

\mu (A\times B\mid \pi _{1})(x_{1})=1_{A}(x_{1})\cdot \mu (B\mid x_{1}).
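For a finite product space these identities reduce to the familiar rule "joint equals conditional times marginal". The following sketch assumes a hypothetical joint law on $X_{1}\times X_{2}$ with $X_{1}=\{0,1\}$ and $X_{2}=\{0,1,2\}$; the table entries and function names are made up for illustration. It builds the kernel $\mu (\mathrm {d} x_{2}\mid x_{1})$ and evaluates the rectangle formula and the conditional expectation.

```python
from fractions import Fraction

# Hypothetical joint law mu on X1 x X2 (entries sum to 1).
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10), (0, 2): Fraction(1, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(1, 10), (1, 2): Fraction(2, 10),
}
X1 = sorted({x1 for x1, _ in joint})
X2 = sorted({x2 for _, x2 in joint})

# First marginal (pi_1)_* mu.
marginal = {x1: sum(joint[(x1, x2)] for x2 in X2) for x1 in X1}

# Kernel mu(dx2 | x1): one probability measure on X2 for each x1.
kernel = {x1: {x2: joint[(x1, x2)] / marginal[x1] for x2 in X2} for x1 in X1}


def rectangle_mass(A, B):
    """mu(A x B) computed directly from the joint law."""
    return sum(p for (x1, x2), p in joint.items() if x1 in A and x2 in B)


def rectangle_mass_disintegrated(A, B):
    """mu(A x B) as the integral over A of mu(B | x1) against the marginal.
    Assumes A is a subset of X1 and B is a subset of X2."""
    total = Fraction(0)
    for x1 in A:
        mass_of_B_given_x1 = sum(kernel[x1][x2] for x2 in B)
        total += mass_of_B_given_x1 * marginal[x1]
    return total


def conditional_expectation(f, x1):
    """E(f | pi_1)(x1): integral of f(x1, .) against the kernel at x1."""
    return sum(f(x1, x2) * q for x2, q in kernel[x1].items())


print(rectangle_mass({1}, {0, 2}), rectangle_mass_disintegrated({1}, {0, 2}))
```

Both printed values are computed from the same table, once directly and once through the kernel, so they should coincide for any choice of `A` and `B`.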

Vector calculus

The disintegration theorem can also be seen as justifying the use of a "restricted" measure in vector calculus. For instance, in Stokes' theorem as applied to a vector field flowing through a compact surface $\Sigma \subset \mathbb {R} ^{3}$, it is implicit that the "correct" measure on $\Sigma$ is the disintegration of three-dimensional Lebesgue measure $\lambda ^{3}$ on $\Sigma$, and that the disintegration of this measure on $\partial \Sigma$ is the same as the disintegration of $\lambda ^{3}$ on $\partial \Sigma$.^[2]

Conditional distributions

The disintegration theorem can be applied to give a rigorous treatment of conditional probability distributions in statistics, while avoiding purely abstract formulations of conditional probability.^[3]
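As a worked illustration (an example chosen here, not taken from the cited reference), let $\mu$ be the uniform probability measure on the triangle $T=\{(x,y):0\leq y\leq x\leq 1\}$ and let $\pi (x,y)=x$. Every fiber has $\mu$-measure zero, yet the disintegration can be written down explicitly and recovers the expected conditional distribution of the second coordinate given the first:

```latex
\begin{align*}
\mathrm{d}\mu(x,y) &= 2\cdot 1_{T}(x,y)\,\mathrm{d}x\,\mathrm{d}y,
  &&\text{(uniform density on $T$, which has area $1/2$)}\\
\mathrm{d}\nu(x) &= \left(\int_{0}^{x} 2\,\mathrm{d}y\right)\mathrm{d}x = 2x\,\mathrm{d}x,
  &&\text{(pushforward of $\mu$ under $\pi$)}\\
\mu_{x} &= \delta_{x}\otimes \tfrac{1}{x}\,\lambda^{1}\big|_{[0,x]}
  \quad\text{for } 0<x\leq 1,
  &&\text{(uniform on the fiber $\{x\}\times[0,x]$)}\\
\int_{X}\int_{\pi^{-1}(x)} f\,\mathrm{d}\mu_{x}\,\mathrm{d}\nu(x)
  &= \int_{0}^{1}\left(\frac{1}{x}\int_{0}^{x} f(x,y)\,\mathrm{d}y\right)2x\,\mathrm{d}x
   = \int_{T} 2f(x,y)\,\mathrm{d}y\,\mathrm{d}x
   = \int_{Y} f\,\mathrm{d}\mu .
\end{align*}
```

In particular, given that the first coordinate equals $x$, the second coordinate is uniformly distributed on $[0,x]$, with conditional mean $x/2$.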

References

[1] Dellacherie, C.; Meyer, P.-A. (1978). Probabilities and Potential. North-Holland Mathematics Studies. Amsterdam: North-Holland. ISBN 0-7204-0701-X.
[2] Ambrosio, L.; Gigli, N.; Savaré, G. (2005). Gradient Flows in Metric Spaces and in the Space of Probability Measures. ETH Zürich, Birkhäuser Verlag, Basel. ISBN 978-3-7643-2428-5.
[3] Chang, J.T.; Pollard, D. (1997). "Conditioning as disintegration". Statistica Neerlandica. 51 (3): 287. CiteSeerX 10.1.1.55.7544. doi:10.1111/1467-9574.00056. S2CID 16749932.
