Miscellaneous mathematical observations
2022-09-23 · # math
2022-09-23
Geometric intuition for the gradient.
Definition. For a scalar function $f: \R^n \to \R$, a contour or level set of $f$ is a parameterized curve $\bm{r}(t): \R \to \R^n$ along which the function value is constant; that is, there is a fixed constant $c \in \R$ such that

$$
f(\bm{r}(t)) = c \tag{1}
$$
Claim. The gradient $\nabla f$ is orthogonal to the contour at every point.
Proof. Take the derivative of both sides of (1) with respect to $t$.

$$
\begin{aligned}
f(\bm{r}(t)) &= c \\
\frac{df}{dt} &= 0
\end{aligned}
$$
Expanding using the multivariate chain rule,
$$
\begin{aligned}
\frac{df}{dt} = \frac{\partial f}{\partial r_1} \frac{d r_1}{dt} + \dotsb + \frac{\partial f}{\partial r_n} \frac{d r_n}{dt} &= 0 \\
\begin{pmatrix} \frac{\partial f}{\partial r_1} & \cdots & \frac{\partial f}{\partial r_n} \end{pmatrix}
\begin{pmatrix} \frac{d r_1}{dt} \\ \vdots \\ \frac{d r_n}{dt} \end{pmatrix} &= 0 \\
\langle \nabla f(\bm{r}(t)), \bm{r}'(t) \rangle &= 0
\end{aligned}
$$
which is what we wanted to show. $\square$
Thus the gradient is orthogonal to every tangent of the contour, so moving along the contour means moving orthogonally to the gradient. This should make intuitive sense, since the change in the function value after moving in the direction $\Delta \bm{x}$ is, to a first-order Taylor approximation,

$$
f(\bm{x} + \Delta \bm{x}) \approx f(\bm{x}) + \nabla f(\bm{x})^\top \Delta \bm{x}
$$
Therefore, if we want the function value not to change (which is the definition of a contour), the direction of movement must be orthogonal to the gradient, which is precisely what we have shown.
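
As a quick numerical sanity check (a minimal sketch, not part of the argument above), we can pick a concrete function, say $f(x, y) = x^2 + y^2$, parameterize one of its circular contours, and confirm that $\langle \nabla f(\bm{r}(t)), \bm{r}'(t) \rangle$ vanishes at sample values of $t$. The function and parameterization are illustrative choices only.

```python
import numpy as np

# Example function f(x, y) = x^2 + y^2 and its gradient.
def grad_f(p):
    x, y = p
    return np.array([2 * x, 2 * y])

# The contour f = c^2 is the circle r(t) = (c cos t, c sin t).
c = 2.0
def r(t):
    return np.array([c * np.cos(t), c * np.sin(t)])

def r_prime(t):
    return np.array([-c * np.sin(t), c * np.cos(t)])

# <grad f(r(t)), r'(t)> should be zero for every t.
for t in np.linspace(0.0, 2 * np.pi, 7):
    inner = grad_f(r(t)) @ r_prime(t)
    print(f"t = {t:.3f}   <grad f, r'> = {inner: .2e}")
```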
2022-09-23
A quick derivation of the Cauchy–Schwarz inequality.
Theorem (Cauchy–Schwarz inequality). Every pair of vectors $u, v$ satisfies

$$
\langle u, u \rangle \langle v, v \rangle \geq \langle u, v \rangle^2
$$
Proof. For vectors $u, v \in \R^n$, let $V \coloneqq \begin{pmatrix} u & v \end{pmatrix} \in \R^{n \times 2}$ be the matrix with $u$ and $v$ as columns.
$$
\begin{aligned}
\Theta \coloneqq V^\top V &= \begin{pmatrix} \langle u, u \rangle & \langle u, v \rangle \\ \langle v, u \rangle & \langle v, v \rangle \end{pmatrix} \\
\det(\Theta) = \langle u, u \rangle \langle v, v \rangle - \langle u, v \rangle^2 &\geq 0 \\
\implies \langle u, u \rangle \langle v, v \rangle &\geq \langle u, v \rangle^2
\end{aligned}
$$
where we use that $\Theta$ is a Gram matrix and therefore symmetric positive semi-definite ($\bm{a}^\top \Theta \bm{a} = \lVert V \bm{a} \rVert^2 \geq 0$ for every $\bm{a}$), so its determinant is nonnegative.
$\square$
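
The Gram-matrix argument is easy to verify numerically. Below is a minimal sketch (randomly drawn vectors, purely illustrative) that builds $\Theta = V^\top V$ and checks that $\det(\Theta) = \langle u, u \rangle \langle v, v \rangle - \langle u, v \rangle^2 \geq 0$.

```python
import numpy as np

rng = np.random.default_rng(0)

for _ in range(5):
    n = int(rng.integers(2, 10))
    u, v = rng.normal(size=n), rng.normal(size=n)

    V = np.column_stack([u, v])   # n x 2 matrix with u, v as columns
    Theta = V.T @ V               # Gram matrix of u and v

    lhs = (u @ u) * (v @ v)       # <u, u><v, v>
    rhs = (u @ v) ** 2            # <u, v>^2
    assert np.isclose(np.linalg.det(Theta), lhs - rhs)
    assert lhs >= rhs             # Cauchy–Schwarz
    print(f"<u,u><v,v> - <u,v>^2 = {lhs - rhs:.4f} >= 0")
```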
A similar line of reasoning can be used to show that the variance is nonnegative.
Theorem. For any random variable $X$, define its variance as

$$
\mathbb{V}\text{ar}[X] \coloneqq \mathbb{E}[X^2] - \mathbb{E}[X]^2
$$
Then $\mathbb{V}\text{ar}[X] \geq 0$.
The usual approach is to show that $\mathbb{V}\text{ar}[X] = \mathbb{E}[(X - \mathbb{E}[X])^2]$ and then argue that the expectation of a nonnegative quantity must be nonnegative. Alternatively, it follows directly from Jensen's inequality applied to the convex function $f(x) = x^2$. We will take a different approach.
Lemma. For any random variable $X$ with finite moments up to order $n$, let $\mu_i \coloneqq \mathbb{E}[X^i]$ be its $i$-th moment. Then the matrices collecting its moments,

$$
M_r \coloneqq \begin{pmatrix}
1 & \mu_1 & \mu_2 & \cdots & \mu_r \\
\mu_1 & \mu_2 & \mu_3 & \cdots & \mu_{r+1} \\
\mu_2 & \mu_3 & \mu_4 & \cdots & \mu_{r+2} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\mu_r & \mu_{r+1} & \mu_{r+2} & \cdots & \mu_{2r}
\end{pmatrix} \tag{2}
$$

are all positive semi-definite for $r = 1, 2, \dotsc, \lfloor n/2 \rfloor$.
Proof. Take the expectation of a squared polynomial in $X$ with arbitrary coefficients $a_0, \dotsc, a_r$; since the quantity inside the expectation is a square, the expectation is nonnegative.

$$
\mathbb{E}[(a_0 + a_1 X + a_2 X^2 + \dotsb + a_r X^r)^2] \geq 0
$$
Expanding the square directly, we have
$$
\mathbb{E}\left[\sum_{0 \leq i, j \leq r} a_i a_j X^{i+j}\right]
= \sum_{0 \leq i, j \leq r} a_i a_j \, \mathbb{E}[X^{i+j}]
= \sum_{0 \leq i, j \leq r} a_i a_j \, \mu_{i+j}
$$
where we use the linearity of expectation and the definition of moments.
But this is precisely the quadratic form $\bm{a}^\top M \bm{a}$ for $\bm{a} \coloneqq (a_0, \dotsc, a_r)$ and $M_{i,j} \coloneqq \mu_{i+j}$, matching the definition in (2).
Since $\bm{a}^\top M \bm{a} \geq 0$ holds for any $\bm{a} \in \R^{r+1}$, $M$ is positive semi-definite by definition.
$\square$
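
To see the lemma in action, here is a small sketch that builds $M_r$ for a concrete random variable. An $\operatorname{Exp}(1)$ random variable is used purely as an example; its moments are $\mathbb{E}[X^i] = i!$, so the entries of $M_r$ are exact.

```python
import math
import numpy as np

# Moments of an Exp(1) random variable: E[X^i] = i!.
def mu(i):
    return float(math.factorial(i))

# Hankel moment matrix M_r with entry (i, j) = mu_{i+j}, for 0 <= i, j <= r.
def moment_matrix(r):
    return np.array([[mu(i + j) for j in range(r + 1)] for i in range(r + 1)])

for r in range(1, 5):
    M = moment_matrix(r)
    eigvals = np.linalg.eigvalsh(M)    # M is symmetric, so use eigvalsh
    print(f"r = {r}: smallest eigenvalue = {eigvals.min():.6f}")
    assert eigvals.min() >= -1e-9      # positive semi-definite
```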
Now, to prove the nonnegativity of the variance, we need only apply the lemma with $r = 1$.
Proof. Take $M_1 = \left(\begin{smallmatrix} 1 & \mu_1 \\ \mu_1 & \mu_2 \end{smallmatrix}\right) = \left(\begin{smallmatrix} 1 & \mathbb{E}[X] \\ \mathbb{E}[X] & \mathbb{E}[X^2] \end{smallmatrix}\right)$, the $2 \times 2$ case of (2). Since $M_1$ is positive semi-definite, its determinant is nonnegative, so we have $\det(M_1) = \mathbb{E}[X^2] - \mathbb{E}[X]^2 \geq 0$, showing $\mathbb{V}\text{ar}[X] \geq 0$.
$\square$
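
Finally, the same determinant computation can be checked numerically for an arbitrary small discrete distribution (the values and probabilities below are made up purely for illustration):

```python
import numpy as np

# A small discrete random variable: values with probabilities (example data).
values = np.array([-1.0, 0.0, 2.0, 5.0])
probs = np.array([0.1, 0.4, 0.3, 0.2])
assert np.isclose(probs.sum(), 1.0)

m1 = values @ probs            # E[X]
m2 = (values ** 2) @ probs     # E[X^2]

M1 = np.array([[1.0, m1], [m1, m2]])
var = m2 - m1 ** 2             # Var[X] = E[X^2] - E[X]^2

print(f"det(M1) = {np.linalg.det(M1):.6f}, Var[X] = {var:.6f}")
assert np.isclose(np.linalg.det(M1), var) and var >= 0
```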
2023-06-02
A joke transcribed from page 9 of 『数学女子: 1』 (by 安田まさえ) [1].
[1] 安田まさえ, 数学女子: 1. Tōkyō: 竹書房, 2010.