Eigendecompositions

Eigenvalues, eigenvectors, eigendecomposition, and many other eigen-things

Eigenvalues, eigenvectors, and eigendecompositions show up in many surprising places in applications.

Eigenvalues, revisited

For a square matrix $A$, a complex scalar $\lambda$ is an eigenvalue of $A$ if \[ A - \lambda I \] is singular.

An $n \times n$ matrix has at most $n$ eigenvalues. Indeed, the eigenvalues are exactly the roots of the "minimal polynomial" of $A$.

For a square matrix $A$, its minimal polynomial is the polynomial $P$ of least degree and leading coefficient $1$ such that \[ P(A) = \mathbf{0}. \]

The degree of the minimal polynomial of an $n \times n$ matrix is at most $n$.

Direct computation of such a polynomial is rarely needed in applications and numerical computations. Its existence is what's important: it tells us how many eigenvalues a matrix can have and how they are related.
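Although we never compute minimal polynomials in practice, it is easy to check numerically that a matrix annihilates a low-degree polynomial. The matrices below are just illustrative choices (the second one reappears later in the $2 \times 2$ eigendecomposition example); this is a sanity check in NumPy, not a computational method.

```python
import numpy as np

# For A = 2I, the characteristic polynomial is (x - 2)^2, but the minimal
# polynomial is just (x - 2): A - 2I is already the zero matrix.
A = 2 * np.eye(2)
print(A - 2 * np.eye(2))                # the zero matrix, so degree 1 suffices

# For B, the minimal polynomial has full degree 2: (x - 3)(x - 2),
# since neither B - 3I nor B - 2I is zero on its own.
B = np.array([[8.0, 10.0],
              [-3.0, -3.0]])
print(B @ B - 5 * B + 6 * np.eye(2))    # B^2 - 5B + 6I = 0
```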

Eigenspace and rank

From the definition, we observed that $A$ is singular if and only if $0$ is an eigenvalue of $A$.

Indeed, we can get rank information from the dimension of the corresponding eigenspace.

For an $n \times n$ matrix $A$, the nullity of $A$ equals the dimension of the eigenspace associated with the zero eigenvalue.

Recall that this dimension is called the geometric multiplicity of the zero eigenvalue. This multiplicity is 0 if 0 is not an eigenvalue of $A$.

So, by the Rank-Nullity Theorem, \[ \operatorname{rank} A = n - \dim(E_A(0)) \] where $E_A(0)$ is the eigenspace associated with the zero eigenvalue of $A$, and $\dim(E_A(0))$ is its geometric multiplicity.
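As a quick numerical illustration (the rank-1 matrix below is an arbitrary choice, and it is symmetric, so the number of zero eigenvalues really is the geometric multiplicity of the zero eigenvalue), we can recover the rank from the dimension of $E_A(0)$:

```python
import numpy as np

# A rank-1 symmetric matrix: every row is a multiple of the first row.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 6.0, 9.0]])

eigvals = np.linalg.eigvalsh(A)            # eigenvalues of a symmetric matrix
dim_E0 = np.sum(np.isclose(eigvals, 0.0))  # dimension of the eigenspace E_A(0)

print(eigvals)                             # two (numerically) zero eigenvalues and 14
print(A.shape[0] - dim_E0)                 # rank A = n - dim(E_A(0)) = 1
print(np.linalg.matrix_rank(A))            # agrees: 1
```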

Eigenvalues and numerical rank

Indeed, eigenvalues and their geometric multiplicities provide something even better.

Consider the matrix \[ A = \begin{bmatrix} 1 & 2 \\ 1 & 1.999999 \end{bmatrix} \]

We can see $\operatorname{rank} A = 2$, since $\det A = -0.000001 \neq 0$.

However, the two rows are very close to being identical, i.e., $A$ is extremely close to being of rank 1.

The concept of rank cannot capture such subtlety. (The rank of $A$ cannot be "close to, but not equal to, 2".)

Eigenvalues can clarify this: The eigenvalues of $A$ are \[ \lambda_1 \approx 2.99999933333, \qquad \lambda_2 \approx -0.00000033333, \] and the geometric multiplicity of the second eigenvalue is 1.

This "nearly zero" eigenvalue signals that $A$ is "nearly" of rank 1, and its magnitude gives us a way to quantify this statement.

Computing eigenvalues

We saw a simple trick: for \[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

its eigenvalues $\lambda_1,\lambda_2$ satisfy \begin{align*} \lambda_1 \lambda_2 &= ad - bc \\ \lambda_1 + \lambda_2 &= a + d \end{align*}
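A quick numerical check of this trick, on an arbitrarily chosen $2 \times 2$ matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lam = np.linalg.eigvals(A)           # eigenvalues 5 and 2

print(lam.prod(), np.linalg.det(A))  # both 10 = ad - bc (up to rounding)
print(lam.sum(), np.trace(A))        # both 7 = a + d
```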

So how do people compute eigenvalues for larger matrices?

In the classroom: In nearly 100% of standard textbooks, students are taught to compute the eigenvalues of an $n \times n$ matrix $A$ by solving the polynomial equation \[ p_A (\lambda) = \det(\lambda I - A) = 0 \]

This technique is useless for all but the most trivial cases.
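For a sense of what that classroom recipe looks like, here is the symbolic version in SymPy, applied to a small illustrative matrix (the same one used in the $2 \times 2$ example later). Beyond such tiny cases, forming and solving $\det(\lambda I - A) = 0$ quickly becomes impractical.

```python
from sympy import Matrix, eye, symbols, solve

lam = symbols('lam')
A = Matrix([[8, 10],
            [-3, -3]])

p = (lam * eye(2) - A).det()   # characteristic polynomial: lam**2 - 5*lam + 6
print(solve(p, lam))           # [2, 3]
```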

In the real world: This is a multi-billion dollar business. Trillions?

  • Power methods
  • QR algorithm (not QR-decomposition)
  • ......

Computing eigenvalues of large matrices efficiently and accurately is an important and difficult problem in computational mathematics.
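To give a flavor of these iterative methods, here is a minimal sketch of the power method (not production code). It assumes $A$ has a single eigenvalue of largest magnitude and that the starting vector has a component along the corresponding eigenvector.

```python
import numpy as np

def power_iteration(A, num_iters=1000):
    """Repeatedly apply A and renormalize; the iterate converges to a
    dominant eigenvector under the assumptions stated above."""
    v = np.random.default_rng(0).standard_normal(A.shape[0])
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    return (v @ A @ v) / (v @ v), v   # Rayleigh quotient estimates the eigenvalue

A = np.array([[8.0, 10.0],
              [-3.0, -3.0]])
lam, v = power_iteration(A)
print(lam)   # approx 3, the eigenvalue of largest magnitude
```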

Eigenvectors, revisited

For an $n \times n$ matrix $A$ and an eigenvalue $\lambda$ of $A$, an eigenvector of $A$ associated with $\lambda$ is a nonzero vector $\mathbf{v} \in \mathbb{R}^n$ such that \[ A \mathbf{v} = \lambda \mathbf{v} \]

That is, $\mathbf{v}$ satisfies \[ (A - \lambda I) \mathbf{v} = \mathbf{0}, \] i.e., $\mathbf{v}$ is in the null space of the square matrix $A - \lambda I$.

There is really no difference between "an eigenvector of $A$ associated with $\lambda$" and "a nonzero null vector of the matrix $A - \lambda I$".

Consequently, finding the eigenvectors associated with $\lambda$ is equivalent to the problem of finding the null space of \[ A - \lambda I \]

In practice, eigenvalues and eigenvectors are usually computed at the same time through the same process.
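For instance, NumPy's np.linalg.eig returns the eigenvalues and a matrix of matching eigenvectors from a single call; here is a quick check that each returned column really satisfies $A\mathbf{v} = \lambda \mathbf{v}$:

```python
import numpy as np

A = np.array([[8.0, 10.0],
              [-3.0, -3.0]])
eigvals, eigvecs = np.linalg.eig(A)      # column i of eigvecs pairs with eigvals[i]

for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))   # True for each pair
```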

Eigendecompositions

Matrix decomposition is the process of reducing a matrix into certain simple canonical forms. Decompositions are important tools in linear algebra and its applications in many fields.

"Eigenvalue decomposition" a.k.a. "eigendecomposition", which reduces a matrix into a product of a diagonal matrix formed by eigenvalues and invertible matrices created from the eigenvectors. In the area of differentiation equation, it is the key for solving system of linear ordinary differential equations. In classical mechanics, eigenvalue decomposition is used to study natural mode of vibrations. In machine learning, it plays an important role in the principal component analysis method.

Eigendecomposition: a $2 \times 2$ example

Consider the $2 \times 2$ matrix \[ A = \begin{bmatrix} 8 & 10 \\ -3 & -3 \end{bmatrix} \]

Its eigenvalues are $\lambda_1 = 3, \lambda_2 = 2$.

We can find two eigenvectors \[ \begin{bmatrix} -2 \\ 1 \end{bmatrix} , \begin{bmatrix} -5/3 \\ 1 \end{bmatrix} \] corresponding to $\lambda_1,\lambda_2$, respectively.

Observe that these two eigenvectors are linearly independent, and \[ A = \left[ \begin{smallmatrix} -2 & -5/3 \\ 1 & 1 \end{smallmatrix} \right] \left[ \begin{smallmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{smallmatrix} \right] \left[ \begin{smallmatrix} -2 & -5/3 \\ 1 & 1 \end{smallmatrix} \right]^{-1} \] This is an "eigendecomposition" of $A$.
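A quick numerical verification of this particular factorization (just multiplying the factors back together):

```python
import numpy as np

X = np.array([[-2.0, -5.0 / 3.0],
              [1.0, 1.0]])
Lam = np.diag([3.0, 2.0])           # eigenvalues 3 and 2 on the diagonal

print(X @ Lam @ np.linalg.inv(X))   # recovers [[ 8. 10.] [-3. -3.]]
```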

Eigendecompositions

For an $n \times n$ matrix $A$, if there are $n$ linearly independent eigenvectors $\{ \mathbf{v}_1, \dots, \mathbf{v}_n \}$ associated with all eigenvalues of $A$, then $A$ can be factorized as \[ \begin{aligned} A &= X \Lambda X^{-1} &&\text{where} & X &= [\; \mathbf{v}_1 \; \cdots \; \mathbf{v}_n \;] \end{aligned} \] and $\Lambda$ is the diagonal matrix whose diagonal entries are the eigenvalues of $A$ corresponding to the eigenvectors in $X$.

This is an eigendecomposition (eigenvalue decomposition) of $A$. Such a decomposition is not unique (it depends on your choice of $X$).
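As a sketch (assuming the matrix is diagonalizable), np.linalg.eig already returns the ingredients $X$ and the eigenvalues, and rescaling the columns of $X$ shows that the decomposition is not unique:

```python
import numpy as np

A = np.array([[8.0, 10.0],
              [-3.0, -3.0]])
eigvals, X = np.linalg.eig(A)
Lam = np.diag(eigvals)

print(np.allclose(A, X @ Lam @ np.linalg.inv(X)))    # True

# Rescaling the eigenvectors gives a different X that works just as well.
X2 = X @ np.diag([2.0, -7.0])
print(np.allclose(A, X2 @ Lam @ np.linalg.inv(X2)))  # still True
```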