Matrices

Rectangular arrays of numbers.

Matrices are rectangular arrays of numbers with which we can carry out complicated algebraic operations.

Matrices

A matrix is simply a rectangular array of numbers.

E.g., \[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \]

A matrix that has $m$ rows and $n$ columns is called an $m \times n$ matrix.

That is, we always follow the "rows-by-columns" convention when describing the size (dimension) of a matrix.
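E.g., the matrix displayed above has 2 rows and 3 columns, so it is a $2 \times 3$ matrix (and not a $3 \times 2$ matrix).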

Subscripts of entries

When using symbols to represent entries in a matrix, we write something like \[ \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \]

Note the ordering in the subscripts: $a_{12}$ and $a_{21}$ represent two different entries.
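E.g., in the matrix above, $a_{12}$ is the entry in row 1, column 2, while $a_{21}$ is the entry in row 2, column 1. In general, $a_{ij}$ denotes the entry in row $i$ and column $j$.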

Square matrices

If a matrix has the same number of rows and columns, i.e., it is an $n \times n$ matrix, then we call it a square matrix.

E.g., \[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \]

Row/column vectors are matrices too

As far as algebra is concerned, matrices of only one column behave just like column vectors.

E.g., \[ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]

Similarly, matrices of only one row behave just like row vectors.

E.g., \[ \begin{bmatrix} 1 & 2 & 3 \end{bmatrix} \]

Adding two matrices together

We can add two matrices of the same size together simply by adding the corresponding entries.

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} x & y \\ z & w \end{bmatrix} = \begin{bmatrix} a+x & b+y \\ c+z & d+w \end{bmatrix}. \]
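E.g., with concrete numbers (chosen purely for illustration), \[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix}. \]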

We do not allow matrices of different sizes to be added together.

This definition is consistent with the way we define vector sums.

The zero matrix

The zero matrix, of any size, is the matrix all of whose entries are zero.

E.g., the $2 \times 3$ zero matrix is \[ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]

In general, a zero matrix (dimension usually clear from context) is \[ \mathbf{0} = \begin{bmatrix} 0 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 0 \end{bmatrix}. \]

The sum of the zero matrix and any matrix $A$ of the same size is still $A$: \[ \mathbf{0} + A = A + \mathbf{0} = A. \] So it really behaves like the number "0" in the world of matrices.

Scalar multiples of matrices

E.g., for any (real) number $r$, \[ r \cdot \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} ra & rb \\ rc & rd \end{bmatrix}. \]

In general, for a scalar $r$, \[ r \cdot \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} r \, a_{11} & \cdots & r \, a_{1n} \\ \vdots & \ddots & \vdots \\ r \, a_{m1} & \cdots & r \, a_{mn} \end{bmatrix} \]
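E.g., with $r = 3$ (numbers chosen purely for illustration), \[ 3 \cdot \begin{bmatrix} 1 & -2 \\ 0 & 4 \end{bmatrix} = \begin{bmatrix} 3 & -6 \\ 0 & 12 \end{bmatrix}. \]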

$A$ and $rA$ always have the same size (for any scalar $r$).

As expected, $-A$ simply means $(-1)A$.

Matrix transpose

This is an operation that reflects the entries of a matrix across the main diagonal (upper left to lower right).

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix}^\top \;=\; \begin{bmatrix} a & c \\ b & d \end{bmatrix}. \]

Similarly, \[ \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ \end{bmatrix}^\top \;=\; \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ a_{13} & a_{23} \\ \end{bmatrix}. \]

This operation simply turns rows into columns.

In general, the transpose of an $m \times n$ matrix $A$ is an $n \times m$ matrix denoted by $A^\top$, with $[ a_{ij} ]^\top = [ a_{ji} ]$; that is, the $(i,j)$ entry of $A^\top$ is $a_{ji}$. Clearly, $(A^\top)^\top = A$.
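E.g., with concrete numbers, \[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^\top = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}, \] so a $2 \times 3$ matrix becomes a $3 \times 2$ matrix.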

Matrix-vector product

We can also define the product of a $2 \times 2$ matrix and a vector in $\mathbb{R}^2$.

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \, \begin{bmatrix} x \\ y \end{bmatrix} \;=\; \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix}. \]

The two resulting entries are exactly the dot products of the two rows with the vector, respectively.
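E.g., with concrete numbers (chosen purely for illustration), \[ \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 4 \\ 5 \end{bmatrix} = \begin{bmatrix} 2 \cdot 4 + 1 \cdot 5 \\ 0 \cdot 4 + 3 \cdot 5 \end{bmatrix} = \begin{bmatrix} 13 \\ 15 \end{bmatrix}. \]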

Compute the matrix-vector product \[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \, \begin{bmatrix} x \\ y \end{bmatrix} \]

Matrix-vector product

In general, we can multiply an $m \times n$ matrix (left) with a column vector in $\mathbb{R}^n$ (right) via the formula \[ \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} a_{11} x_1 + \cdots + a_{1n} x_n \\ \vdots \\ a_{m1} x_1 + \cdots + a_{mn} x_n \\ \end{bmatrix} \]

The result is a column vector in $\mathbb{R}^m$.

And the entries are exactly the dot products between the rows of the matrix and the vector.

This multiplication is only possible when the number of columns of the matrix (left) matches the number of entries in the vector (right).
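E.g., a $2 \times 3$ matrix times a vector in $\mathbb{R}^3$ gives a vector in $\mathbb{R}^2$ (numbers chosen purely for illustration): \[ \begin{bmatrix} 1 & 0 & 2 \\ -1 & 3 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \cdot 2 + 0 \cdot 1 + 2 \cdot 0 \\ (-1) \cdot 2 + 3 \cdot 1 + 1 \cdot 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}. \]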

Linearity

For an $m \times n$ matrix $A$, a vector $\mathbf{v} \in \mathbb{R}^n$, and a real number $r$, it is easy to verify that \[ A (r \, \mathbf{v}) = r \, A \, \mathbf{v}. \] (The entries of the resulting vector are dot products.)

Similarly, for a $m \times n$ matrix $A$ and two vectors $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$, we can also verify that \[ A (\mathbf{u} + \mathbf{v}) = A \, \mathbf{u} \, + \, A \, \mathbf{v} \]
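As a quick sanity check with concrete numbers (chosen purely for illustration), take $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$, $\mathbf{u} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\mathbf{v} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$: \[ A(\mathbf{u} + \mathbf{v}) = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \qquad A\mathbf{u} + A\mathbf{v} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}. \]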

Therefore, the function $\mathbf{v} \mapsto A \mathbf{v}$ is a linear function.

Indeed, we will show later that all linear functions from $\mathbb{R}^n$ to $\mathbb{R}^m$ can be represented by matrix-vector products in this way.

More algebraic properties

It is also easy to verify that for two $m \times n$ matrices $A$ and $B$ and a vector $\mathbf{v} \in \mathbb{R}^n$, \[ (A+B) \mathbf{v} = A \mathbf{v} + B \mathbf{v}. \]

Similarly, \[ (-A) \mathbf{v} = -(A \mathbf{v}) = A(-\mathbf{v}). \]

Non-commutativity

In a matrix-vector product, the order in which we write the factors is very important: for an $m \times n$ matrix $A$ and a vector $\mathbf{x} \in \mathbb{R}^n$, \[ A \mathbf{x} \] makes sense (as we have defined). However, \[ \mathbf{x} A \] does not make sense.

Can you see why $\mathbf{x} A$ does not make sense?

Identity matrix

The $n \times n$ identity matrix, denoted $I_n$, is the matrix \[ I_n = \begin{bmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{bmatrix}. \] (missing entries are $0$'s)

That is, it has $1$'s on the main diagonal and $0$'s elsewhere.

It has the very special property that \[ I_n \, \mathbf{v} = \mathbf{v} \] for any $\mathbf{v} \in \mathbb{R}^n$. It plays the role of "1" in matrix-vector products.
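E.g., for $n = 2$, \[ I_2 \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \cdot x + 0 \cdot y \\ 0 \cdot x + 1 \cdot y \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix}. \]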

Whenever the dimension is clear from the context, we simply use $I = I_n$, and it is always assumed to be square.

Connections to linear functions

Matrices are nice containers for data (they look like spreadsheets).

But their real usefulness lies in their connection to linear functions.

As noted earlier, each $m \times n$ matrix $A$ defines a linear function $f : \mathbb{R}^n \to \mathbb{R}^m$ given by \[ f(\mathbf{x}) = A \mathbf{x}. \]
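E.g., the matrix below (entries chosen purely for illustration) defines the linear function $f : \mathbb{R}^2 \to \mathbb{R}^2$ with \[ f\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} 1 & -1 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x - y \\ 2y \end{bmatrix}. \]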

In mathematics, the terms function, transformation, and map have similar meanings and are often used in inconsistent ways. So you will often see people use "linear transformation" or "linear map" instead.

All linear functions

The converse is also true: For every linear function $f : \mathbb{R}^n \to \mathbb{R}^m$, there exists a unique $m \times n$ matrix $A$ such that \[ f(\mathbf{x}) = A \mathbf{x}. \]

This statement falls apart when we enter infinite-dimensional vector spaces.
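One way to see where such a matrix has to come from (a sketch of the idea, not the full proof): if $\mathbf{e}_j$ denotes the $j$-th standard basis vector in $\mathbb{R}^n$, then linearity forces the $j$-th column of $A$ to be $f(\mathbf{e}_j)$, so \[ A = \begin{bmatrix} f(\mathbf{e}_1) & \cdots & f(\mathbf{e}_n) \end{bmatrix}. \]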

Prove this statement. (Start with $n=2$ and $m=3$)

That is, every linear function is associated with a unique matrix, and every matrix is associated with a unique linear function. Mathematicians call such a special one-to-one correspondence a bijection.

There is no difference between working with linear functions (important in applications) and working with matrices (a finite representation).

Write down a matrix that defines the linear function that rotates the plane ($\mathbb{R}^2$) counterclockwise by 90 degrees.