Matrix-matrix products

Multiplying matrices together

As a natural extension of the matrix-vector product, we now develop the rules for multiplying matrices together.

Review: Matrix-vector product

We saw how the matrix-vector product is defined. In particular, \[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \, \begin{bmatrix} x \\ y \end{bmatrix} \;=\; \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix}. \]

The resulting entries are exactly the dot products between the rows of the matrix and the vector, respectively.

\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \, \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 x + 2 y \\ 3 x + 4 y \end{bmatrix} \]

In general, for an $m \times n$ matrix and a column vector in $\mathbb{R}^n$ \[ \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} a_{11} x_1 + \cdots + a_{1n} x_n \\ \vdots \\ a_{m1} x_1 + \cdots + a_{mn} x_n \\ \end{bmatrix} \] The result is a column vector in $\mathbb{R}^m$.
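
For instance, with a $2 \times 3$ matrix and a vector in $\mathbb{R}^3$, \[ \begin{bmatrix} 1 & 0 & 2 \\ -1 & 3 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \cdot 3 + 0 \cdot 1 + 2 \cdot 2 \\ -1 \cdot 3 + 3 \cdot 1 + 1 \cdot 2 \end{bmatrix} = \begin{bmatrix} 7 \\ 2 \end{bmatrix}, \] a vector in $\mathbb{R}^2$, as expected.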

Review: Matrix-vector product as linear functions

For every linear function $f : \mathbb{R}^n \to \mathbb{R}^m$, there exists a unique $m \times n$ matrix $A$ such that \[ f(\mathbf{x}) = A \, \mathbf{x}. \]

Indeed, matrix-vector products are exactly the representations of linear functions between (finite-dimensional) vector spaces.

This is the main reason we define the matrix-vector product in this seemingly strange way. It is also why matrices are so useful.
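
For example, the linear function $f : \mathbb{R}^2 \to \mathbb{R}^2$ given by $f(x, y) = (2x - y, \; x + 3y)$ is represented by \[ A = \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix}, \qquad \text{since} \qquad \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2x - y \\ x + 3y \end{bmatrix}. \]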

What about matrix-matrix products?

Can we multiply two matrices together? Answer: Yes... but only when their dimensions match in a very special way.

Note that here we use the word dimension for rows $\times$ columns, which is not to be confused with the dimension of a vector space.

The definition of the product of two matrices appears complicated at first. It is much easier to explain if we write the matrices in terms of their columns and rows.

For a $4 \times 2$ matrix $A$ and a $2 \times 3$ matrix $B$, we can write $B = [ \mathbf{b}_1 \; \mathbf{b}_2 \; \mathbf{b}_3]$, where $\mathbf{b}_1,\mathbf{b}_2,\mathbf{b}_3$ are its three columns. Then \[ AB = A \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \mathbf{b}_3 \end{bmatrix} = \begin{bmatrix} A \mathbf{b}_1 & A \mathbf{b}_2 & A \mathbf{b}_3 \end{bmatrix}. \]

The product $AB$ is a $4 \times 3$ matrix.

That is, the matrix-matrix product $AB$ can be understood as the product of $A$ with each of the columns of $B$.
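
For instance, in the $4 \times 2$ times $2 \times 3$ case above, \[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 1 & 1 & 3 \\ 2 & 0 & 4 \end{bmatrix}, \] where each column of the result is $A$ applied to the corresponding column of $B$.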

General matrix-matrix product

In general, if $B$ has $\ell$ columns, and we write \[ B = \begin{bmatrix} \mathbf{b}_1 & \cdots & \mathbf{b}_\ell \end{bmatrix} \]

then the product $AB$ also has $\ell$ columns and \[ AB = \begin{bmatrix} A \mathbf{b}_1 & \cdots & A \mathbf{b}_\ell \end{bmatrix}. \]

We can see that for this definition to make sense, the number of columns that $A$ has must match the number of rows that $B$ has.

That is, we can only multiply matrices of dimensions \begin{align*} m \; &\times \; n &&\text{and} & n \; &\times \; \ell \end{align*} (the two numbers in the middle must match; otherwise the above definition is meaningless).

The product $AB$ will be an $m \times \ell$ matrix, i.e., it has the same number of rows as $A$ and the same number of columns as $B$.

Exercises

Following the interpretation \[ AB = A \begin{bmatrix} \mathbf{b}_1 & \cdots & \mathbf{b}_\ell \end{bmatrix} = \begin{bmatrix} A \mathbf{b}_1 & \cdots & A \mathbf{b}_\ell \end{bmatrix}, \] compute the following matrix-matrix products.

Compute the matrix-matrix product \[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 0 & 4 \\ \end{bmatrix}. \]
Compute \[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 0 & 3 & 1 & 0\\ 1 & 2 & 0 & 4 \end{bmatrix}. \]

Remember that \begin{align*} m \; &\times \; n &&\text{times} & n \; &\times \; \ell &&\longrightarrow & m \times \ell \end{align*}
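
To check your work, the first product can be computed column by column: \[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 6 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \end{bmatrix} = \begin{bmatrix} 9 \\ 19 \end{bmatrix}, \qquad \text{so} \qquad AB = \begin{bmatrix} 2 & 9 \\ 6 & 19 \end{bmatrix}. \]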

The dot product formula

Equivalently, we can also describe the entries of the product $AB$ as dot products between the rows of $A$ and the columns of $B$.

For example, if $A$ has 3 rows and $B$ has 2 columns, we can write \begin{align*} A &= \begin{bmatrix} \mathbf{a}_1^\top \\ \mathbf{a}_2^\top \\ \mathbf{a}_3^\top \end{bmatrix} &&\text{and} & B &= \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 \end{bmatrix} \end{align*} Here the $(\,)^\top$ simply emphasizes that the vectors are row vectors.

Assuming $AB$ is defined, it is a $3 \times 2$ matrix and \[ AB = \begin{bmatrix} \mathbf{a}_1 \cdot \mathbf{b}_1 & \mathbf{a}_1 \cdot \mathbf{b}_2 \\ \mathbf{a}_2 \cdot \mathbf{b}_1 & \mathbf{a}_2 \cdot \mathbf{b}_2 \\ \mathbf{a}_3 \cdot \mathbf{b}_1 & \mathbf{a}_3 \cdot \mathbf{b}_2 \\ \end{bmatrix} \]
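
For instance, in the first exercise above, the $(2,1)$ entry of the product is the dot product of the second row of the first matrix with the first column of the second: $\mathbf{a}_2 \cdot \mathbf{b}_1 = 3 \cdot 2 + 4 \cdot 0 = 6$, matching the column-by-column computation.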

Entry formula

More formally, we can also express the matrix-matrix product in terms of a formula for its individual entries.

Suppose $A$ is an $m \times n$ matrix and $B$ is an $n \times \ell$ matrix. Then the product $C = AB$ is an $m \times \ell$ matrix whose entries are given by \[ c_{ij} = \sum_{k=1}^n a_{ik} b_{kj}. \] Of course, this formula is completely equivalent to what we explained above. (It tells us nothing we don't already know.)
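
For readers who like to see the entry formula in code, here is a minimal sketch in plain Python, representing a matrix as a list of rows. (The function name matmul and this representation are illustrative choices, not part of the lecture.)

    def matmul(A, B):
        m, n = len(A), len(A[0])                  # A is m x n
        assert len(B) == n, "columns of A must equal rows of B"
        ell = len(B[0])                           # B is n x ell
        # c_ij = sum over k of a_ik * b_kj, exactly the entry formula
        return [[sum(A[i][k] * B[k][j] for k in range(n))
                 for j in range(ell)]
                for i in range(m)]

For instance, matmul([[1, 2], [3, 4]], [[2, 1], [0, 4]]) returns [[2, 9], [6, 19]], matching the first exercise above.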

Matching dimensions, again

It is worth emphasizing that for a matrix-matrix product to be defined, the dimensions of the matrices must match in a specific way: \[ (m \times n) \;\text{ times }\; (n \times \ell) \;\longrightarrow\; (m \times \ell). \]

That is, for the matrix $AB$ to make sense, the number of columns in $A$ must be exactly the number of rows in $B$.

For example, the product \[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \, \begin{bmatrix} 2 & 4 & 6 \\ 5 & 3 & 1 \end{bmatrix} \] is actually NOT defined: the first matrix has 3 columns, but the second has only 2 rows.

Product notation

The matrix-matrix product between $A$ and $B$ is written simply as \[ AB. \]

  • $A \cdot B$ is not wrong, but usually avoided;
  • $A \times B$ is never used.

Dot product as matrix-matrix product

If we consider row and column vectors as matrices (matrices having only one row or one column), then the dot product is simply a special case of the matrix-matrix product: \[ \begin{bmatrix} a & b & c \end{bmatrix} \, \begin{bmatrix} x \\ y \\ z \end{bmatrix} = ax + by + cz, \] which is exactly the dot product. Note that the vector on the left must be a row vector; otherwise this "matrix-matrix" product is not defined.

Therefore, the dot product between column vectors $\mathbf{u}$ and $\mathbf{v}$ can also be written as \[ \mathbf{u} \cdot \mathbf{v} = \mathbf{u}^\top \mathbf{v} \]

Matrix product with the identity matrix

It is easy to see that \[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \, \begin{bmatrix} a & b \\ c & d \end{bmatrix} \;=\; \begin{bmatrix} a & b \\ c & d \end{bmatrix} \] That is, multiplication with the identity matrix "does nothing".

In general, for an $m \times n$ matrix $A$, \[ I_m \, A = A \] (Here, $A$ does not have to be square, but $I_m$ is always square)

Similarly, for an $m \times n$ matrix $A$, \[ A \, I_n = A \]

Again, the identity matrix $I$ behaves just like the number 1. Indeed, it is exactly the mirror image of 1 in the world of matrices (and is called the unity).

Algebraic properties

It is straightforward to verify that for matrices $A,B,C$, \begin{align*} IA &= A & (A+B)C &= AC + BC & (AB)C &= A(BC) \\ AI &= A & A(B+C) &= AB + AC & (AB)^\top &= B^\top A^\top \end{align*} as long as their dimensions allow the products to be properly defined.
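
One way to see why the order reverses in the transpose rule: if $A$ is $m \times n$ and $B$ is $n \times \ell$, then $(AB)^\top$ is $\ell \times m$, and \begin{align*} \ell \; &\times \; n &&\text{times} & n \; &\times \; m &&\longrightarrow & \ell \times m \end{align*} shows that $B^\top A^\top$ is the only ordering of $B^\top$ and $A^\top$ whose dimensions even match.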

Similarly, for matrices $A,B$ and a vector $\mathbf{v}$, \begin{align*} A(B\mathbf{v}) &= (AB) \mathbf{v} & (\mathbf{v}^\top A)B &= \mathbf{v}^\top (AB) \end{align*} as long as their dimensions allow these products to be properly defined. (In particular, the vector $\mathbf{v}^\top$ in the second equation has to be a row vector.)

What about commutativity?

A notable missing property is commutativity.

In general, can we expect \[ A B = B A \] for two matrices $A$ and $B$? Why?
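
In fact, no. Even when both $AB$ and $BA$ are defined and have the same size, they are usually different. For example, \[ \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \qquad \text{but} \qquad \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}. \] (And when $A$ is $m \times n$ with $m \neq n$, the product $BA$ may not even be defined, or may have a different size than $AB$.)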

Matrix power

For a square matrix $A$, its product with itself is well defined, and naturally we use the notation \[ A^2 = A A. \]

Similarly, \[ A^3 = A A A. \]

In general, we use the notation \[ A^d = \overbrace{A \cdots A}^d \] for any positive integer $d$.
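
For example, with $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$, \[ A^2 = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}, \qquad A^3 = \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix}, \] and in general $A^d = \begin{bmatrix} 1 & d \\ 0 & 1 \end{bmatrix}$.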

Connection to functions

Recall that linear functions (a.k.a. transformations or maps) between (finite-dimensional) vector spaces are represented by matrices.

For example, suppose $f : \mathbb{R}^2 \to \mathbb{R}^2$ and $g : \mathbb{R}^2 \to \mathbb{R}^2$ are both linear functions. Then there are $2 \times 2$ matrices $A$ and $B$ such that \begin{align*} f(\mathbf{x}) &= A \mathbf{x} &&\text{and} & g(\mathbf{x}) &= B \mathbf{x}. \end{align*}

The function composition can then be represented by a matrix-matrix product, in the sense that \[ g(f(\mathbf{x})) = B(A\mathbf{x}) = (B A)\, \mathbf{x}. \]

That is, the product $BA$ is exactly the representation of the function composition $g \circ f$. Note that the order in which we write the factors is consistent with their ordering in the composition $g \circ f$, but opposite to the order in which the functions are applied.
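
For instance, taking the two matrices from the commutativity example above, $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$, the composition $g \circ f$ is represented by \[ BA = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}, \] while $f \circ g$ is represented by $AB = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$, a different matrix. The failure of commutativity simply reflects the fact that composing functions in a different order generally gives a different function.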