A product of an orthogonal and triangular matrix
The QR-decomposition is a decomposition of a matrix into the product of an orthogonal matrix and an upper triangular matrix.
A decomposition of a matrix is simply the process of expressing a matrix as the product of two or more matrices. E.g., \[ A = X Y \]
This is similar to the problem of factorizing integers or polynomials, which is why matrix decompositions are also known as matrix factorizations.
There are many useful decomposition methods, each with its own use case; they are crucial in numerical computations.
It may be difficult to appreciate the usefulness of these decompositions in the classroom. Once you see a large matrix arising from an application, you will understand why we need them.
QR-decomposition is a special way of decomposing a matrix that is particularly useful in solving linear least-squares problems. It is closely related to the Gram-Schmidt process that is used to create an orthonormal basis.
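As a quick concrete check, here is a minimal sketch using NumPy's built-in `numpy.linalg.qr` (the library and the example matrix are my own additions, not part of the text): it verifies that $Q$ has orthonormal columns, $R$ is upper triangular, and $QR$ reconstructs $A$.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# "Reduced" QR: Q is 3x2 with orthonormal columns, R is 2x2 upper triangular.
Q, R = np.linalg.qr(A)

print(np.allclose(Q.T @ Q, np.eye(2)))  # Q^T Q = I
print(np.allclose(R, np.triu(R)))       # R is upper triangular
print(np.allclose(Q @ R, A))            # the product reconstructs A
```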
Recall that $Q$, being orthogonal, must be nonsingular.
Also, $R$ is nonsingular if and only if $A$ is nonsingular.
It is inherently difficult to appreciate the value of the QR-decomposition (or matrix decompositions in general) in a classroom setting.
The real usefulness becomes apparent only when dealing with very large matrices (real-world applications) or with matrices of floating-point entries (numerical analysis).
If it is known that $A = QR$, then the problem of solving \[ A \mathbf{x} = \mathbf{b} \] can be turned into the problem \[ R \mathbf{x} = Q^\top \mathbf{b} \] since $Q^{-1} = Q^\top$.
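This reduction can be sketched in a few lines of NumPy (the back-substitution helper and the example system are illustrative assumptions, not from the text): factor $A = QR$, form $Q^\top \mathbf{b}$, and solve the triangular system by back substitution.

```python
import numpy as np

def back_substitution(R, y):
    """Solve R x = y for an upper-triangular, nonsingular R."""
    n = R.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])
b = np.array([4.0, 5.0, 6.0])

Q, R = np.linalg.qr(A)             # A = Q R with Q orthogonal
x = back_substitution(R, Q.T @ b)  # solve R x = Q^T b

print(np.allclose(A @ x, b))       # True: x solves the original system
```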
The true benefit is that orthogonal matrices are perfectly conditioned (their condition number is $1$), which is crucially important in numerical computations; a course in numerical analysis makes this precise.
Recall that orthogonal linear transformations preserve vector norms. That is, \[ \| Q \mathbf{x} \| = \| \mathbf{x} \| \] for any orthogonal $n \times n$ matrix $Q$ and any vector $\mathbf{x} \in \mathbb{R}^n$.
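A quick numerical sanity check of this norm-preservation property (the random matrices and seed are my own choices for illustration): build an orthogonal $Q$ from the QR-decomposition of a random square matrix and compare $\|Q\mathbf{x}\|$ with $\|\mathbf{x}\|$.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
Q, _ = np.linalg.qr(M)  # Q is orthogonal (M is square and, generically, nonsingular)

x = rng.standard_normal(4)
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # True
```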
This observation gives us a very nice way of solving linear least-squares problems.
Through QR-decomposition, general linear least-squares problems can be reduced to linear least-squares problems involving only upper triangular matrices, which are easy to solve.
Suppose \[ R = \begin{bmatrix} * & * & * \\ 0 & * & * \\ 0 & 0 & * \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \] where "$*$" are placeholders for potentially nonzero numbers whose values are not important.
This "$*$" notation is widely used in discussions of numerical computations so that we do not get distracted by concrete numbers.
How should we minimize \[ \left\| \begin{bmatrix} * & * & * \\ 0 & * & * \\ 0 & 0 & * \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} - \begin{bmatrix} * \\ * \\ * \\ * \\ * \end{bmatrix} \right\|^2 \] Note that the last two components of the residual do not depend on $x_1, x_2, x_3$ at all, so the best we can do is zero out the first three components, i.e., solve the top $3 \times 3$ triangular system by back substitution; the minimum value is the sum of squares of the last two components.
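The reduction described above can be sketched as follows (the random data, seed, and the comparison against NumPy's `lstsq` are my own illustrative choices): take a full QR-decomposition of a tall matrix, split $Q^\top \mathbf{b}$, and solve only the top triangular block.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))  # tall matrix, full column rank (generically)
b = rng.standard_normal(5)

Q, R = np.linalg.qr(A, mode="complete")  # Q is 5x5 orthogonal, R is 5x3
c = Q.T @ b

# Only the top 3x3 block of R is nonzero; the last two components of
# R x - Q^T b equal -c[3:] no matter what x is, so the minimizer
# solves the triangular system R[:3, :3] x = c[:3].
x = np.linalg.solve(R[:3, :3], c[:3])

# Agrees with NumPy's least-squares solver.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_ref))  # True
```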
The existence of the QR-decomposition of a matrix is equivalent to the possibility of carrying out the Gram-Schmidt process (see orthonormal basis for details).
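To make the connection explicit, here is a minimal sketch of classical Gram-Schmidt producing a QR-decomposition, assuming the columns of $A$ are linearly independent (in practice, modified Gram-Schmidt or Householder reflections are preferred for numerical stability; this naive version is only for illustration).

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt QR for a matrix with linearly independent columns."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]  # projection coefficient
            v -= R[i, j] * Q[:, i]       # remove component along q_i
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]            # normalize
    return Q, R

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q @ R, A))             # True: A = Q R
print(np.allclose(Q.T @ Q, np.eye(2)))   # True: orthonormal columns
```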