How to encode images as vectors in a vector space.
Digital images may be encoded as vectors in certain vector spaces. We already know the basic ideas; in this lecture we will take a closer look at some of the technical details.
Digital images (uncompressed "bitmaps") are usually stored as a collection of "pixels" in computers.
Each pixel can be represented as a number or a series of numbers.
(This is not always the case: there are also "vector" graphics.)
In our homework, we have already explored the basic idea of representing pixels as vectors and images as even longer vectors.
On the level of individual pixels...
Of course, there are images that are actually a combination of color information together with brightness information. Such formats are common in scientific instruments.
For a grayscale image, each pixel is completely described by its "brightness", which can be represented by a single number. (Usually, a larger number means brighter and a smaller number means darker.)
Can we encode this $3 \times 3$ image as a vector in $\mathbb{R}^9$? (One number for each pixel)
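As a minimal sketch of this encoding, the snippet below flattens a $3 \times 3$ grayscale image into a list of 9 numbers, reading the pixels row by row. The pixel values and the helper names `image_to_vector` and `vector_to_image` are made up for illustration; reading column by column would work equally well, as long as the same order is used consistently.

```python
# A hypothetical 3x3 grayscale image (values invented for illustration):
# 0 is the darkest value, 255 the brightest.
image = [
    [0, 128, 255],
    [64, 192, 32],
    [255, 0, 16],
]


def image_to_vector(img):
    """Flatten a 2D grid of pixel values, row by row, into one list."""
    return [pixel for row in img for pixel in row]


def vector_to_image(vec, width):
    """Undo image_to_vector for an image of known width."""
    return [vec[i:i + width] for i in range(0, len(vec), width)]


v = image_to_vector(image)
print(len(v))                          # 9 entries, one per pixel
print(vector_to_image(v, 3) == image)  # True: no information is lost
```

The round trip shows that the vector in $\mathbb{R}^9$ carries exactly the same information as the image, provided the width is known.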
For images with color, each pixel must be represented by several numbers. In the most commonly used "RGB" scheme, each pixel is represented by 3 numbers: the Red, Green, and Blue components of the color.
Can we encode this $3 \times 3$ image as a vector in $\mathbb{R}^9$?
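One way to approach this question is simply to count entries. The sketch below flattens a hypothetical $3 \times 3$ RGB image (pixel values and the name `rgb_image_to_vector` are assumptions for illustration) and reports how many numbers are needed.

```python
# Hypothetical 3x3 RGB image; each pixel is an (R, G, B) triple in 0..255.
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)
rgb_image = [
    [RED, GREEN, BLUE],
    [GREEN, BLUE, RED],
    [BLUE, RED, GREEN],
]


def rgb_image_to_vector(img):
    """Flatten row by row, writing out the 3 channels of each pixel in turn."""
    return [channel for row in img for pixel in row for channel in pixel]


rgb_vector = rgb_image_to_vector(rgb_image)
print(len(rgb_vector))  # 27 = 3 rows * 3 columns * 3 channels
```

With three numbers per pixel, 9 entries are not enough: under this encoding the image lives in $\mathbb{R}^{27}$.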
Nearly all modern computers use a 24-bit color depth (8 bits for each of the 3 channels). In this scheme, each of the "R", "G", and "B" values ranges from 0 to 255, with 0 being the lowest intensity and 255 the highest ($255 = 2^8 - 1$). Combined, they can represent over 16 million colors ($2^{24} = 16{,}777{,}216$ distinct colors, to be exact).
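The arithmetic above can be checked directly. The sketch below also packs three 8-bit channel values into a single 24-bit integer; the helper names `pack_rgb` and `unpack_rgb` and the red-green-blue byte order (as in hex color codes like `#FF8000`) are assumptions chosen for illustration, not something fixed by the lecture.

```python
def pack_rgb(r, g, b):
    """Pack three 8-bit channel values into one 24-bit integer (order R, G, B)."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("channel values must lie in 0..255")
    return (r << 16) | (g << 8) | b


def unpack_rgb(value):
    """Recover the (R, G, B) triple from a packed 24-bit integer."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


print(2 ** 8 - 1)                  # 255, the largest 8-bit value
print(2 ** 24)                     # 16777216 distinct colors
print(hex(pack_rgb(255, 128, 0)))  # 0xff8000
print(unpack_rgb(0xFF8000))        # (255, 128, 0)
```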
Consider the three unit vectors \[ \mathbf{r} = \color{red}{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} } \;,\quad \mathbf{g} = \color{green}{ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} } \;,\quad \mathbf{b} = \color{blue}{ \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} } \] which represent a unit amount of red, green, and blue, respectively.
Any color can be represented as a linear combination of these three. E.g., \[ \frac{1}{2} \color{red} { \mathbf{r} } + \frac{1}{2} \color{blue}{ \mathbf{b} } = \frac{1}{2} \color{red}{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} } + \frac{1}{2} \color{blue}{ \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} } = \color{purple}{ \begin{bmatrix} \frac{1}{2} \\ 0 \\ \frac{1}{2} \end{bmatrix} } \]
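The same linear combination can be carried out numerically. This is a minimal sketch using plain Python lists as vectors; the helpers `scale` and `add` are hypothetical names for scalar multiplication and vector addition.

```python
def scale(c, v):
    """Scalar multiplication: multiply every entry of v by c."""
    return [c * x for x in v]


def add(u, v):
    """Vector addition, entry by entry."""
    return [a + b for a, b in zip(u, v)]


r = [1.0, 0.0, 0.0]  # unit amount of red
g = [0.0, 1.0, 0.0]  # unit amount of green
b = [0.0, 0.0, 1.0]  # unit amount of blue

purple = add(scale(0.5, r), scale(0.5, b))
print(purple)  # [0.5, 0.0, 0.5]
```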
This setup resembles $\mathbb{R}^3$.
Of course, the "RGB vector space" we just described does not match the reality of display technology exactly.
The entries in the vector \[ \begin{bmatrix} \color{red}{R} \\ \color{green}{G} \\ \color{blue}{B} \\ \end{bmatrix} \] can be arbitrarily large, but any physical display would have some limits on the output of light.
Entries can also be negative, but a vector such as \[ \begin{bmatrix} \color{red}{-0.5} \\ \color{green}{0.2} \\ \color{blue}{0.1} \\ \end{bmatrix} \] does not produce a "real" color.
How can we make sense of such a "virtual" color space?
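To make the mismatch concrete, the sketch below tests whether a vector lies in the displayable range. It assumes, purely for illustration, that intensities are normalized so that a display can show values in $[0, 1]$; the helpers `is_displayable` and `clamp` are hypothetical, and clamping is only one crude way to force a vector back into range, not a method prescribed by the lecture.

```python
def is_displayable(color, lo=0.0, hi=1.0):
    """True if every channel lies in the assumed displayable range [lo, hi]."""
    return all(lo <= c <= hi for c in color)


def clamp(color, lo=0.0, hi=1.0):
    """Force each channel into [lo, hi] by cutting off out-of-range values."""
    return [min(max(c, lo), hi) for c in color]


virtual = [-0.5, 0.2, 0.1]  # the vector with a negative red entry
too_bright = [3.0, 0.5, 0.0]  # an entry beyond the assumed display limit

print(is_displayable(virtual))     # False
print(is_displayable(too_bright))  # False
print(clamp(virtual))              # [0.0, 0.2, 0.1]
print(clamp(too_bright))           # [1.0, 0.5, 0.0]
```

Both vectors are perfectly valid elements of $\mathbb{R}^3$, yet neither corresponds to a color the assumed display can show.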
You should have already seen the concept of "HSV color space" in homework assignments or projects.
HSV (Hue, Saturation, Value), and its variations, is a more intuitive alternative to the RGB color representation.
The HSV color space is best visualized as a cylinder.
(Image source: Wikipedia and Michigan State University)
(Note the subtle difference between HSV and HSL)
In this setup, a color can also be represented by \[ \begin{bmatrix} H \\ S \\ V \end{bmatrix} \in \mathbb{R}^3 \] representing the Hue, Saturation, and Value, respectively.
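As a rough illustration, Python's standard `colorsys` module converts between the two representations. Note its conventions, which are an assumption relative to this lecture: all of $R$, $G$, $B$, $H$, $S$, $V$ are scaled to lie in $[0, 1]$, so the hue is multiplied by 360 below to read it as an angle in degrees.

```python
import colorsys

# Pure red in normalized RGB coordinates.
h, s, v = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
print((h, s, v))  # (0.0, 1.0, 1.0): hue 0, fully saturated, full value

# The purple from the linear-combination example: half red plus half blue.
h2, s2, v2 = colorsys.rgb_to_hsv(0.5, 0.0, 0.5)
hue_degrees = h2 * 360  # approximately 300 degrees on the color wheel
print(round(hue_degrees), s2, v2)
```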
Note that this does not match $\mathbb{R}^3$ exactly either: as the value of $H$ increases, we eventually come back to the same color.
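The wrap-around in $H$ can be seen numerically. The sketch below assumes the hue is measured as an angle in degrees, so that adding 360 completes one full turn; `hsv_degrees_to_rgb` is a hypothetical wrapper around the standard `colorsys.hsv_to_rgb`, which expects the hue scaled to $[0, 1]$.

```python
import colorsys


def hsv_degrees_to_rgb(hue_degrees, saturation, value):
    """Convert HSV to RGB, with the hue given as an angle in degrees."""
    wrapped = (hue_degrees % 360) / 360  # reduce the angle to one full turn
    return colorsys.hsv_to_rgb(wrapped, saturation, value)


color_a = hsv_degrees_to_rgb(30, 1.0, 1.0)
color_b = hsv_degrees_to_rgb(390, 1.0, 1.0)  # 30 degrees plus one full turn

print(color_a == color_b)  # True: two different H values, the same color
```

In $\mathbb{R}^3$ two vectors with different first entries are different vectors, so the identification of $H$ with $H + 360$ is exactly where the analogy breaks down.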