To illustrate this, we'll visualize how SVD works for \( 2 \times 2 \) matrices. Of course this is just a special case, but it contains all the basic ideas that you need to understand SVD in other dimensions.
A lot has been written about visualizing three dimensions and beyond.
But let's talk for a moment about visualizing 2D space!
The image at left shows two things: (1) a grid representing a plane, and (2) a coin, representing the unit disk. We can use this simple visualization as a guide for understanding how matrices transform space.
Try changing the numbers in the matrix to see the effect on the plane.
Or drag the red and orange vectors in the plane to create a new transformation, and see what matrix it defines.
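The same idea can be checked numerically. Here is a minimal sketch using NumPy (the matrix \(A\) is an arbitrary example): a matrix maps the unit circle to an ellipse, and the longest and shortest radii of that ellipse turn out to be the largest and smallest singular values.

```python
import numpy as np

# An arbitrary 2x2 matrix, chosen just for illustration.
A = np.array([[1.0, 0.5],
              [0.0, 1.2]])

# Sample points on the unit circle (the boundary of the unit disk).
theta = np.linspace(0, 2 * np.pi, 200)
circle = np.stack([np.cos(theta), np.sin(theta)])  # shape (2, 200)

# Applying A maps the circle to an ellipse.
ellipse = A @ circle

# The extreme radii of the ellipse approximate the extreme
# singular values of A (approximate, since we only sampled
# 200 points on the circle).
radii = np.linalg.norm(ellipse, axis=0)
print(radii.max(), radii.min())                 # compare with:
print(np.linalg.svd(A, compute_uv=False))
```

Try editing the entries of `A`: any invertible choice still sends the circle to an ellipse, which is exactly the "squeezing or stretching" picture the visualization is driving at.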
The singular value decomposition is a way of formalizing this intuition: every matrix can be described as a combination of a few simple operations.
Theorem. Let \(A \) be any real matrix. Then there are orthogonal matrices \(U \) and \(V \), and a diagonal matrix \( \Sigma \), such that \( A = U \Sigma V^t\).
In some ways, the brevity of this theorem just makes it more confusing. So let's expand on it a bit.
1. An informal way to read this is: every linear transformation can be represented by an orthogonal transformation, followed by an axis-aligned squeezing or stretching, followed by another orthogonal transformation.
2. Even though we've been visualizing \( 2 \times 2 \) matrices, the theorem applies to real matrices of any size, even matrices that aren't square. (In fact it's still true for complex matrices, with just a tiny variation: the transpose becomes the conjugate transpose.)
3. You might wonder why we write \( V^t \) (the transpose of \(V \)) instead of just \( V \). Since the transpose of an orthogonal matrix is also orthogonal, the transpose isn't strictly necessary. But when you use the SVD in practice, it turns out to simplify a few things.
4. The entries of the diagonal matrix are the "singular values" of the matrix, and are often written with a lower-case sigma: \( \sigma_1, \sigma_2, \ldots \). They tell you how much the matrix stretches or squeezes space. By convention, each \( \sigma_i \geq 0 \), and they are written in descending order, so that \( \sigma_i \geq \sigma_{i+1} \).
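All of the claims above can be verified numerically. A minimal sketch with NumPy, using an arbitrary non-square example matrix:

```python
import numpy as np

# An arbitrary non-square example, to show the theorem isn't
# limited to 2x2 (or even square) matrices.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0]])

U, s, Vt = np.linalg.svd(A)   # NumPy returns V^t directly

# U and V are orthogonal: each transpose is its inverse.
assert np.allclose(U @ U.T, np.eye(U.shape[0]))
assert np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))

# The singular values are nonnegative and in descending order.
assert np.all(s >= 0) and np.all(s[:-1] >= s[1:])

# Rebuild the (generally rectangular) diagonal matrix Sigma,
# and check that the product recovers A exactly.
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, s)
assert np.allclose(U @ Sigma @ Vt, A)
```

Note that `np.linalg.svd` returns the singular values as a flat vector `s`, so \( \Sigma \) has to be rebuilt with the same shape as \(A\) before multiplying the factors back together.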
[Interactive figure: the matrix shown as the product \(U \Sigma V^t\), with the factors \(U\), \( \Sigma \), and \(V\) displayed separately. The diagonal entries of \( \Sigma \) are the singular values.]