Linear Algebra for Machine Learning, Part 11: Singular Value Decomposition (SVD)

2025-05-30 · Artintellica

Welcome to the eleventh post in our series on Linear Algebra for Machine Learning, continuing Part II: Core Theorems and Algorithms! After exploring eigenvalues and eigenvectors, we now dive into Singular Value Decomposition (SVD), a cornerstone for dimensionality reduction, noise filtering, and latent semantic analysis (LSA) in machine learning (ML). In this post, we’ll cover the mathematical foundations of SVD, its ML applications, and how to implement it in Python using NumPy and PyTorch. We’ll include a visual demo and Python exercises to solidify your understanding.


The Math: Singular Value Decomposition

Definition

For any $m \times n$ matrix $A$, the Singular Value Decomposition factors $A$ as:

$$A = U \Sigma V^T$$

where:

  • $U$ is an $m \times m$ orthogonal matrix whose columns (the left singular vectors) form an orthonormal basis for $\mathbb{R}^m$.
  • $\Sigma$ is an $m \times n$ diagonal matrix with non-negative singular values $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_r \geq 0$ (where $r = \min(m, n)$) on the diagonal, and zeros elsewhere.
  • $V$ is an $n \times n$ orthogonal matrix whose columns (the right singular vectors) form an orthonormal basis for $\mathbb{R}^n$.
  • $V^T$ is the transpose of $V$.

The singular values in $\Sigma$ capture the “importance” of each dimension, and the rank of $A$ is the number of non-zero singular values.
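These properties are easy to verify numerically. Below is a minimal NumPy sketch (the small matrix is chosen arbitrarily for illustration): the columns of $U$ and $V$ come back orthonormal, and counting the non-zero singular values recovers the rank.

import numpy as np

# A small example matrix (arbitrary, just for illustration)
M = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

U, S, Vt = np.linalg.svd(M)

# Orthonormal columns: U^T U = I and V^T V = I
print(np.allclose(U.T @ U, np.eye(U.shape[0])))    # True
print(np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))) # True

# The rank of M equals the number of non-zero singular values
print("Numerical rank:", int(np.sum(S > 1e-10)))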

Geometric Intuition

SVD interprets $A$ as a composition of three transformations:

  1. Rotation/Reflection: $V^T$ rotates or reflects the input space.
  2. Scaling: $\Sigma$ scales along the principal axes (by the singular values).
  3. Rotation/Reflection: $U$ rotates or reflects into the output space.

For example, if $A$ represents a dataset, SVD identifies the principal directions (via $U$ and $V$) and their magnitudes (via $\Sigma$).
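To see the three-step picture in action, the short sketch below (using an arbitrary $2 \times 2$ example matrix) applies $V^T$, then $\Sigma$, then $U$ to a vector $x$ and checks that the result equals $Ax$.

import numpy as np

B = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # arbitrary example matrix
x = np.array([1.0, -1.0])

U, S, Vt = np.linalg.svd(B)

step1 = Vt @ x      # 1. rotate/reflect the input
step2 = S * step1   # 2. scale along the principal axes
step3 = U @ step2   # 3. rotate/reflect into the output space

print(np.allclose(step3, B @ x))  # True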

Low-Rank Approximation

A rank-$k$ approximation of $A$ is:

$$A_k = U_k \Sigma_k V_k^T$$

where $U_k$ is $m \times k$, $\Sigma_k$ is $k \times k$, and $V_k$ is $n \times k$, built from the top $k$ singular values and vectors. Among all rank-$k$ matrices, $A_k$ minimizes the Frobenius norm $\|A - A_k\|_F$ (the Eckart–Young theorem), which makes it ideal for compression and noise reduction.
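The size of that minimum error is determined entirely by the discarded singular values:

$$\|A - A_k\|_F = \sqrt{\sigma_{k+1}^2 + \sigma_{k+2}^2 + \dots + \sigma_r^2}$$

so when the singular values decay quickly, a low-rank approximation loses very little of $A$.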


ML Context: Why SVD Matters

SVD is pivotal in ML:

  • Dimensionality Reduction: Low-rank approximations reduce data size while preserving structure, used in PCA and recommender systems.
  • Noise Filtering: Truncating small singular values removes noise, enhancing signal in images or audio.
  • Latent Semantic Analysis (LSA): In NLP, SVD uncovers latent topics in document-term matrices.
  • Matrix Completion: SVD helps impute missing data in collaborative filtering.

Mastering SVD enables efficient data processing and robust model design.
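As a concrete instance of the dimensionality-reduction use case above, PCA can be carried out via the SVD of a centered data matrix. The following sketch uses synthetic data and illustrative variable names:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))       # synthetic data: 100 samples, 5 features

X_centered = X - X.mean(axis=0)     # center each feature
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

k = 2
X_reduced = X_centered @ Vt[:k].T   # project onto the top-k right singular vectors
explained = (S[:k] ** 2).sum() / (S ** 2).sum()

print("Reduced shape:", X_reduced.shape)                 # (100, 2)
print("Fraction of variance retained:", round(explained, 3))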


Python Code: Singular Value Decomposition

Let’s compute SVD, create a low-rank approximation, and visualize its effects using NumPy and PyTorch.

Setup

Install the required libraries if needed:

pip install numpy torch matplotlib

Computing SVD

Let’s compute SVD for a matrix:

import numpy as np

# Create a 4x3 matrix (each row is the previous row plus 3, so A has rank 2)
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12]
])

# Compute SVD
U, S, Vt = np.linalg.svd(A, full_matrices=True)

# Reconstruct Sigma as a 4x3 diagonal matrix
Sigma = np.zeros((A.shape[0], A.shape[1]))
np.fill_diagonal(Sigma, S)

# Reconstruct A
A_reconstructed = U @ Sigma @ Vt

# Print results
print("Matrix A (4x3):\n", A)
print("\nSingular values:", S)
print("\nU (4x4):\n", U)
print("\nSigma (4x3):\n", Sigma)
print("\nVt (3x3):\n", Vt)
print("\nReconstructed A:\n", A_reconstructed)
print("\nReconstruction matches original?", np.allclose(A, A_reconstructed))

Output:

Matrix A (4x3):
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

Singular values: [25.4624  1.2907  0.    ]

U (4x4):
 [[-0.1408777   0.8247362  -0.53665631 -0.10482848]
 [-0.34394629  0.42693249  0.80218019  0.25881905]
 [-0.54701488  0.02912877 -0.25881905  0.78801075]
 [-0.75008347 -0.36876343 -0.00670483 -0.54199932]]

Sigma (4x3):
 [[25.4624  0.     0.    ]
 [ 0.     1.291   0.    ]
 [ 0.     0.     0.    ]
 [ 0.     0.     0.    ]]

Vt (3x3):
 [[-0.50453315 -0.5745155  -0.64449785]
 [ 0.76077568  0.05714052 -0.64649463]
 [ 0.40824829 -0.81649658  0.40824829]]

Reconstructed A:
 [[ 1.  2.  3.]
 [ 4.  5.  6.]
 [ 7.  8.  9.]
 [10. 11. 12.]]

Reconstruction matches original? True

This computes the SVD, reconstructs $A$, and verifies the decomposition. Note that $A$ has rank 2 (each row differs from the previous one by the constant vector [3, 3, 3]), so the third singular value is zero up to floating-point error.

Low-Rank Approximation

Let’s create a rank-1 approximation:

# Rank-1 approximation
k = 1
U_k = U[:, :k]
Sigma_k = np.diag(S[:k])
Vt_k = Vt[:k, :]
A_rank1 = U_k @ Sigma_k @ Vt_k

# Compute Frobenius norm of difference
diff_norm = np.linalg.norm(A - A_rank1, 'fro')

# Print results (rounded for readability)
print("Rank-1 approximation A_rank1:\n", np.round(A_rank1, 2))
print("\nFrobenius norm of difference (||A - A_rank1||_F):", np.round(diff_norm, 4))

Output:

Rank-1 approximation A_rank1:
 [[ 1.81  2.06  2.31]
 [ 4.42  5.03  5.64]
 [ 7.03  8.    8.98]
 [ 9.64 10.97 12.31]]

Frobenius norm of difference (||A - A_rank1||_F): 1.2907

This approximates $A$ using only the top singular value and singular vectors. The error $\|A - A_1\|_F \approx 1.29$ equals $\sigma_2$, the largest discarded singular value, exactly as the truncation-error formula predicts.
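To tie this back to that formula, a one-line check (reusing S and diff_norm from the snippets above) confirms that the rank-1 error matches $\sigma_2$:

# The rank-1 error should equal sigma_2 (sigma_3 is numerically zero)
print(np.isclose(diff_norm, S[1]))  # True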

Visualization

Let’s visualize the matrices:

import matplotlib.pyplot as plt

# Consistent color scale
vmin = min(A.min(), A_rank1.min())
vmax = max(A.max(), A_rank1.max())

plt.figure(figsize=(12, 4))
plt.subplot(1, 3, 1)
plt.imshow(A, cmap='viridis', vmin=vmin, vmax=vmax)
plt.title('Original A')
plt.colorbar()

plt.subplot(1, 3, 2)
plt.imshow(A_rank1, cmap='viridis', vmin=vmin, vmax=vmax)
plt.title('Rank-1 Approximation')
plt.colorbar()

plt.subplot(1, 3, 3)
plt.imshow(A - A_rank1, cmap='plasma')
plt.title('Difference')
plt.colorbar()
plt.tight_layout()
plt.show()

This plots the original matrix, rank-1 approximation, and their difference, showing compression effects.

PyTorch: SVD

Let’s compute SVD in PyTorch:

import torch

# Convert to PyTorch tensor
A_torch = torch.tensor(A, dtype=torch.float32)

# Compute SVD (torch.linalg.svd returns U, S, Vh, matching NumPy's convention;
# the older torch.svd, which returns V rather than Vh, is deprecated)
U_torch, S_torch, Vh_torch = torch.linalg.svd(A_torch)

# Print results
print("PyTorch singular values:", np.round(S_torch.numpy(), 4))

Output:

PyTorch singular values: [25.4624  1.2907  0.    ]

Up to floating-point precision, these match NumPy’s singular values; the third value is again numerically zero.
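As a further check, here is a minimal sketch (reusing A_torch and the factors U_torch, S_torch, Vh_torch from the torch.linalg.svd call above) that rebuilds the matrix from its PyTorch factors:

# Rebuild A from U, S, Vh: pad the singular values into a 4x3 diagonal matrix
Sigma_torch = torch.zeros_like(A_torch)
Sigma_torch[:S_torch.shape[0], :S_torch.shape[0]] = torch.diag(S_torch)

A_rebuilt = U_torch @ Sigma_torch @ Vh_torch
print("Reconstruction matches?", torch.allclose(A_rebuilt, A_torch, atol=1e-5))  # True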


Exercises

Try these Python exercises to deepen your understanding. Solutions will be discussed in the next post!

  1. SVD Computation: Create a $3 \times 4$ matrix with random integers between -5 and 5. Compute its SVD using NumPy and reconstruct the matrix.
  2. Rank-k Approximation: For the matrix in Exercise 1, compute a rank-2 approximation and calculate the Frobenius norm of the difference.
  3. PyTorch SVD: Convert the matrix from Exercise 1 to a PyTorch tensor, compute its SVD, and verify the singular values match NumPy’s.
  4. Image Compression: Load a grayscale image (or create a small matrix), compute its SVD, and reconstruct a rank-10 approximation. Visualize the original and compressed images.
  5. Noise Filtering: Add random noise to a $5 \times 5$ matrix, compute its SVD, and use a rank-3 approximation to filter noise. Compare with the original.
  6. LSA Simulation: Create a $4 \times 3$ document-term matrix, compute its SVD, and interpret the top singular vectors as latent topics.

What’s Next?

In the next post, we’ll explore positive definite matrices, key for covariance, kernels, and optimization. We’ll provide more Python code and exercises to continue building your ML expertise.

Happy learning, and see you in Part 12!


