Welcome to the sixth post in our series on Linear Algebra for Machine Learning! After exploring linear independence and span, we now turn to norms and distances, essential tools for quantifying vector magnitudes and differences between data points. These concepts are critical in machine learning (ML) for loss functions, regularization, and gradient scaling. In this post, we’ll cover their mathematical foundations, their applications in ML, and how to implement them in Python using NumPy and PyTorch. We’ll include visualizations and Python exercises to deepen your understanding.
A norm measures the “size” or “length” of a vector. For a vector $\mathbf{v} = (v_1, v_2, \dots, v_n) \in \mathbb{R}^n$, common norms include:
L1 Norm (Manhattan norm): $\|\mathbf{v}\|_1 = \sum_{i=1}^{n} |v_i|$
This sums the absolute values of the components; in ML, penalizing it encourages sparsity.
L2 Norm (Euclidean norm): $\|\mathbf{v}\|_2 = \sqrt{\sum_{i=1}^{n} v_i^2}$
This measures the straight-line distance from the origin, widely used in loss functions.
L∞ Norm (Maximum norm): $\|\mathbf{v}\|_\infty = \max_{i} |v_i|$
This takes the largest absolute component, useful in robustness analysis.
Norms satisfy properties like non-negativity ($\|\mathbf{v}\| \ge 0$, with $\|\mathbf{v}\| = 0$ only for $\mathbf{v} = \mathbf{0}$), scalability ($\|\alpha \mathbf{v}\| = |\alpha| \, \|\mathbf{v}\|$), and the triangle inequality ($\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$).
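Here's a quick numerical sanity check of two of these properties for the L2 norm; a minimal NumPy sketch (the random vectors and the scalar alpha = -2.5 are arbitrary choices for illustration):
import numpy as np
rng = np.random.default_rng(0)
u, v = rng.normal(size=3), rng.normal(size=3)
alpha = -2.5
# Scalability: ||alpha * v|| equals |alpha| * ||v||
print(np.isclose(np.linalg.norm(alpha * v), abs(alpha) * np.linalg.norm(v)))  # True
# Triangle inequality: ||u + v|| <= ||u|| + ||v||
print(np.linalg.norm(u + v) <= np.linalg.norm(u) + np.linalg.norm(v))  # True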
A distance measures how far apart two vectors are. For vectors $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$, the distance is typically the norm of their difference:
L1 Distance: $d_1(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|_1 = \sum_{i=1}^{n} |u_i - v_i|$
L2 Distance (Euclidean distance): $d_2(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|_2 = \sqrt{\sum_{i=1}^{n} (u_i - v_i)^2}$
L∞ Distance: $d_\infty(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|_\infty = \max_{i} |u_i - v_i|$
Distances are non-negative, symmetric ($d(\mathbf{u}, \mathbf{v}) = d(\mathbf{v}, \mathbf{u})$), and satisfy the triangle inequality ($d(\mathbf{u}, \mathbf{w}) \le d(\mathbf{u}, \mathbf{v}) + d(\mathbf{v}, \mathbf{w})$).
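For instance, with $\mathbf{u} = (2, 0)$ and $\mathbf{v} = (0, 1)$, the difference is $\mathbf{u} - \mathbf{v} = (2, -1)$, so $d_1 = |2| + |-1| = 3$, $d_2 = \sqrt{2^2 + (-1)^2} = \sqrt{5} \approx 2.24$, and $d_\infty = \max(2, 1) = 2$. Note how the three distances score the same pair of points differently.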
Norms and distances are foundational in ML:
Loss functions: mean squared error is a scaled squared L2 distance between predictions and targets, while mean absolute error is a scaled L1 distance.
Regularization: an L1 penalty (Lasso) pushes weights to exact zeros, encouraging sparsity, while an L2 penalty (Ridge, weight decay) shrinks weights smoothly; see the sketch below.
Gradient scaling: gradient clipping caps the L2 norm of the gradient to stabilize training.
Similarity search: k-nearest neighbors and clustering methods rank data points by distance.
Mastering these concepts helps you design robust ML models and evaluate their performance.
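To make the regularization point concrete, here's a small illustrative sketch; the weight vector w and the penalty strength lam are made-up values, not taken from any real model:
import numpy as np
w = np.array([0.5, -1.2, 0.0, 3.0])  # hypothetical model weights
lam = 0.01                           # hypothetical regularization strength
l1_penalty = lam * np.sum(np.abs(w))  # Lasso-style penalty, favors exact zeros
l2_penalty = lam * np.sum(w ** 2)     # Ridge-style penalty, shrinks smoothly
print("L1 penalty:", l1_penalty)  # 0.01 * 4.7  = 0.047
print("L2 penalty:", l2_penalty)  # 0.01 * 10.69 = 0.1069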
Let’s compute L1, L2, and L∞ norms and distances using NumPy and PyTorch, with visualizations to illustrate their geometric interpretations.
Install the required libraries if needed:
pip install numpy torch matplotlib
Let’s compute norms for a vector:
import numpy as np
import matplotlib.pyplot as plt
# Define a vector
v = np.array([3, 4])
# Compute norms
l1_norm = np.sum(np.abs(v)) # L1 norm
l2_norm = np.linalg.norm(v) # L2 norm
linf_norm = np.max(np.abs(v)) # L∞ norm
# Print results
print("Vector v:", v)
print("L1 norm:", l1_norm)
print("L2 norm:", l2_norm)
print("L∞ norm:", linf_norm)
# Visualize vector
plt.figure(figsize=(6, 6))
plt.quiver(0, 0, v[0], v[1], color='blue', scale=1, scale_units='xy', angles='xy')
plt.text(v[0], v[1], 'v', color='blue', fontsize=12)
plt.grid(True)
plt.xlim(-5, 5)
plt.ylim(-5, 5)
plt.axhline(0, color='black', linewidth=0.5)
plt.axvline(0, color='black', linewidth=0.5)
plt.xlabel('X')
plt.ylabel('Y')
plt.title("Vector for Norms")
plt.show()
Output:
Vector v: [3 4]
L1 norm: 7
L2 norm: 5.0
L∞ norm: 4
This computes the L1 ($|3| + |4| = 7$), L2 ($\sqrt{3^2 + 4^2} = 5$), and L∞ ($\max(|3|, |4|) = 4$) norms for $\mathbf{v} = (3, 4)$, and plots the vector.
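For more geometric intuition, you can also plot each norm's “unit ball”, the set of points $\mathbf{x}$ with $\|\mathbf{x}\| = 1$. Here's a minimal Matplotlib sketch showing the three shapes (a diamond for L1, a circle for L2, a square for L∞):
import numpy as np
import matplotlib.pyplot as plt
theta = np.linspace(0, 2 * np.pi, 400)
points = np.stack([np.cos(theta), np.sin(theta)])  # directions around the circle
fig, ax = plt.subplots(figsize=(6, 6))
for p, label in [(1, 'L1'), (2, 'L2'), (np.inf, 'L∞')]:
    # Scale each direction so its p-norm equals 1
    norms = np.linalg.norm(points, ord=p, axis=0)
    ax.plot(points[0] / norms, points[1] / norms, label=f'{label} unit ball')
ax.set_aspect('equal')
ax.grid(True)
ax.legend()
ax.set_title('Unit balls of the L1, L2, and L∞ norms')
plt.show()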
Let’s compute distances between two vectors:
# Define another vector
u = np.array([1, 1])
# Compute distances
l1_dist = np.sum(np.abs(u - v))
l2_dist = np.linalg.norm(u - v)
linf_dist = np.max(np.abs(u - v))
# Print results
print("Vector u:", u)
print("Vector v:", v)
print("L1 distance:", l1_dist)
print("L2 distance:", l2_dist)
print("L∞ distance:", linf_dist)
# Visualize vectors and distance
plt.figure(figsize=(6, 6))
plt.quiver(0, 0, u[0], u[1], color='red', scale=1, scale_units='xy', angles='xy')
plt.quiver(0, 0, v[0], v[1], color='blue', scale=1, scale_units='xy', angles='xy')
plt.plot([u[0], v[0]], [u[1], v[1]], 'g--', label='L2 distance')
plt.text(u[0], u[1], 'u', color='red', fontsize=12)
plt.text(v[0], v[1], 'v', color='blue', fontsize=12)
plt.grid(True)
plt.xlim(-1, 5)
plt.ylim(-1, 5)
plt.axhline(0, color='black', linewidth=0.5)
plt.axvline(0, color='black', linewidth=0.5)
plt.xlabel('X')
plt.ylabel('Y')
plt.title("Vectors and L2 Distance")
plt.legend()
plt.show()
Output:
Vector u: [1 1]
Vector v: [3 4]
L1 distance: 5
L2 distance: 3.605551275463989
L∞ distance: 3
This computes the L1 ($|1 - 3| + |1 - 4| = 5$), L2 ($\sqrt{(1-3)^2 + (1-4)^2} = \sqrt{13} \approx 3.61$), and L∞ ($\max(|1-3|, |1-4|) = 3$) distances between $\mathbf{u} = (1, 1)$ and $\mathbf{v} = (3, 4)$. The plot shows both vectors and a dashed line representing the L2 distance.
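In practice you often need distances between many points at once, e.g., for k-nearest neighbors. Here's a minimal broadcasting sketch; the small data matrix X is made up for illustration:
import numpy as np
# Pairwise L2 distances between all rows of X via broadcasting
X = np.array([[1.0, 1.0], [3.0, 4.0], [0.0, 2.0]])  # 3 points in 2D (hypothetical)
diff = X[:, None, :] - X[None, :, :]                # shape (3, 3, 2)
pairwise_l2 = np.linalg.norm(diff, axis=-1)         # shape (3, 3), entry (i, j) = d2(X[i], X[j])
print(pairwise_l2)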
Let’s compute norms and distances in PyTorch:
import torch
# Convert to PyTorch tensors
u_torch = torch.tensor(u, dtype=torch.float32)
v_torch = torch.tensor(v, dtype=torch.float32)
# Compute norms for v
l1_norm_torch = torch.sum(torch.abs(v_torch))    # L1 norm
l2_norm_torch = torch.norm(v_torch)              # L2 norm (default p=2)
linf_norm_torch = torch.max(torch.abs(v_torch))  # L∞ norm
# Compute distances
l1_dist_torch = torch.sum(torch.abs(u_torch - v_torch))    # L1 distance
l2_dist_torch = torch.norm(u_torch - v_torch)              # L2 distance
linf_dist_torch = torch.max(torch.abs(u_torch - v_torch))  # L∞ distance
# Print results
print("PyTorch L1 norm (v):", l1_norm_torch.item())
print("PyTorch L2 norm (v):", l2_norm_torch.item())
print("PyTorch L∞ norm (v):", linf_norm_torch.item())
print("PyTorch L1 distance:", l1_dist_torch.item())
print("PyTorch L2 distance:", l2_dist_torch.item())
print("PyTorch L∞ distance:", linf_dist_torch.item())
Output:
PyTorch L1 norm (v): 7.0
PyTorch L2 norm (v): 5.0
PyTorch L∞ norm (v): 4.0
PyTorch L1 distance: 5.0
PyTorch L2 distance: 3.605551242828369
PyTorch L∞ distance: 3.0
This confirms that PyTorch’s results match NumPy’s up to minor floating-point differences: the tensors here are float32, while NumPy computed in float64, hence the slightly different L2 distance.
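Since the intro mentioned gradient scaling: norms are exactly what gradient clipping uses. Here's a minimal sketch; the tiny linear model, the made-up loss, and the max_norm value are arbitrary choices for illustration:
import torch
model = torch.nn.Linear(2, 1)    # tiny model for illustration
x = torch.tensor([[3.0, 4.0]])
loss = model(x).pow(2).mean()    # made-up loss
loss.backward()
# Rescale gradients so their total L2 norm is at most max_norm
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
print("Gradient L2 norm before clipping:", total_norm.item())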
Try these Python exercises to deepen your understanding. Solutions will be discussed in the next post!
In the next post, we’ll explore orthogonality and projections, key for error decomposition, PCA, and embeddings. We’ll provide more Python code and exercises to continue building your ML expertise.
Happy learning, and see you in Part 7!