Welcome to the second post in our series on Linear Algebra for Machine Learning! After exploring vectors, scalars, and vector spaces, we’re now diving into matrices—powerful tools that represent data and transformations in machine learning (ML). In this post, we’ll cover the mathematical foundations of matrices, their role in ML as both data containers and transformation operators, and how to work with them in Python using NumPy and PyTorch. We’ll also visualize matrices and provide Python exercises to deepen your understanding.
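A matrix is a rectangular array of scalars arranged in rows and columns. A matrix $A$ with $m$ rows and $n$ columns (an $m \times n$ matrix) is written as:
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
Each element $a_{ij}$ is a scalar, where $i$ is the row index and $j$ is the column index. In ML, matrices are used to represent data and to transform it.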
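A matrix can represent structured data. For example, a grayscale image such as a 28x28 MNIST digit is a matrix of pixel intensities, with one entry per pixel.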
A matrix can act as a linear transformation, mapping vectors to new vectors. For an $m \times n$ matrix $A$ and a vector $\mathbf{x} \in \mathbb{R}^n$, the transformation is:
$$\mathbf{y} = A\mathbf{x}$$
where $\mathbf{y} \in \mathbb{R}^m$. Each element of $\mathbf{y}$ is a linear combination of $\mathbf{x}$’s components, weighted by the corresponding row of $A$. In ML, this is the core of linear layers in neural networks, where $A$ represents the weights.
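The row-by-row description above can be checked numerically. The following is a minimal sketch; the matrix `A` and vector `x` are arbitrary illustrative values, not taken from the post.

```python
import numpy as np

# Illustrative 2x3 matrix A and a vector x in R^3 (values chosen arbitrarily)
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
x = np.array([1.0, 0.0, -1.0])

# Built-in matrix-vector product: y has one entry per row of A
y = A @ x

# The same result computed element by element: y_i = sum_j A[i, j] * x[j]
y_manual = np.array([
    sum(A[i, j] * x[j] for j in range(A.shape[1]))
    for i in range(A.shape[0])
])

print("y =", y)
print("shapes:", A.shape, x.shape, y.shape)
```

Note how the shapes line up: a (2, 3) matrix maps a 3-component vector to a 2-component vector.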
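Geometrically, matrices can rotate, scale, or shear vectors. For example, a 2D rotation matrix by angle $\theta$:
$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
rotates a vector in $\mathbb{R}^2$ counterclockwise by $\theta$.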
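To make the three geometric effects concrete, here is a small sketch that applies a scaling, a shear, and a rotation to the same vector. The specific scale factors and the shear factor `k` are arbitrary choices for illustration.

```python
import numpy as np

v = np.array([1.0, 1.0])

# Scaling: stretch the x-component by 2, shrink the y-component by half
S = np.array([[2.0, 0.0],
              [0.0, 0.5]])

# Horizontal shear: shift x in proportion to y (k = 1 is an arbitrary choice)
k = 1.0
H = np.array([[1.0, k],
              [0.0, 1.0]])

# Rotation by 90 degrees counterclockwise
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

scaled = S @ v
sheared = H @ v
rotated = R @ v

print("scaled: ", scaled)
print("sheared:", sheared)
print("rotated:", np.round(rotated, 10))

# A rotation changes direction but not length
print("lengths:", np.linalg.norm(v), np.linalg.norm(rotated))
```

Of the three, only the rotation leaves the vector’s length unchanged.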
Matrices are central to ML: they hold data such as images, and they act as the weights of linear layers in neural networks.
Understanding matrices helps you manipulate data and design models efficiently.
Let’s explore matrices using NumPy and PyTorch, focusing on data representation (e.g., images) and transformations (e.g., rotations). We’ll also visualize matrices and their effects.
Install the required libraries if needed:
pip install numpy torch matplotlib torchvision
Let’s load an MNIST image and treat it as a matrix:
import numpy as np
import torch
import torchvision
import matplotlib.pyplot as plt
# Load MNIST dataset
mnist_dataset = torchvision.datasets.MNIST(
    root='./data',
    train=True,
    download=True,
    transform=torchvision.transforms.ToTensor()
)
# Get a single image
image, label = mnist_dataset[0]
# Convert to NumPy array (shape: 1, 28, 28 -> 28, 28)
image_matrix = image.squeeze().numpy()
# Print shape and sample elements
print("Image matrix shape:", image_matrix.shape)
print("Top-left 3x3 corner:\n", image_matrix[:3, :3])
# Visualize the image matrix
plt.figure(figsize=(5, 5))
plt.imshow(image_matrix, cmap='gray')
plt.title(f"MNIST Digit: {label}")
plt.colorbar(label='Pixel Intensity')
plt.show()
Output:
Image matrix shape: (28, 28)
Top-left 3x3 corner:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
This code loads an MNIST image as a matrix, prints its shape and a 3x3 corner, and visualizes it as a grayscale image.
Matrices can be reshaped to change their dimensions while preserving data. Let’s flatten the image into a vector:
# Flatten the image matrix into a vector
image_vector = image_matrix.flatten()
# Print new shape
print("Flattened vector shape:", image_vector.shape)
# Reshape back to 28x28
image_matrix_reshaped = image_vector.reshape(28, 28)
# Verify shapes match
print("Reshaped matrix shape:", image_matrix_reshaped.shape)
Output:
Flattened vector shape: (784,)
Reshaped matrix shape: (28, 28)
This demonstrates how images can be converted to vectors for ML models and reshaped back.
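Why does reshaping back recover the image? A short sketch on a small stand-in matrix (a hypothetical 3x4 array rather than a real image) shows that `flatten()` reads the matrix row by row, and that several flattened matrices can be stacked into a data matrix with one row per sample.

```python
import numpy as np

# A small stand-in for an image matrix (3 rows, 4 columns)
M = np.arange(12).reshape(3, 4)

# flatten() reads the matrix row by row (row-major order)
v = M.flatten()

# Element (i, j) of the matrix lands at position i * n_cols + j in the vector
i, j = 1, 2
n_cols = M.shape[1]
print(M[i, j], v[i * n_cols + j])

# Reshaping with the original shape recovers the matrix exactly
M_back = v.reshape(M.shape)
print(np.array_equal(M, M_back))

# Stacking flattened "images" gives a data matrix: one row per sample
batch = np.stack([M, M + 100])          # shape (2, 3, 4)
X = batch.reshape(batch.shape[0], -1)   # shape (2, 12)
print(X.shape)
```

The same pattern turns a batch of 28x28 images into rows of length 784.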
Let’s apply a rotation matrix to a 2D vector:
# Define a 2D vector
vector = np.array([1, 0])
# Define a 90-degree rotation matrix (pi/2 radians)
theta = np.pi / 2
rotation_matrix = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta), np.cos(theta)]
])
# Apply transformation
rotated_vector = rotation_matrix @ vector # Matrix-vector multiplication
# Print results
print("Original vector:", vector)
print("Rotation matrix:\n", rotation_matrix)
print("Rotated vector:", rotated_vector)
# Visualize original and rotated vectors
def plot_2d_vectors(vectors, labels, colors):
    plt.figure(figsize=(6, 6))
    origin = np.zeros(2)
    for vec, label, color in zip(vectors, labels, colors):
        plt.quiver(*origin, *vec, color=color, scale=1, scale_units='xy', angles='xy')
        plt.text(vec[0], vec[1], label, color=color, fontsize=12)
    plt.grid(True)
    plt.xlim(-2, 2)
    plt.ylim(-2, 2)
    plt.axhline(0, color='black', linewidth=0.5)
    plt.axvline(0, color='black', linewidth=0.5)
    plt.title("Vector Rotation")
    plt.show()

plot_2d_vectors(
    [vector, rotated_vector],
    ['Original', 'Rotated'],
    ['blue', 'red']
)
Output:
Original vector: [1 0]
Rotation matrix:
[[ 6.123234e-17 -1.000000e+00]
[ 1.000000e+00 6.123234e-17]]
Rotated vector: [ 6.123234e-17 1.000000e+00]
This rotates the vector $[1, 0]$ by 90 degrees, resulting in $[0, 1]$ (with small numerical errors: the $6.12 \times 10^{-17}$ entry is floating-point rounding of zero), and plots both vectors.
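Because of that rounding residue, comparing the result to the exact answer needs a tolerance. The sketch below repeats the rotation and contrasts an exact comparison with `np.allclose`; the variable names are illustrative.

```python
import numpy as np

theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rotated = R @ np.array([1.0, 0.0])

expected = np.array([0.0, 1.0])

# Exact comparison fails: cos(pi/2) evaluates to a tiny nonzero number, not 0
exact_match = np.array_equal(rotated, expected)

# Tolerance-based comparison treats the tiny residue as zero
close_match = np.allclose(rotated, expected)

print("exact match:", exact_match)
print("close match:", close_match)

# Rounding is convenient for display, but does not change the stored values
print(np.round(rotated, 10))
```

In practice, tolerance-based checks like this are the usual way to test floating-point results.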
Let’s perform the same rotation using PyTorch:
# Convert to PyTorch tensors
vector_torch = torch.tensor([1.0, 0.0])
rotation_matrix_torch = torch.tensor([
    [torch.cos(torch.tensor(np.pi/2)), -torch.sin(torch.tensor(np.pi/2))],
    [torch.sin(torch.tensor(np.pi/2)), torch.cos(torch.tensor(np.pi/2))]
])
# Matrix-vector multiplication
rotated_vector_torch = rotation_matrix_torch @ vector_torch
print("PyTorch rotated vector:", rotated_vector_torch)
Output:
PyTorch rotated vector: tensor([ 0., 1.])
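This confirms that PyTorch’s matrix operations align with NumPy’s. Because PyTorch tensors default to 32-bit floats, the first component may print as a tiny nonzero value rather than exactly 0, depending on your version and print settings.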
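One detail worth knowing when comparing the two libraries: NumPy defaults to 64-bit floats, while PyTorch tensors built from Python floats default to 32-bit. The sketch below uses NumPy alone to show how precision affects the near-zero value of $\cos(\pi/2)$; it assumes only that the reduced precision of `float32` is the source of any difference you see.

```python
import numpy as np

# The same angle stored at two precisions
theta64 = np.float64(np.pi / 2)
theta32 = np.float32(np.pi / 2)

c64 = np.cos(theta64)   # computed in float64
c32 = np.cos(theta32)   # computed in float32

print(c64.dtype, c64)
print(c32.dtype, c32)

# Both are "zero" for practical purposes, but the float32 residue is larger
print(np.isclose(c64, 0.0), np.isclose(c32, 0.0, atol=1e-6))
```

Neither value is exactly zero, so tolerance-based comparisons remain the safer choice in both libraries.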
Try these Python exercises to solidify your understanding. Solutions will be discussed in the next post!
In the next post, we’ll explore matrix arithmetic—addition, scaling, and multiplication—and their roles in ML, such as linear combinations and weighted sums. We’ll dive deeper into NumPy and PyTorch operations with more examples and exercises.
Happy learning, and see you in Part 3!