Welcome to the third post in our series on Linear Algebra for Machine Learning! Having covered vectors and matrices as data and transformations, we now turn to matrix arithmetic: addition, scaling, and multiplication. These operations are the building blocks of many machine learning (ML) algorithms, enabling linear combinations and weighted sums critical for models like neural networks. In this post, we’ll explore the mathematics behind these operations, their ML applications, and how to implement them in Python using NumPy and PyTorch. We’ll also include visualizations and Python exercises to reinforce your understanding.
Matrix addition combines two matrices of the same dimensions by adding corresponding elements. For two $m \times n$ matrices $A$ and $B$:

$$(A + B)_{ij} = A_{ij} + B_{ij}$$

For example, if:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}$$

then:

$$A + B = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix}$$

Addition is commutative ($A + B = B + A$) and associative ($(A + B) + C = A + (B + C)$).
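These properties are easy to verify numerically. Here is a quick NumPy sanity check (we'll use NumPy throughout; see the setup section below if it isn't installed). The third matrix D is an arbitrary choice of ours for the associativity check:

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
D = np.array([[0, 1], [1, 0]])  # arbitrary third matrix for the associativity check

print(np.array_equal(A + B, B + A))              # True: addition is commutative
print(np.array_equal((A + B) + D, A + (B + D)))  # True: addition is associative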
Scaling a matrix multiplies every element by a scalar. For a matrix $A$ and scalar $k$:

$$(kA)_{ij} = k \cdot A_{ij}$$

For example, if $k = 2$ and $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$:

$$kA = \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix}$$
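Scaling also distributes over addition, $k(A + B) = kA + kB$. A one-line check, reusing A and B from the snippet above:

k = 2
print(np.array_equal(k * (A + B), k * A + k * B))  # True: scaling distributes over addition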
Matrix multiplication combines two matrices to produce a new matrix, representing the composition of linear transformations or weighted sums. For an $m \times n$ matrix $A$ and an $n \times p$ matrix $B$, the product $C = AB$ is an $m \times p$ matrix where:

$$C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$$

Each element $C_{ij}$ is the dot product of the $i$-th row of $A$ and the $j$-th column of $B$. For example:

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 1 \cdot 5 + 2 \cdot 7 & 1 \cdot 6 + 2 \cdot 8 \\ 3 \cdot 5 + 4 \cdot 7 & 3 \cdot 6 + 4 \cdot 8 \end{bmatrix} = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}$$

Matrix multiplication is not commutative ($AB \neq BA$ in general) but is associative ($(AB)C = A(BC)$) and distributive over addition ($A(B + C) = AB + AC$).
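These properties can also be checked numerically. The sketch below uses small random matrices and np.allclose to tolerate floating-point round-off:

import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2, 2))
B = rng.random((2, 2))
C = rng.random((2, 2))

print(np.allclose(A @ B, B @ A))                # False (in general): not commutative
print(np.allclose((A @ B) @ C, A @ (B @ C)))    # True: associative
print(np.allclose(A @ (B + C), A @ B + A @ C))  # True: distributive over addition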
In ML, broadcasting allows operations on arrays of different shapes by automatically expanding dimensions. For example, adding a scalar to a matrix or a vector to each row/column leverages broadcasting to align shapes.
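As a small sketch (the values here are our own), the same vector can be broadcast across rows or, after reshaping, across columns:

import numpy as np

A = np.array([[1, 2], [3, 4]])
v = np.array([10, 20])

print(A + v)           # broadcast across rows: v is added to each row
print(A + v[:, None])  # v reshaped to a 2x1 column: added to each column instead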
Matrix arithmetic is fundamental to ML: matrix multiplication computes the weighted sums at the core of neural network layers, scaling and addition drive gradient-based parameter updates, and broadcasting makes adding a bias term to a whole batch a one-line operation. These operations underpin algorithms like linear regression, neural networks, and principal component analysis (PCA).
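To make that concrete, here is a minimal sketch of a dense neural-network layer's forward pass; X, W, and b are hypothetical names of ours, not from any particular library:

import numpy as np

rng = np.random.default_rng(42)
X = rng.random((4, 3))     # a hypothetical batch: 4 samples, 3 features
W = rng.random((3, 2))     # layer weights: 3 inputs -> 2 outputs
b = np.array([0.1, -0.1])  # bias vector, broadcast across all 4 samples

Y = X @ W + b              # one dense-layer forward pass: weighted sums plus bias
print(Y.shape)             # (4, 2)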
Let’s implement matrix addition, scaling, multiplication, and broadcasting using NumPy and PyTorch, with visualizations to illustrate their effects.
Install the required libraries if needed:
pip install numpy torch matplotlib
Let’s add two matrices:
import numpy as np
import matplotlib.pyplot as plt
# Define two 2x2 matrices
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])
# Matrix addition
C = A + B
# Print results
print("Matrix A:\n", A)
print("Matrix B:\n", B)
print("A + B:\n", C)
# Visualize matrices as heatmaps
plt.figure(figsize=(12, 4))
plt.subplot(1, 3, 1)
plt.imshow(A, cmap='viridis')
plt.title('Matrix A')
plt.colorbar()
plt.subplot(1, 3, 2)
plt.imshow(B, cmap='viridis')
plt.title('Matrix B')
plt.colorbar()
plt.subplot(1, 3, 3)
plt.imshow(C, cmap='viridis')
plt.title('A + B')
plt.colorbar()
plt.tight_layout()
plt.show()
Output:
Matrix A:
[[1 2]
[3 4]]
Matrix B:
[[5 6]
[7 8]]
A + B:
[[ 6 8]
[10 12]]
This code adds two matrices and visualizes them as heatmaps, showing how corresponding elements combine.
Let’s scale a matrix:
# Scale matrix A by 2
k = 2
scaled_A = k * A
# Print result
print("Scalar k:", k)
print("Scaled matrix (k * A):\n", scaled_A)
# Visualize
plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.imshow(A, cmap='viridis')
plt.title('Matrix A')
plt.colorbar()
plt.subplot(1, 2, 2)
plt.imshow(scaled_A, cmap='viridis')
plt.title('k * A')
plt.colorbar()
plt.tight_layout()
plt.show()
Output:
Scalar k: 2
Scaled matrix (k * A):
[[2 4]
[6 8]]
This scales matrix $A$ by 2, doubling each element, and visualizes the result.
Let’s multiply matrices:
# Matrix multiplication
AB = A @ B # or np.matmul(A, B)
# Print result
print("Matrix A:\n", A)
print("Matrix B:\n", B)
print("A @ B:\n", AB)
Output:
Matrix A:
[[1 2]
[3 4]]
Matrix B:
[[5 6]
[7 8]]
A @ B:
[[19 22]
[43 50]]
This computes $AB$ using NumPy's @ operator, showing the weighted sums.
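To see exactly where those sums come from, here is a deliberately naive re-implementation with explicit loops (matmul_naive is our own illustrative helper, not a NumPy function, and is far slower than @):

def matmul_naive(A, B):
    """Explicit-loop matrix multiply, exposing the weighted sums (for illustration only)."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must match"
    C = np.zeros((m, p))
    for i in range(m):          # i-th row of A
        for j in range(p):      # j-th column of B
            for k in range(n):  # accumulate the dot product of row i and column j
                C[i, j] += A[i, k] * B[k, j]
    return C

print(matmul_naive(A, B))  # matches A @ B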
Let’s add a vector to each row of a matrix using broadcasting:
# Define a vector
v = np.array([1, -1])
# Add vector to each row of A
A_plus_v = A + v # Broadcasting automatically expands v to match A's shape
# Print result
print("Vector v:", v)
print("Matrix A:\n", A)
print("A + v (broadcasted):\n", A_plus_v)
Output:
Vector v: [ 1 -1]
Matrix A:
[[1 2]
[3 4]]
A + v (broadcasted):
[[2 1]
[4 3]]
Broadcasting adds $v$ to each row of $A$, equivalent to adding the matrix $\begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix}$.
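If the automatic expansion feels opaque, np.tile lets you build that stacked matrix explicitly and confirm the equivalence:

v_stacked = np.tile(v, (2, 1))  # explicitly stack v as two rows: [[1, -1], [1, -1]]
print(np.array_equal(A + v, A + v_stacked))  # True: broadcasting == explicit stacking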
Let’s perform multiplication in PyTorch:
import torch
# Convert to PyTorch tensors
A_torch = torch.tensor(A, dtype=torch.float32)
B_torch = torch.tensor(B, dtype=torch.float32)
# Matrix multiplication
AB_torch = A_torch @ B_torch
# Print result
print("PyTorch A @ B:\n", AB_torch.numpy())
Output:
PyTorch A @ B:
[[19. 22.]
[43. 50.]]
This confirms PyTorch’s results match NumPy’s.
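You can also check the agreement programmatically rather than by eye:

print(np.allclose(AB, AB_torch.numpy()))  # True: NumPy and PyTorch agree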
Try these Python exercises to deepen your understanding. Solutions will be discussed in the next post!
In the next post, we’ll dive into dot products and cosine similarity, exploring their role in measuring similarity for tasks like word embeddings and recommendation systems. We’ll provide more Python examples and exercises to keep building your ML intuition.
Happy learning, and see you in Part 4!