Welcome back to our series on linear algebra for machine learning! In this post, we’re diving into Tensors and Higher-Order Generalizations, extending the concepts of vectors and matrices to multi-dimensional arrays that power deep learning, natural language processing (NLP), and computer vision. Tensors are the fundamental data structures in frameworks like PyTorch, enabling efficient computation through vectorization and broadcasting. Whether you’re processing images, text embeddings, or time series data, understanding tensors is essential. Let’s explore the math, intuition, and implementation with Python code using PyTorch, visualizations, and hands-on exercises.
A tensor is a multi-dimensional array that generalizes scalars (0D), vectors (1D), and matrices (2D) to higher dimensions. In machine learning, tensors represent data and model parameters in a unified way. For instance:
- A color image is naturally a 3D tensor (height × width × channels), and a batch of images a 4D tensor.
- A batch of token embeddings in NLP is a 3D tensor (batch × sequence length × embedding dimension).
- The weights of a fully connected layer form a 2D tensor (a matrix), while convolutional filters are typically 4D tensors.
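To make these shapes concrete, here is a minimal sketch; the specific sizes (a batch of 32 images, 768-dimensional embeddings, and so on) are illustrative assumptions rather than fixed conventions.
import torch
# Hypothetical shapes, chosen only to illustrate common tensor layouts in ML.
image_batch = torch.zeros(32, 3, 224, 224)    # 4D: (batch, channels, height, width)
token_embeddings = torch.zeros(16, 128, 768)  # 3D: (batch, sequence length, embedding dim)
time_series = torch.zeros(64, 100, 8)         # 3D: (batch, time steps, features)
print(image_batch.shape, token_embeddings.shape, time_series.shape)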
Mathematically, a tensor is an object in a multi-linear space, but in practice, we often think of it as a container for numerical data with multiple axes. Operations on tensors—like addition, multiplication, and reshaping—extend the linear algebra operations we’ve covered for vectors and matrices.
The shape of a tensor specifies the size along each dimension (e.g., (3, 4, 5) for a 3D tensor), and the rank is the number of dimensions (e.g., rank 3). Tensors are the backbone of data representation in deep learning frameworks, where efficiency is achieved through vectorized operations on multi-dimensional arrays.
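As a quick sanity check on these definitions, here is a small sketch showing how PyTorch reports shape and rank; the (3, 4, 5) tensor simply mirrors the example above.
import torch
x = torch.randn(3, 4, 5)        # a 3D tensor matching the example shape
print(x.shape)                  # torch.Size([3, 4, 5]) -- size along each axis
print(x.ndim)                   # 3 -- the rank (number of dimensions)
print(x.numel() == 3 * 4 * 5)   # True -- total elements equal the product of the shape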
Tensors are indispensable in machine learning for several reasons:
- They provide a unified representation for data (images, text embeddings, time series) and for model parameters.
- Frameworks like PyTorch run tensor operations as vectorized, hardware-accelerated kernels, which is far faster than looping over individual elements.
- Broadcasting lets operations combine tensors of different shapes without manually replicating data.
Understanding tensor operations, shapes, and broadcasting is critical for designing and debugging modern ML models.
Let’s explore tensor creation, manipulation, and operations using PyTorch. We’ll cover shape tricks, broadcasting, and visualization of tensor operations for intuition.
import torch
import numpy as np
# Set random seed for reproducibility
torch.manual_seed(42)
# Create tensors of different ranks and shapes
scalar = torch.tensor(5.0) # 0D tensor (scalar)
vector = torch.tensor([1.0, 2.0]) # 1D tensor (vector) of length 2, matching the matrix's column length
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]]) # 2D tensor (matrix)
tensor_3d = torch.randn(2, 3, 4) # 3D tensor (random values)
print("Scalar (0D):", scalar, "Shape:", scalar.shape)
print("Vector (1D):", vector, "Shape:", vector.shape)
print("Matrix (2D):", matrix, "Shape:", matrix.shape)
print("3D Tensor:", tensor_3d, "Shape:", tensor_3d.shape)
# Basic operations
sum_tensor = vector + 2.0 # Element-wise addition with scalar
product_tensor = vector * matrix[:, 0] # Element-wise multiplication (now compatible shapes)
matmul_result = torch.matmul(matrix, matrix) # Matrix multiplication
print("\nElement-wise addition (vector + scalar):", sum_tensor)
print("Element-wise multiplication (vector * matrix column):", product_tensor)
print("Matrix multiplication (matrix @ matrix):", matmul_result)
Output (abbreviated):
Scalar (0D): tensor(5.) Shape: torch.Size([])
Vector (1D): tensor([1., 2.]) Shape: torch.Size([2])
Matrix (2D): tensor([[1., 2.],
[3., 4.]]) Shape: torch.Size([2, 2])
3D Tensor: tensor([[[ 1.9269, 1.4873, 0.9007, -2.1055],
[ 0.6784, -1.2345, -0.0431, -1.6047],
[-0.7521, 1.6487, -0.3925, -1.4036]],
[[-0.7279, -0.5593, -0.7688, 0.7624],
[ 1.6423, -0.1596, -0.4974, 0.4396],
[-0.7581, 1.0783, 0.8008, 1.6806]]]) Shape: torch.Size([2, 3, 4])
Element-wise addition (vector + scalar): tensor([3., 4.])
Element-wise multiplication (vector * matrix column): tensor([1., 6.])
Matrix multiplication (matrix @ matrix): tensor([[ 7., 10.],
[15., 22.]])
This example demonstrates creating tensors of different ranks (0D to 3D) in PyTorch and performing basic operations like element-wise addition, multiplication, and matrix multiplication. The shapes are printed to show the dimensions of each tensor.
# Broadcasting example
tensor_a = torch.tensor([1.0, 2.0, 3.0]) # Shape: (3,)
tensor_b = torch.tensor([[1.0], [2.0]]) # Shape: (2, 1)
result_broadcast = tensor_a + tensor_b # Broadcasting: (2, 3)
print("\nBroadcasting result (tensor_a + tensor_b):", result_broadcast)
print("Result shape:", result_broadcast.shape)
# Shape tricks: Reshaping and unsqueezing
tensor_c = torch.randn(4, 3) # Shape: (4, 3)
tensor_reshaped = tensor_c.reshape(2, 6) # Reshape to (2, 6)
tensor_unsqueezed = tensor_c.unsqueeze(0) # Add dimension at index 0, Shape: (1, 4, 3)
print("\nOriginal tensor shape:", tensor_c.shape)
print("Reshaped tensor shape:", tensor_reshaped.shape)
print("Unsqueezed tensor shape:", tensor_unsqueezed.shape)
Output (abbreviated):
Broadcasting result (tensor_a + tensor_b): tensor([[2., 3., 4.],
[3., 4., 5.]])
Result shape: torch.Size([2, 3])
Original tensor shape: torch.Size([4, 3])
Reshaped tensor shape: torch.Size([2, 6])
Unsqueezed tensor shape: torch.Size([1, 4, 3])
This example illustrates broadcasting, where PyTorch combines tensors of different shapes by automatically expanding dimensions of size 1. It also shows shape tricks like reshaping and unsqueezing, which adjust tensor dimensions so they are compatible in downstream operations.
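If you want a rule of thumb for when broadcasting applies, here is a rough sketch (the shapes are arbitrary examples): PyTorch compares shapes from the trailing dimension, and two sizes are compatible when they are equal or one of them is 1; otherwise the operation raises a RuntimeError.
import torch
a = torch.randn(2, 1, 4)   # example shapes chosen only for illustration
b = torch.randn(3, 1)
# Align shapes from the right:
#   a: (2, 1, 4)
#   b: (   3, 1)
# Each pair of sizes matches or one of them is 1, so the result is (2, 3, 4).
print((a + b).shape)       # torch.Size([2, 3, 4])
# Incompatible trailing sizes do not broadcast and raise a RuntimeError.
c = torch.randn(2, 3)
try:
    a + c                  # trailing sizes 4 and 3 do not match
except RuntimeError as e:
    print("Broadcasting failed:", e)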
Here are exercises to deepen your understanding of tensors and higher-order generalizations. Each exercise asks you to write Python code in PyTorch to explore the concepts and their applications in machine learning.
1. Create tensors of different ranks and shapes, then use their shape attributes (and numel()) to confirm the dimensions.
2. Use torch.bmm to perform batch matrix multiplication and print the resulting shape. Explain the operation in a comment.
3. Use permute to reorder the dimensions of a tensor and print the shapes before and after to confirm the change. (A starter sketch for exercises 2 and 3 follows this list.)
Tensors and Higher-Order Generalizations extend the power of linear algebra to multi-dimensional data, forming the foundation of deep learning, NLP, and computer vision. By mastering tensor creation, broadcasting, and shape manipulation in PyTorch, we've seen how to handle complex data structures efficiently. These concepts are critical for scaling machine learning models to real-world problems involving images, text, and beyond.
In the next post, we’ll explore Spectral Methods in ML (Graph Laplacians, etc.), diving into how linear algebra powers graph-based algorithms and clustering techniques. Stay tuned, and happy learning!