ML Week 2
Adapted from [1].
Week 2: Tensors & Mathematical Foundations
Objective: Understand tensors, perform tensor operations, and compute gradients using automatic differentiation.
Key Concepts Explained
1. Tensors
Definition: A tensor is a multi-dimensional array of numerical values.
Scalar: A single number (0D tensor). Example: 5.0.
Vector: A 1D array (1D tensor). Example: [1.0, 2.0, 3.0].
Matrix: A 2D grid (2D tensor). Example: [[1, 2], [3, 4]].
Higher Dimensions: 3D (e.g., a cube of values such as an RGB image), 4D (e.g., a video: a sequence of images over time), etc.
Why Tensors? They unify data representation for ML models (e.g., images as 3D tensors: height × width × color channels).
import numpy as np
# Create tensors
scalar = np.array(5.0) # 0D tensor
vector = np.array([1, 2, 3]) # 1D tensor
matrix = np.array([[1, 2], [3, 4]]) # 2D tensor
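As a quick check of the image example above, here is a small sketch using a random array to stand in for pixel data (the 64×64 size is an arbitrary choice):
# A stand-in RGB "image": height × width × color channels
image = np.random.rand(64, 64, 3)  # 3D tensor
print(image.ndim)   # 3 — the number of dimensions
print(image.shape)  # (64, 64, 3)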
2. Tensor Operations
Reshaping: Change tensor dimensions without altering data.
matrix = np.array([[1, 2], [3, 4]])
reshaped = matrix.reshape(4, 1) # Reshape to 4 rows × 1 column
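NumPy can also infer one axis for you, as long as the total number of elements stays the same; a short sketch reusing the matrix above:
flattened = matrix.reshape(-1)    # Shape (4,): [1, 2, 3, 4]
inferred = matrix.reshape(2, -1)  # Shape (2, 2): the -1 axis is computed automatically
# matrix.reshape(3, 1) would raise a ValueError: 4 elements cannot fill a 3×1 shape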
Broadcasting: Automatically expand tensors to perform arithmetic.
a = np.array([1, 2, 3]) # Shape (3,)
b = np.array([[10], [20]]) # Shape (2, 1)
result = a + b # Result shape (2, 3)
# [[11, 12, 13], [21, 22, 23]]
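Broadcasting is what makes common patterns like subtracting a per-column mean so concise; a small sketch with a made-up 2×3 data matrix:
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])  # Shape (2, 3)
col_mean = data.mean(axis=0)        # Shape (3,): [2.5, 3.5, 4.5]
centered = data - col_mean          # The (3,) mean broadcasts across both rows
# [[-1.5, -1.5, -1.5], [1.5, 1.5, 1.5]]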
Einsum: Compact notation for tensor operations (e.g., matrix multiplication).
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.einsum('ij,jk->ik', A, B) # Matrix multiplication
# [[19, 22], [43, 50]]
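The same notation expresses many other operations; a few illustrative sketches reusing A from above:
trace = np.einsum('ii->', A)         # Sum of the diagonal: 1 + 4 = 5
transposed = np.einsum('ij->ji', A)  # Transpose: [[1, 3], [2, 4]]
dot = np.einsum('i,i->', np.array([1, 2, 3]), np.array([4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32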
3. Automatic Differentiation (Autograd)
Definition: A technique to automatically compute gradients (derivatives) of functions.
Why Gradients? Gradients tell us how to adjust model parameters to reduce errors during training.
Example with PyTorch:
import torch
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2 + 2 * x + 1 # Function: y = x² + 2x + 1
y.backward() # Compute gradient dy/dx
print(x.grad) # dy/dx = 2x + 2 → 2*3 + 2 = 8.0
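To connect the gradient back to training, here is a minimal sketch of one gradient-descent step on the same x (the 0.1 learning rate is an arbitrary choice):
learning_rate = 0.1
with torch.no_grad():
    x -= learning_rate * x.grad  # Step against the gradient: 3.0 - 0.1 * 8.0 = 2.2
    x.grad.zero_()               # Clear the stored gradient before the next step
print(x)  # tensor(2.2000, requires_grad=True)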
4. Jacobian Matrix
Definition: A matrix of all first-order partial derivatives of a vector-valued function.
Example Function: \( f(x, y) = [x^2 + 3y,\ 5x + y^3] \)
- Jacobian \( J \) has shape (2, 2):
\[ J = \begin{bmatrix} \frac{\partial f_1}{\partial x} & \frac{\partial f_1}{\partial y} \\ \frac{\partial f_2}{\partial x} & \frac{\partial f_2}{\partial y} \end{bmatrix} = \begin{bmatrix} 2x & 3 \\ 5 & 3y^2 \end{bmatrix} \]
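PyTorch can compute this matrix automatically; a sketch using torch.autograd.functional.jacobian on the same function (the helper name f_vec and the evaluation point (x, y) = (2, 1) are arbitrary choices for illustration):
import torch

def f_vec(inputs):
    x, y = inputs
    return torch.stack([x**2 + 3*y, 5*x + y**3])

point = torch.tensor([2.0, 1.0])
J = torch.autograd.functional.jacobian(f_vec, point)
print(J)
# tensor([[4., 3.],
#         [5., 3.]])  # Matches [[2x, 3], [5, 3y²]] at (2, 1)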
Mini Exercises
1. Reshape a Tensor
Convert the 1D tensor [1, 2, 3, 4, 5, 6] into a 3×2 matrix.
tensor = np.array([1, 2, 3, 4, 5, 6])
reshaped = tensor.reshape(3, 2)
2. Compute Gradients
Use PyTorch to compute the derivative of \( y = 2x^3 + \sin(x) \) at \( x = \pi \).
x = torch.tensor(np.pi, requires_grad=True)
y = 2 * x ** 3 + torch.sin(x)
y.backward()
print(x.grad)  # dy/dx = 6x² + cos(x) ≈ 6*(9.87) + (-1) ≈ 58.2
Project Walkthrough: Compute the Jacobian Matrix
Step 1: Define the Function
Compute \( f(x, y) = x^2 + 3y \).
def f(x, y):
    return x**2 + 3*y
Step 2: Compute Partial Derivatives Manually
- \( \frac{\partial f}{\partial x} = 2x \)
- \( \frac{\partial f}{\partial y} = 3 \)
Step 3: Code the Jacobian
import numpy as np
def jacobian(x, y):
    df_dx = 2 * x
    df_dy = 3
    return np.array([[df_dx, df_dy]])
# Example at (x=2, y=4)
print(jacobian(2, 4)) # Output: [[4, 3]]
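As a sanity check, the manual result can be compared against a finite-difference approximation (the step size h = 1e-6 is an arbitrary choice):
h = 1e-6
x0, y0 = 2.0, 4.0
df_dx_numeric = (f(x0 + h, y0) - f(x0, y0)) / h  # ≈ 4.0
df_dy_numeric = (f(x0, y0 + h) - f(x0, y0)) / h  # ≈ 3.0
print(df_dx_numeric, df_dy_numeric)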
Step 4: Visualize with NumPy and Matplotlib
x_vals = np.linspace(-2, 2, 100)
y_vals = np.linspace(-2, 2, 100)
X, Y = np.meshgrid(x_vals, y_vals)
Z = f(X, Y)
# Plot in Jupyter Notebook
import matplotlib.pyplot as plt
plt.contourf(X, Y, Z, levels=20)
plt.colorbar()
plt.title("f(x, y) = x² + 3y")
plt.xlabel("x")
plt.ylabel("y")
plt.show()
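Optionally, the partial derivatives from Step 2 can be drawn as a gradient field on top of the contours; a small sketch (the coarser 10×10 grid just keeps the arrows readable):
Xq, Yq = np.meshgrid(np.linspace(-2, 2, 10), np.linspace(-2, 2, 10))
plt.contourf(X, Y, Z, levels=20)
plt.quiver(Xq, Yq, 2 * Xq, np.full_like(Yq, 3.0))  # (∂f/∂x, ∂f/∂y) = (2x, 3)
plt.title("Gradient field of f(x, y) = x² + 3y")
plt.show()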
Questions
- What is the rank of a 3x2x4 tensor?
- What does broadcasting allow you to do?
- Compute the gradient of \( y = 3x^2 \) at \( x = 2 \).
- What is the Jacobian matrix for \( f(x, y) = [x + y,\ xy] \)?
Dictionary
- Tensor: A multi-dimensional array of numbers.
- Automatic Differentiation: Automatically computes gradients for optimization.
- Gradient: A vector of partial derivatives (slopes) of a function.
- Jacobian Matrix: A matrix of all first-order partial derivatives of a vector function.
- Reshaping: Changing the dimensions of a tensor without changing its data.
Resources
- NumPy Tutorial: NumPy Quickstart
- PyTorch Autograd: PyTorch Autograd Guide
- Visualizing Tensors: 3Blue1Brown Essence of Linear Algebra
Tips
- Use torch.autograd.functional.jacobian in PyTorch to compute Jacobians automatically for complex functions.
- Debug shape errors by printing tensor shapes (print(tensor.shape)); see the sketch below.
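For example, a broadcasting mistake usually shows up as a shape error, and printing the shapes makes the mismatch obvious (the (3, 4) and (3,) shapes here are made up for illustration):
a = np.ones((3, 4))
b = np.ones((3,))
print(a.shape, b.shape)  # (3, 4) (3,) — the trailing dimensions 4 and 3 do not match
# a + b raises ValueError: operands could not be broadcast together with shapes (3,4) (3,)
c = a + b.reshape(3, 1)  # Works: the (3, 1) column broadcasts across the 4 columns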
Answers to questions above
- Rank 3 (3 dimensions).
- Perform arithmetic on tensors of different shapes by expanding them.
- \( dy/dx = 6x \). At \( x = 2 \), the gradient is 12.
- \( J = \begin{bmatrix} 1 & 1 \\ y & x \end{bmatrix} \).
Credits
[1] Deepseek.