Strong mathematical foundations are the difference between using ML tools and truly understanding them.
You don't need a PhD, but these areas matter:
Linear Algebra:
- Vectors, matrices, matrix multiplication
- Eigenvalues/eigenvectors (PCA, SVD)
- Dot products, norms, projections
- Tensors (multi-dimensional arrays)
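To make these concrete, here is a minimal NumPy sketch; the array values and the random 2-D dataset are illustrative assumptions, chosen only to show dot products, norms, projections, matrix multiplication, and the eigendecomposition behind PCA:

```python
import numpy as np

# Vectors and a matrix (tensors are just higher-dimensional versions of these)
v = np.array([1.0, 2.0])
W = np.array([[2.0, 0.0],
              [0.0, 3.0]])

# Dot product, norm, and the projection of v onto u
u = np.array([1.0, 0.0])
dot = v @ u
norm_v = np.linalg.norm(v)
proj = (dot / (u @ u)) * u

# Matrix multiplication: how a linear layer transforms an input
y = W @ v

# Eigendecomposition of a covariance matrix -- the core of PCA
X = np.random.randn(100, 2) @ np.array([[3.0, 0.0], [1.0, 0.5]])  # toy data
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # columns of eigvecs are principal axes
print(dot, norm_v, proj, y, eigvals)
```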
Calculus:
- Derivatives, partial derivatives, chain rule
- Gradient descent, the workhorse of neural network training (see the sketch after this list)
- Jacobians and Hessians for second-order optimisers
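As a minimal sketch of the core idea, the loop below runs plain gradient descent on the toy loss f(w) = (w - 3)^2, with the derivative worked out by hand; the starting point and learning rate are arbitrary choices for illustration:

```python
# Gradient descent on f(w) = (w - 3)^2, whose derivative is f'(w) = 2*(w - 3).
# Autodiff frameworks apply the chain rule to do this over whole networks.
w = 0.0    # initial parameter (arbitrary)
lr = 0.1   # learning rate (arbitrary, not tuned)
for step in range(50):
    grad = 2.0 * (w - 3.0)   # derivative of the loss at the current w
    w -= lr * grad           # step against the gradient
print(w)  # converges toward the minimiser w = 3
```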
Probability & Statistics:
- Probability distributions (Gaussian, Bernoulli, Categorical)
- Bayes' theorem, conditional probability
- Maximum Likelihood Estimation (MLE)
- Expected value, variance, covariance
- Hypothesis testing, p-values, confidence intervals
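A short sketch tying a few of these together; the Gaussian parameters (mean 5, standard deviation 2) and the disease-test rates in the Bayes example are made-up numbers for illustration, not real data:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=10_000)  # simulated Gaussian samples

# MLE for a Gaussian: maximising the log-likelihood gives the sample mean
# and the (biased) sample variance as the estimates.
mu_hat = data.mean()
var_hat = ((data - mu_hat) ** 2).mean()
print(mu_hat, var_hat)  # should land near 5.0 and 4.0

# Bayes' theorem on a classic setup: 1% base rate, a test with
# 95% sensitivity and a 5% false-positive rate (assumed numbers).
p_d, p_pos_given_d, p_pos_given_not_d = 0.01, 0.95, 0.05
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)
p_d_given_pos = p_pos_given_d * p_d / p_pos
print(p_d_given_pos)  # ~0.16: a positive test is weaker evidence than intuition suggests
```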
Information Theory:
- Entropy, cross-entropy loss
- KL divergence (used in VAEs, RL)
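A minimal sketch with two made-up categorical distributions p and q, showing the identity cross-entropy(p, q) = entropy(p) + KL(p || q), which is why minimising cross-entropy loss drives a model's predictions toward the data distribution:

```python
import numpy as np

def entropy(p):
    return -np.sum(p * np.log(p))

def cross_entropy(p, q):
    # Expected code length when data come from p but we encode using q
    return -np.sum(p * np.log(q))

def kl_divergence(p, q):
    # KL(p || q) = cross_entropy(p, q) - entropy(p); always >= 0
    return np.sum(p * np.log(p / q))

p = np.array([0.7, 0.2, 0.1])  # "true" distribution (assumed)
q = np.array([0.5, 0.3, 0.2])  # model's predicted distribution (assumed)
print(entropy(p), cross_entropy(p, q), kl_divergence(p, q))
```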
Resources:
- 3Blue1Brown (visual linear algebra)
- StatQuest with Josh Starmer
- Mathematics for Machine Learning (book, free PDF)