Neural networks, backpropagation, and the architectures that power modern AI.
Deep learning is the engine behind LLMs, image recognition, speech synthesis, and more.
Foundations:
- Perceptron → Multi-layer perceptron (MLP)
- Activation functions: ReLU, GELU, Sigmoid, Tanh, Softmax
- Backpropagation & gradient flow
- Loss functions: MSE, Cross-Entropy, BCE, Huber
- Optimisers: SGD, Adam, AdamW, Lion
- Regularisation: Dropout, L1/L2, Batch Normalisation, Weight Decay
- Learning rate schedulers: cosine annealing, warmup, ReduceLROnPlateau (see the training-loop sketch after this list)
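
A minimal sketch tying the pieces above together, assuming PyTorch; the layer sizes, dummy data, and hyperparameters are illustrative placeholders, not recommended settings.

```python
import torch
import torch.nn as nn

# Small MLP: linear layers, ReLU activations, dropout for regularisation.
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(), nn.Dropout(0.1),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(0.1),
    nn.Linear(64, 3),   # 3-class logits; softmax is folded into the loss below
)

criterion = nn.CrossEntropyLoss()                                               # cross-entropy loss
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)   # AdamW = Adam with decoupled weight decay
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)    # cosine annealing schedule

x = torch.randn(32, 20)          # dummy batch: 32 samples, 20 features
y = torch.randint(0, 3, (32,))   # dummy integer class labels

for step in range(100):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()              # backpropagation: gradients flow from the loss back to every parameter
    optimizer.step()
    scheduler.step()
```
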
CNN (Convolutional Neural Networks):
- Convolution, pooling, stride, padding
- ResNet, VGG, EfficientNet architectures
- Transfer learning & fine-tuning (see the sketch after this list)
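
A sketch of transfer learning with a frozen backbone, assuming torchvision's pretrained ResNet-18 (weights API of torchvision ≥ 0.13); the 5-class head and dummy batch are illustrative.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet-18 pretrained on ImageNet and adapt it to a new 5-class task.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

for param in model.parameters():
    param.requires_grad = False                   # freeze the convolutional backbone

model.fc = nn.Linear(model.fc.in_features, 5)     # new, trainable classification head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 224, 224)              # dummy batch of RGB images
labels = torch.randint(0, 5, (8,))

optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()                                   # only the new head receives gradients
optimizer.step()
```
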
RNN / LSTM / GRU:
- Sequential data, vanishing gradient problem
- LSTM gates (forget, input, output)
- Applications: time series, NLP before transformers (see the LSTM sketch after this list)
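
A minimal LSTM sketch over a toy sequence batch, assuming PyTorch; the input/hidden sizes and 2-class head are placeholders.

```python
import torch
import torch.nn as nn

# LSTM over sequences of 10-dimensional feature vectors.
lstm = nn.LSTM(input_size=10, hidden_size=32, num_layers=1, batch_first=True)
head = nn.Linear(32, 2)          # 2-class output head

x = torch.randn(16, 50, 10)      # batch of 16 sequences, 50 time steps each
outputs, (h_n, c_n) = lstm(x)    # h_n: final hidden state for each layer
logits = head(h_n[-1])           # classify from the last layer's final hidden state

print(logits.shape)              # torch.Size([16, 2])
```
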
Transformers:
- Self-attention mechanism
- Multi-head attention
- Positional encodings
- Encoder-decoder architecture
- BERT, GPT family (see the attention sketch after this list)
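
A from-scratch sketch of single-head scaled dot-product self-attention, assuming PyTorch; the embedding and sequence sizes are arbitrary, and real transformer blocks add multi-head projections, positional encodings, residual connections, and layer norm on top of this.

```python
import math
import torch
import torch.nn as nn

d_model = 64
to_q = nn.Linear(d_model, d_model)   # query projection
to_k = nn.Linear(d_model, d_model)   # key projection
to_v = nn.Linear(d_model, d_model)   # value projection

x = torch.randn(2, 10, d_model)      # batch of 2 sequences, 10 tokens each

q, k, v = to_q(x), to_k(x), to_v(x)
scores = q @ k.transpose(-2, -1) / math.sqrt(d_model)   # similarity of every token with every other token
weights = scores.softmax(dim=-1)                        # attention weights sum to 1 across the sequence
attended = weights @ v                                  # each token becomes a weighted mix of all value vectors

print(attended.shape)                # torch.Size([2, 10, 64])
```
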