Before deep learning, learn the classical algorithms that still power much of production ML.
Scikit-learn is the go-to Python library for classical ML.
Supervised Learning:
- Linear Regression, Ridge, Lasso
- Logistic Regression
- Decision Trees, Random Forests, Gradient Boosting (XGBoost, LightGBM)
- Support Vector Machines (SVM)
- K-Nearest Neighbours (KNN)
- Naive Bayes
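A minimal sketch of the supervised workflow with scikit-learn: fit two of the classifiers above on a synthetic dataset and compare held-out accuracy. The dataset and model settings are illustrative, not a recommendation.

```python
# Fit two classifiers on synthetic data and compare test accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

models = {
    "logreg": LogisticRegression(max_iter=1000),
    "rf": RandomForestClassifier(n_estimators=100, random_state=0),
}
# Every scikit-learn estimator shares the same fit/score interface,
# which is what makes swapping algorithms cheap.
scores = {name: m.fit(X_train, y_train).score(X_test, y_test)
          for name, m in models.items()}
```

The uniform estimator API (`fit`, `predict`, `score`) is the main reason scikit-learn is the default for classical ML: trying a different algorithm is usually a one-line change.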
Unsupervised Learning:
- K-Means Clustering
- DBSCAN
- PCA (dimensionality reduction)
- t-SNE, UMAP (visualisation)
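The unsupervised side follows the same API, minus the labels. A small sketch, on synthetic blob data: cluster with K-Means, then project to 2-D with PCA for visualisation.

```python
# K-Means clustering plus a PCA projection on synthetic blob data.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X, _ = make_blobs(n_samples=300, centers=3, n_features=5, random_state=42)

# Cluster into 3 groups; n_init=10 pins the restart count explicitly.
km = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = km.fit_predict(X)

# Reduce 5 features to 2 components for plotting.
X_2d = PCA(n_components=2).fit_transform(X)
```

In practice you rarely know the true number of clusters; that is where DBSCAN (density-based, no `n_clusters`) or elbow/silhouette analysis comes in.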
Model Evaluation:
- Train/validation/test split
- Cross-validation (k-fold, stratified)
- Metrics: accuracy, precision, recall, F1, AUC-ROC, RMSE, MAE
- Confusion matrix
- Bias-variance tradeoff
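The evaluation tools above, in one short sketch: stratified 5-fold cross-validation for a stable F1 estimate, plus a confusion matrix on a held-out split.

```python
# Stratified 5-fold CV and a confusion matrix on held-out data.
from sklearn.datasets import make_classification
from sklearn.model_selection import (
    StratifiedKFold, cross_val_score, train_test_split)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score

X, y = make_classification(n_samples=400, random_state=1)
clf = LogisticRegression(max_iter=1000)

# Stratification keeps the class ratio identical in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(clf, X, y, cv=cv, scoring="f1")

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)
cm = confusion_matrix(y_te, pred)  # rows: true class, cols: predicted class
f1 = f1_score(y_te, pred)
```

Report the mean and spread of the CV scores, not a single split: the variance across folds is itself evidence about model stability.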
Feature Engineering:
- Normalisation, standardisation, encoding categoricals
- Feature selection: correlation, mutual information, feature importance
- Handling missing values and outliers
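A sketch of these preprocessing steps wired into one pipeline: impute missing values, standardise numeric columns, and one-hot encode a categorical column. The column names (`age`, `income`, `city`) are made up for illustration.

```python
# Impute, scale numeric columns, and one-hot encode a categorical column.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({
    "age": [25, 32, np.nan, 47],
    "income": [40_000, 55_000, 62_000, np.nan],
    "city": ["NY", "SF", "NY", "LA"],
})

numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # fill NaNs with the median
    ("scale", StandardScaler()),                   # zero mean, unit variance
])
pre = ColumnTransformer([
    ("num", numeric, ["age", "income"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])
X = pre.fit_transform(df)  # 2 scaled numeric cols + 3 one-hot city cols
```

Putting preprocessing inside a pipeline matters: the imputer and scaler are fit on training data only, so statistics never leak from the test set.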
Hyperparameter Tuning:
- Grid search, random search
- Bayesian optimisation (Optuna)
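Grid search, the simplest of the tuning strategies above, as a short sketch (the parameter grid is illustrative; Optuna replaces the exhaustive grid with a guided Bayesian search but exposes a similar fit-then-inspect flow):

```python
# Exhaustive grid search over Random Forest hyperparameters with 3-fold CV.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, random_state=0)

param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
}
# 4 parameter combinations x 3 folds = 12 fits.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid, cv=3, scoring="accuracy")
search.fit(X, y)
best = search.best_params_
```

Grid search scales exponentially with the number of parameters; random search or Bayesian optimisation usually finds comparable optima with far fewer fits.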