ML resources 📘
Current Learning resources
Reading list
Math for ML books
- “Machine Learning: A Probabilistic Perspective” by Kevin P. Murphy
- “The Elements of Statistical Learning” by Trevor Hastie, Robert Tibshirani, and Jerome Friedman
- “Pattern Recognition and Machine Learning” by Christopher Bishop
- “Mathematics for Machine Learning” by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong
- “An Introduction to Machine Learning with Applications in Engineering” by Andrew M. Moore
Linear algebra
Linear Algebra plays a fundamental role in Machine Learning (ML) and Deep Learning (DL), providing essential mathematical foundations for various algorithms and models.
- Vectors and Matrices: A vector is an ordered sequence of numbers, typically real or complex, arranged in a single column or row. In ML/DL, vectors often represent data points or features. A matrix is a two-dimensional array of numbers that can be viewed as a collection of row vectors or column vectors. Matrices are used to represent linear transformations, coefficients in linear models, and weights in neural networks.
- Vector Operations: a) Addition and Subtraction: Two vectors of the same size can be added or subtracted element-wise. This operation is fundamental for data preprocessing and feature manipulation. b) Scalar Multiplication: Multiplying a vector by a scalar multiplies every element of the vector by that scalar. c) Dot Product (Inner Product): The dot product of two vectors of the same size is the sum of the products of their corresponding elements. It is used to measure similarity between vectors (for example, cosine similarity) and to calculate the angle between them.
- Matrix Operations: a) Addition and Subtraction: Two matrices of the same size can be added or subtracted element-wise. This operation is used to create new feature representations or combine multiple matrices. b) Transpose: The transpose of a matrix swaps its rows and columns. In ML/DL, transposition is often used to change data layout; a symmetric matrix, such as a covariance matrix, is equal to its own transpose. c) Matrix Multiplication: The product AB is defined when the number of columns of A equals the number of rows of B. Entry (i, j) of the result is the dot product of row i of A with column j of B. It is central to many ML/DL algorithms, such as neural networks and linear regression.
- Linear Systems and Solving Linear Equations: A system of linear equations is a set of equations in which each equation is a linear combination of the variables, written compactly as Ax = b. In ML/DL, these systems are often overdetermined (more equations than unknowns) or underdetermined (fewer equations than unknowns) and must be solved for model coefficients. Techniques such as Gauss-Jordan elimination, matrix inversion, and QR decomposition are used to find solutions; overdetermined systems are usually solved in the least-squares sense.
- Eigenvalues and Eigenvectors: For a square matrix A, an eigenvalue λ and a corresponding nonzero eigenvector x satisfy Ax = λx. In ML/DL, eigenvalues and eigenvectors reveal the underlying structure of data: in Principal Component Analysis (PCA), the principal components are the eigenvectors of the data covariance matrix. The closely related Singular Value Decomposition (SVD) applies to any matrix, not only square ones, and is commonly used to handle systems with ill-conditioned matrices.
- Determinants: The determinant of a square matrix is a scalar that describes how the corresponding linear transformation scales volume (its absolute value) and whether it preserves or flips orientation (its sign). A determinant of zero means the matrix is singular and has no inverse. In ML, the absolute value of the Jacobian determinant appears in change-of-variables calculations for probability densities.
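- Inverses: The inverse of a square matrix A, denoted A⁻¹, is the square matrix that satisfies AA⁻¹ = A⁻¹A = I (the identity matrix). It exists only when A is nonsingular, that is, when its determinant is nonzero. In ML/DL, the inverse appears when solving linear systems Ax = b (x = A⁻¹b) and in closed-form expressions for model weights, such as the normal equation for linear regression and the posterior in Bayesian linear regression.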
- Norms and Distances: A norm measures the magnitude of a vector. The distance between two vectors, usually defined as the norm of their difference, measures how dissimilar they are. Commonly used norms include the Euclidean (L2), Manhattan (L1), and Chebyshev (L∞) norms, and the distances they induce are used for tasks such as clustering, dimensionality reduction, and similarity search.
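The operations listed above can be tried out in a few lines. The following is a minimal sketch, assuming NumPy is available; the example values and variable names (u, v, A, B, X, y, and so on) are made up for illustration and do not come from any of the books above.

```python
import numpy as np

# Vector operations: element-wise addition, scalar multiplication, dot product.
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
vec_sum = u + v                      # element-wise: [5, 7, 9]
scaled = 2.0 * u                     # every element doubled: [2, 4, 6]
dot = u @ v                          # 1*4 + 2*5 + 3*6 = 32
cos_angle = dot / (np.linalg.norm(u) * np.linalg.norm(v))  # cosine of the angle

# Matrix operations: transpose and matrix multiplication.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])           # symmetric, so A.T equals A
B = np.array([[1.0, 0.0],
              [4.0, 1.0]])
A_T = A.T
AB = A @ B                           # entry (i, j) = row i of A dot column j of B

# Linear system Ax = b. np.linalg.solve is generally preferred over
# forming the inverse explicitly; both are shown for comparison.
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)
x_via_inverse = np.linalg.inv(A) @ b

# Overdetermined system (3 equations, 2 unknowns) solved by least squares.
# The points (0, 1), (1, 3), (2, 5) lie exactly on y = 1 + 2t.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Eigenvalues and eigenvectors of a symmetric matrix (eigh assumes symmetry
# and returns eigenvalues in ascending order, eigenvectors as columns).
eigvals, eigvecs = np.linalg.eigh(A)

# Determinant: 2*3 - 1*1 = 5, nonzero, so A is invertible.
det_A = np.linalg.det(A)

# Norms of the difference u - v give distances between u and v.
d = u - v
l2 = np.linalg.norm(d)               # Euclidean (L2)
l1 = np.linalg.norm(d, ord=1)        # Manhattan (L1)
linf = np.linalg.norm(d, ord=np.inf) # Chebyshev (L-infinity)

print(dot, x, w, det_A, l1, l2, linf)
```

For a symmetric matrix such as a covariance matrix, np.linalg.eigh is the usual choice; for a general square matrix, np.linalg.eig would be used instead.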
Questions from Hands on ML
- How would you define Machine Learning?
- Can you name four types of problems where it shines?
- What is a labeled training set?
- What are the two most common supervised tasks?
- Can you name four common unsupervised tasks?
- What type of Machine Learning algorithm would you use to allow a robot to walk in various unknown terrains?
- What type of algorithm would you use to segment your customers into multiple groups?
- Would you frame the problem of spam detection as a supervised learning problem or an unsupervised learning problem?
- What is an online learning system?
- What is out-of-core learning?
- What type of learning algorithm relies on a similarity measure to make predictions?
- What is the difference between a model parameter and a learning algorithm’s hyperparameter?
- What do model-based learning algorithms search for? What is the most common strategy they use to succeed? How do they make predictions?
- Can you name four of the main challenges in Machine Learning?
- If your model performs great on the training data but generalizes poorly to new instances, what is happening? Can you name three possible solutions?
- What is a test set, and why would you want to use it?
- What is the purpose of a validation set?
- What is the train-dev set, when do you need it, and how do you use it?
- What can go wrong if you tune hyperparameters using the test set?