-
0.6.6 – Computational graph with forward pass values and gradients
Q: Write the computational graph of a softmax classifier consisting of one linear transformation, the softmax activation and the log loss function. Show all forward pass calculations and the calculated gradients. A:
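As a rough illustration of what the post works through, here is a minimal NumPy sketch of one forward and backward pass through a softmax classifier (linear transformation, softmax, log loss). The input, weights, and label below are made-up example values, not the ones used in the post's graph.

```python
import numpy as np

# Made-up example values (not the ones from the post's computational graph)
x = np.array([1.0, 2.0])            # input vector
W = np.array([[0.5, -0.3],
              [0.1,  0.8]])         # weights: one row per class
b = np.array([0.0, 0.1])            # bias
y = 1                               # true class index

# Forward pass
z = W @ x + b                       # linear transformation
p = np.exp(z - z.max())             # softmax (shifted for numerical stability)
p /= p.sum()
loss = -np.log(p[y])                # log loss (cross-entropy)

# Backward pass
dz = p.copy()
dz[y] -= 1.0                        # dL/dz = softmax(z) - one_hot(y)
dW = np.outer(dz, x)                # dL/dW
db = dz                             # dL/db

print(loss, dW, db)
```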
-
0.6.5 – Neural Net Layer Expressions
Q: Write out the following neural network with as few weights as possible: A:
-
0.6.4 – Unit I: Ax = b and the Four Subspaces – Exam 1
Solving problem 1a) of Exam 1 from the MIT course “18.06SC – Linear Algebra”.
-
0.6.3 – The Four Fundamental Subspaces
Solving the problem from the 10th recitation video of the MIT course “18.06SC – Linear Algebra”.
-
0.6.2 – The Four Fundamental Subspaces
Solving problem 10.2 from the MIT course “18.06SC – Linear Algebra”.
-
0.6.1 – The Four Fundamental Subspaces
Solving problem 10.1 from the MIT course “18.06SC – Linear Algebra”.
-
0.6.0 – Basic Neural Network Feature Space Topologies
One caveat with binary and multi-class logistic regression is that they are only able to classify the data correctly if the data is linearly separable: If the data points are not linearly separable, binary/multi-class logistic regression will only be able to find the best linear line or hyperplane to separate the data points, which is a suboptimal solution…
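To make the caveat concrete, here is a small NumPy sketch (data and hyperparameters are made up for illustration) that fits plain binary logistic regression to the classic XOR points. Since no straight line separates the two classes, the model never does better than guessing.

```python
import numpy as np

# XOR: no straight line separates class 0 from class 1
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

w = np.zeros(2)
b = 0.0
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Plain gradient descent on the log loss
for _ in range(5000):
    p = sigmoid(X @ w + b)
    grad_w = X.T @ (p - y) / len(y)
    grad_b = (p - y).mean()
    w -= lr * grad_w
    b -= lr * grad_b

print(sigmoid(X @ w + b))  # all four probabilities stay at 0.5: no linear separation found
```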
-
0.5.6 – Independence, Basis and Dimension
Solving the problem from the 9th recitation video of the MIT course “18.06SC – Linear Algebra”.
-
0.5.5 – Independence, Basis and Dimension
Solving problem 9.2 from the MIT course “18.06SC – Linear Algebra”.
-
0.5.4 – Independence, Basis and Dimension
Solving problem 9.1 from the MIT course “18.06SC – Linear Algebra”.
-
0.5.3 – Solving Ax = b: Row Reduced Form R
Solving the problem from the 8th recitation video of the MIT course “18.06SC – Linear Algebra”.
-
0.5.2 – Solving Ax = b: Row Reduced Form R
Solving problem 8.3 from the MIT course “18.06SC – Linear Algebra”.
-
0.5.1 – Solving Ax = b: Row Reduced Form R
Solving problem 8.2 from the MIT course “18.06SC – Linear Algebra”.
-
0.5.0 – Regularized Linear Regression
In order to mitigate overfitting of our linear model we can add one or more regularization terms to our loss function. The regularization terms incentivize lower weights in the coefficient/weight matrices. Why does one want to avoid large coefficients? Large weights in your model lead to high variance. Imagine having the…
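As a concrete example of the idea, here is a minimal NumPy sketch of L2-regularized (ridge) linear regression using the closed-form solution w = (XᵀX + λI)⁻¹Xᵀy, compared against ordinary least squares. The toy data and the regularization strength λ are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up toy data: y = 2*x1 - 1*x2 + noise
X = rng.normal(size=(50, 2))
y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=50)

lam = 1.0  # regularization strength (lambda)

# Ordinary least squares: w = (X^T X)^-1 X^T y
w_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Ridge regression: w = (X^T X + lambda * I)^-1 X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

print(w_ols, w_ridge)  # the ridge weights are shrunk towards zero
```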
-
0.4.3 – Solving Ax = b: Row Reduced Form R
Solving problem 8.1 from the MIT course “18.06SC – Linear Algebra”.
-
0.4.2 – Solving Ax = 0: Pivot Variables, Special Solutions
Solving the problem from the 8th recitation video of the MIT course “18.06SC – Linear Algebra”.
-
0.4.1 – Solving Ax = 0: Pivot Variables, Special Solutions
Solving problem 7.2 from the MIT course “18.06SC – Linear Algebra”.
-
0.4.0 – Multi-class Logistic Regression
Often we want to predict the label/class of a data point when there are more than two labels. Taking a look at binary logistic regression, we see it is only designed for the case where there are exactly two labels to choose from. From binary logistic regression we have: When there are only two possible outcomes we can…
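A small sketch of the multi-class case, assuming a softmax over K class scores (the weights and input below are arbitrary example values): each class gets its own weight vector, the linear scores are turned into probabilities with the softmax, and the predicted label is the argmax.

```python
import numpy as np

# Arbitrary example: 3 classes, 2 features
W = np.array([[ 1.0, -0.5],
              [ 0.2,  0.7],
              [-1.0,  0.3]])           # one weight row per class
b = np.array([0.0, 0.1, -0.1])
x = np.array([0.5, 2.0])

scores = W @ x + b                     # one linear score per class
probs = np.exp(scores - scores.max())  # softmax turns scores into probabilities
probs /= probs.sum()

print(probs, probs.argmax())           # predicted class = highest probability
```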
-
0.3.0 – Binary Logistic Regression
Sometimes it is useful to predict a class given input data, unlike in linear and polynomial regression where you predict a number given input data. To predict binary class labels (either 1 or 0) one can use logistic regression. It's worth noting that while polynomial regression is kind of an extension of linear regression, logistic regression…
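For the binary case, a minimal sketch (with made-up weights and input) of how the class label is produced: the linear score is squashed through the sigmoid into a probability, and the label is 1 if that probability exceeds 0.5.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up weights and input for illustration
w = np.array([1.5, -2.0])
b = 0.3
x = np.array([0.8, 0.1])

p = sigmoid(w @ x + b)      # probability of class 1
label = int(p > 0.5)        # threshold at 0.5 to get the binary label
print(p, label)
```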
-
0.2.5 – Solving Ax = 0: Pivot Variables, Special Solutions
Solving problem 7.1 from the MIT course “18.06SC – Linear Algebra”.