Students in my Stanford courses on machine learning have already made several useful suggestions, as have my colleague, Pat Langley, and my teaching

Coefficient of determination: The coefficient of determination, often denoted $R^2$ or $r^2$, measures how well the observed outcomes are replicated by the model. It is defined as $R^2=1-\frac{\textrm{SS}_{\textrm{res}}}{\textrm{SS}_{\textrm{tot}}}$, where $\textrm{SS}_{\textrm{res}}$ is the residual sum of squares and $\textrm{SS}_{\textrm{tot}}$ is the total sum of squares.

Main metrics: The following metrics are commonly used to assess the performance of regression models while taking into account the number of variables $n$ that they use. In these metrics, $L$ is the likelihood and $\widehat{\sigma}^2$ is an estimate of the variance associated with each response.

In the context of binary classification, the main metrics below are the ones to track in order to assess the performance of the model.
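As a minimal sketch of the definitions above, the snippet below computes $R^2$ and the adjusted $R^2$ with NumPy. The function names, the `n_variables` parameter, and the toy values are assumptions made for illustration; they do not come from the original text.

```python
import numpy as np

def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y - y_pred) ** 2)    # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(y, y_pred, n_variables):
    """Adjusted R^2 for m observations and n_variables predictors."""
    m = len(y)
    r2 = r_squared(y, y_pred)
    return 1.0 - (1.0 - r2) * (m - 1) / (m - n_variables - 1)

# Hypothetical observations and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.5, 5.5, 7.0, 9.5]
print(r_squared(y_true, y_hat))
print(adjusted_r_squared(y_true, y_hat, n_variables=1))
```

The adjusted value penalizes each extra variable, so it is never larger than the plain $R^2$ for the same predictions.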

Cross-validation: The different types of cross-validation are summed up in the table below. The most commonly used method is called $k$-fold cross-validation. It splits the training data into $k$ folds, validates the model on one fold while training it on the $k-1$ other folds, and repeats this $k$ times so that each fold is used once for validation.

Architecture: The vocabulary around neural network architectures is described in the figure below. By noting $i$ the $i^{th}$ layer of the network and $j$ the $j^{th}$ hidden unit of the layer, we have $z_j^{[i]}={w_j^{[i]}}^Tx+b_j^{[i]}$, where we note $w$, $b$, $z$ the weight, bias and output respectively.
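The $k$-fold procedure described above can be sketched as follows. This is an illustrative sketch only: the least-squares linear model, the helper names `k_fold_indices` and `cross_validation_error`, the toy data, and the choice to average the per-fold errors are assumptions, not part of the original text.

```python
import numpy as np

def k_fold_indices(m, k, seed=0):
    """Split indices 0..m-1 into k shuffled, nearly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(m), k)

def cross_validation_error(X, y, k=5):
    """Average validation MSE of a least-squares linear model over k folds."""
    folds = k_fold_indices(len(y), k)
    errors = []
    for i, val_idx in enumerate(folds):
        # Train on the k-1 other folds, validate on fold i.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        theta, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)
        residuals = X[val_idx] @ theta - y[val_idx]
        errors.append(np.mean(residuals ** 2))
    return float(np.mean(errors))

# Toy data: y = 2*x + 1 plus small noise; the column of ones is the intercept.
rng = np.random.default_rng(42)
x = rng.uniform(-1, 1, size=50)
X = np.column_stack([np.ones_like(x), x])
y = 2 * x + 1 + 0.1 * rng.normal(size=50)
print(cross_validation_error(X, y, k=5))
```

Every observation is used for validation exactly once, which is why the averaged error is a less optimistic estimate than the training error.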

Error analysis: Error analysis is the process of analyzing the root cause of the difference in performance between the current and the perfect models.

Word2vec: Word2vec is a framework aimed at learning word embeddings by estimating the likelihood that a given word is surrounded by other words.

Classification metrics, written in terms of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN):
• Accuracy: $\displaystyle\frac{\textrm{TP}+\textrm{TN}}{\textrm{TP}+\textrm{TN}+\textrm{FP}+\textrm{FN}}$
• Precision: $\displaystyle\frac{\textrm{TP}}{\textrm{TP}+\textrm{FP}}$ (how accurate the positive predictions are)
• Recall (sensitivity): $\displaystyle\frac{\textrm{TP}}{\textrm{TP}+\textrm{FN}}$
• Specificity: $\displaystyle\frac{\textrm{TN}}{\textrm{TN}+\textrm{FP}}$
• F1 score: $\displaystyle\frac{2\textrm{TP}}{2\textrm{TP}+\textrm{FP}+\textrm{FN}}$ (hybrid metric useful for unbalanced classes)
• False positive rate: $\displaystyle\frac{\textrm{FP}}{\textrm{TN}+\textrm{FP}}$

Sums of squares for regression, for $m$ observations $y_i$ with mean $\overline{y}$ and predictions $f(x_i)$:
• Total sum of squares: $\displaystyle\textrm{SS}_{\textrm{tot}}=\sum_{i=1}^m(y_i-\overline{y})^2$
• Explained sum of squares: $\displaystyle\textrm{SS}_{\textrm{reg}}=\sum_{i=1}^m(f(x_i)-\overline{y})^2$
• Residual sum of squares: $\displaystyle\textrm{SS}_{\textrm{res}}=\sum_{i=1}^m(y_i-f(x_i))^2$

Regression metrics:
• Mallow's Cp: $\displaystyle\frac{\textrm{SS}_{\textrm{res}}+2(n+1)\widehat{\sigma}^2}{m}$
• Adjusted $R^2$: $\displaystyle1-\frac{(1-R^2)(m-1)}{m-n-1}$

Cross-validation types:
• $k$-fold: training on $k-1$ folds and assessment on the remaining one
• Leave-$p$-out: training on $n-p$ observations and assessment on the $p$ remaining ones

Regularization:
• Elastic Net penalty: $...+\lambda\Big[(1-\alpha)||\theta||_1+\alpha||\theta||_2^2\Big]$ (tradeoff between variable selection and small coefficients)

Diagnostics:
• Symptom of a well-fit model: training error slightly lower than test error

Vocabulary: When selecting a model, we distinguish 3 different parts of the data: the training set, the validation set, and the test set. Once the model has been chosen, it is trained on the entire dataset and tested on the unseen test set.
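The ratios of TP, TN, FP and FN listed above can be computed directly from the four confusion-matrix counts. The sketch below is illustrative: the function name, the dictionary keys (the standard names for these ratios), and the counts are assumptions chosen for the example.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "false_positive_rate": fp / (tn + fp),
    }

# Hypothetical counts for a classifier evaluated on 100 examples.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that specificity and the false positive rate always sum to 1, since both are normalized by the number of actual negatives.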

Bias/variance tradeoff: The simpler the model, the higher the bias, and the more complex the model, the higher the variance.
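One way to observe this tradeoff is to fit polynomials of increasing degree to the same noisy data and compare training and test error. The sketch below assumes a sine-shaped target, Gaussian noise, and the degrees 1, 3 and 9; all of these are illustrative choices, not taken from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(m):
    """Noisy samples of a sine curve on [-1, 1]."""
    x = rng.uniform(-1, 1, size=m)
    y = np.sin(3 * x) + 0.2 * rng.normal(size=m)
    return x, y

x_train, y_train = make_data(20)
x_test, y_test = make_data(200)

def mse(coeffs, x, y):
    """Mean squared error of a polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 3, 9):
    # Least-squares polynomial fit; a higher degree means a more complex model.
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(degree, results[degree])
```

Training error can only decrease as the degree grows, because each model contains the simpler ones. The gap between training and test error is what typically signals high variance.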

Unsupervised learning is a type of machine learning that looks for previously undetected patterns in a data set with no pre-existing labels and with a minimum of human supervision. Commonly used types of neural networks include convolutional and recurrent neural networks.
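As a small illustration of finding patterns without labels, the sketch below groups unlabeled points with k-means clustering. K-means is not mentioned in the text above; it is used here only as one common example of an unsupervised method, and the function name, parameters, and toy data are assumptions.

```python
import numpy as np

def k_means(X, k, n_iters=20, seed=0):
    """Cluster the rows of X into k groups with Lloyd's algorithm."""
    rng = np.random.default_rng(seed)
    # Initialize centroids with k distinct data points (fancy indexing copies).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each point to its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = X[labels == c].mean(axis=0)
    return labels, centroids

# Two well-separated blobs; no labels are given to the algorithm.
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(30, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(30, 2))
X = np.vstack([blob_a, blob_b])
labels, centroids = k_means(X, k=2)
print(centroids)
```

The cluster indices themselves are arbitrary; only the grouping of the points carries meaning.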


Theano is a powerful deep learning library for Python. This cheat sheet covers the most common ways to use a high-level neural network API to develop and evaluate machine learning models.