ML Glossary
Basics
Linear Regression
Introduction
Simple regression
Making predictions
Cost function
Gradient descent
Training
Model evaluation
Summary
Multivariable regression
Growing complexity
Normalization
Making predictions
Initialize weights
Cost function
Gradient descent
Simplifying with matrices
Bias term
Model evaluation
Gradient Descent
Introduction
Learning rate
Cost function
Step-by-step
Logistic Regression
Introduction
Comparison to linear regression
Types of logistic regression
Binary logistic regression
Sigmoid activation
Decision boundary
Making predictions
Cost function
Gradient descent
Mapping probabilities to classes
Training
Model evaluation
Multiclass logistic regression
Procedure
Softmax activation
Scikit-Learn example
Glossary
Math
Calculus
Introduction
Derivatives
Geometric definition
Taking the derivative
Step-by-step
Machine learning use cases
Chain rule
How It Works
Step-by-step
Multiple functions
Gradients
Partial derivatives
Step-by-step
Directional derivatives
Useful properties
Integrals
Computing integrals
Applications of integration
Computing probabilities
Expected value
Variance
Linear Algebra
Vectors
Notation
Vectors in geometry
Scalar operations
Elementwise operations
Dot product
Hadamard product
Vector fields
Matrices
Dimensions
Scalar operations
Elementwise operations
Hadamard product
Matrix transpose
Matrix multiplication
Test yourself
Numpy
Dot product
Broadcasting
Probability (TODO)
Links
Screenshots
License
Statistics (TODO)
Notation
Algebra
Calculus
Linear algebra
Probability
Set theory
Statistics
Neural Networks
Concepts
Neural Network
Neuron
Synapse
Weights
Bias
Layers
Weighted Input
Activation Functions
Loss Functions
Optimization Algorithms
Gradient Accumulation
Forward propagation
Simple Network
Steps
Code
Larger Network
Architecture
Weight Initialization
Bias Terms
Working with Matrices
Dynamic Resizing
Refactoring Our Code
Final Result
Backpropagation
Chain rule refresher
Applying the chain rule
Saving work with memoization
Code example
Activation Functions
Linear
ELU
ReLU
LeakyReLU
Sigmoid
Tanh
Softmax
Layers
BatchNorm
Convolution
Dropout
Pooling
Fully-connected/Linear
RNN
GRU
LSTM
Loss Functions
Cross-Entropy
Hinge
Huber
Kullback-Leibler
RMSE
MAE (L1)
MSE (L2)
Optimizers
Adagrad
Adadelta
Adam
Conjugate Gradients
BFGS
Momentum
Nesterov Momentum
Newton’s Method
RMSProp
SGD
Regularization
Data Augmentation
Dropout
Early Stopping
Ensembling
Injecting Noise
L1 Regularization
L2 Regularization
Architectures
Autoencoder
CNN
GAN
MLP
RNN
VAE
Algorithms (TODO)
Classification
Bayesian
Decision Trees
K-Nearest Neighbor
Logistic Regression
Random Forests
Boosting
Support Vector Machine
Clustering
Centroid
Density
Distribution
Hierarchical
K-Means
Mean shift
Regression
Ordinary Least Squares
Polynomial
Lasso
Ridge
Stepwise
Reinforcement Learning
Note on Terminology
Exploration vs. Exploitation
MDPs and Tabular methods
Monte Carlo methods
Temporal-Difference Learning
Planning
On-Policy vs. Off-Policy Learning
Model-Free vs. Model-Based Approaches
Imitation Learning
Q-Learning
Deep Q-Learning
Examples of Applications
Links
Resources
Datasets
Libraries
Papers
Other
Contributing
How to contribute
Index