- Overfitting:
- Model memorises the noise in the training data as well as the underlying pattern.
- Model performs well on the training data but poorly on unseen data.
- High variance is responsible for this error, because the model also captures the noise.
- Diagnosis: cross-validated error on the test set is much higher than the error on the training set (see the sketch after this list).
- Possible remedy: decrease model complexity, gather more data.
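A minimal sketch of this diagnosis (assuming scikit-learn and a synthetic noisy-sine dataset, neither of which comes from the notes above): an overly flexible polynomial model whose training error is far lower than its cross-validated error.

```python
# Illustrative sketch, not the notes' own code: diagnose overfitting by
# comparing training error with cross-validated error.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(scale=0.3, size=30)  # signal + noise

# Degree-15 polynomial: flexible enough to memorise the noise (assumed choice)
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X, y)

train_mse = mean_squared_error(y, model.predict(X))
cv_mse = -cross_val_score(model, X, y, cv=5,
                          scoring="neg_mean_squared_error").mean()
print(f"train MSE: {train_mse:.3f}  cross-val MSE: {cv_mse:.3f}")
# cross-val MSE >> train MSE  ->  overfitting
```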
- Underfitting:
- Model is too simple to capture the underlying pattern in the data.
- Model performs poorly on both the training data and unseen data.
- Model is not flexible enough to approximate the true relationship between features and target.
- High bias is responsible for this error.
- Diagnosis: cross-validated errors on the train and test sets are roughly equal, but both are undesirably high (see the sketch after this list).
- Possible remedy: increase model complexity, gather more features.
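The same assumed synthetic setup, now with a model that is too simple for the data: both the training error and the cross-validated error come out high and roughly equal, which is the underfitting signature described above.

```python
# Illustrative sketch, not the notes' own code: a plain linear model on
# nonlinear data shows the underfitting pattern (both errors high and close).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(scale=0.3, size=30)

model = LinearRegression()  # too simple for a sine-shaped pattern
model.fit(X, y)

train_mse = mean_squared_error(y, model.predict(X))
cv_mse = -cross_val_score(model, X, y, cv=5,
                          scoring="neg_mean_squared_error").mean()
print(f"train MSE: {train_mse:.3f}  cross-val MSE: {cv_mse:.3f}")
# both errors high and roughly equal  ->  underfitting
```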
- Bias-Variance trade-off:
- Generalization error = bias^2 + variance + irreducible error (noise)
- bias = error term that measures how far, on average, the predicted values are from the true values
- variance = error term that measures how much the predictions vary across different training sets
- When model complexity increases, variance increases and bias decreases
- When model complexity decreases, variance decreases and bias increases
- The sweet spot is the model complexity that minimises the generalization error, giving the best-generalising model (see the sketch below).
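A sketch of the trade-off under the same assumptions: sweep the polynomial degree (model complexity) and watch the cross-validated error fall, bottom out at the sweet spot, then rise again as variance takes over.

```python
# Illustrative sketch, not the notes' own code: cross-validated error as a
# function of model complexity (polynomial degree) on assumed synthetic data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 60)).reshape(-1, 1)
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(scale=0.3, size=60)

for degree in [1, 2, 3, 5, 9, 15]:
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    cv_mse = -cross_val_score(model, X, y, cv=5,
                              scoring="neg_mean_squared_error").mean()
    print(f"degree {degree:2d}: cross-val MSE = {cv_mse:.3f}")
# Error is high at low degrees (high bias), lowest near the right complexity
# (the sweet spot), and climbs again at high degrees (high variance).
```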