Empirical risk minimization

Given a training set $T_{m}$ of $m$ samples, the goal is to learn a predictor that minimizes the expected (true) risk.

However, we can only measure the empirical risk, because the underlying data distribution is unknown. So we gather enough samples and find a predictor that minimizes the empirical risk.
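The two risks can be written out as follows. The notation is an assumption made for this note, not taken from a particular source: $h$ is the predictor, $\ell$ the loss function, $p$ the unknown data distribution, and $T_{m} = \{(x_i, y_i)\}_{i=1}^{m}$ is drawn i.i.d. from $p$.

$$
R(h) = \mathbb{E}_{(x,y)\sim p}\big[\ell(y, h(x))\big],
\qquad
R_{T_{m}}(h) = \frac{1}{m}\sum_{i=1}^{m} \ell(y_i, h(x_i)).
$$

ERM then picks the predictor with the smallest $R_{T_{m}}(h)$ within the chosen class of predictors.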

The number of samples needed for a given loss function, precision, and confidence level can be estimated using Hoeffding's inequality.
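As a rough illustration, the two-sided Hoeffding bound $P(|R_{\text{emp}} - R_{\text{true}}| \ge \varepsilon) \le 2\exp(-2m\varepsilon^2/(b-a)^2)$ can be solved for $m$. The sketch below assumes a single fixed predictor (not one selected by looking at the data), i.i.d. samples, and a loss bounded in $[a, b]$. The function name and parameters are hypothetical, chosen only for this example.

```python
import math

def hoeffding_sample_size(epsilon, delta, loss_min=0.0, loss_max=1.0):
    """Smallest m such that, for ONE fixed predictor with loss in
    [loss_min, loss_max], the empirical risk deviates from the true risk
    by at least epsilon with probability at most delta.

    Derived from 2 * exp(-2 * m * epsilon**2 / (b - a)**2) <= delta.
    """
    if epsilon <= 0 or not (0 < delta < 1):
        raise ValueError("need epsilon > 0 and 0 < delta < 1")
    loss_range = loss_max - loss_min
    m = (loss_range ** 2) * math.log(2.0 / delta) / (2.0 * epsilon ** 2)
    return math.ceil(m)

# 0/1 loss, precision 0.05, confidence 95 %
m_needed = hoeffding_sample_size(epsilon=0.05, delta=0.05)
print(m_needed)
```

Halving $\varepsilon$ roughly quadruples the required number of samples, while tightening $\delta$ costs only logarithmically.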

However, ERM does not tell us whether the chosen method is right for a given problem. The theory only gives us hope that the empirical risk is a good enough substitute for the true risk. If the true risk is high (i.e. the method is not suitable), we can train as much as we want, but minimizing the empirical risk alone will never give us a good predictor. Overfitting means that the empirical risk is much lower than the true risk.
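The gap between the two risks can be seen in a small toy experiment. This is only a sketch under made-up assumptions: the labels are fair coin flips independent of the input, and the "learner" is a lookup table that memorizes the training set. All names are hypothetical.

```python
import random

def make_sample(rng, m):
    """Draw m pairs (x, y): x uniform in [0, 1), y a fair coin independent of x."""
    return [(rng.random(), rng.randint(0, 1)) for _ in range(m)]

def fit_memorizer(train):
    """ERM over lookup tables: memorize training labels, predict 0 elsewhere."""
    table = dict(train)
    return lambda x: table.get(x, 0)

def zero_one_risk(predictor, sample):
    """Average 0/1 loss of the predictor on the given sample."""
    return sum(predictor(x) != y for x, y in sample) / len(sample)

rng = random.Random(0)             # fixed seed for reproducibility
train = make_sample(rng, 200)
fresh = make_sample(rng, 20000)    # large fresh sample approximates the true risk

h = fit_memorizer(train)
empirical_risk = zero_one_risk(h, train)
estimated_true_risk = zero_one_risk(h, fresh)
print(empirical_risk, estimated_true_risk)
```

The memorizer drives the empirical risk to zero, yet on fresh data it does no better than guessing, because there is nothing to learn from these inputs. More training samples would not help here. The choice of learner is what matters.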

Takeaway: domain knowledge is important when selecting a learner.

Vojtěch Tóth
