Mean squared error (MSE) is defined as

$$\mathrm{MSE}(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2,$$

where both the truth $y$ and the prediction $\hat{y}$ are 1D arrays of $n$ values. The loss is always non-negative; the larger the difference from the truth, the bigger the penalty; and it is easily differentiable.
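As a concrete example (the numbers here are arbitrary), take $y = (1, 2, 3)$ and $\hat{y} = (1, 2, 5)$:

$$\mathrm{MSE} = \frac{(1-1)^2 + (2-2)^2 + (3-5)^2}{3} = \frac{4}{3}.$$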

We are interested in the derivative with respect to the prediction $\hat{y}$. By the chain rule, the derivative for the $i$-th label is

$$\frac{\partial\, \mathrm{MSE}}{\partial \hat{y}_i} = \frac{1}{n} \cdot 2 \left(y_i - \hat{y}_i\right) \cdot (-1) = \frac{2}{n} \left(\hat{y}_i - y_i\right).$$

Therefore, the gradient is

$$\nabla_{\hat{y}}\, \mathrm{MSE} = \frac{2}{n} \left(\hat{y} - y\right).$$
Some people like to omit the constant factor $\frac{2}{n}$ (or just the $2$, by defining the loss with a $\frac{1}{2}$ in front). While optimizing, such a factor doesn’t really matter: it only rescales the gradient, so the optimum stays the same.
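Continuing the example above (same arbitrary numbers), the gradient at $\hat{y} = (1, 2, 5)$ with $y = (1, 2, 3)$ is

$$\nabla_{\hat{y}}\, \mathrm{MSE} = \frac{2}{3} \left((1, 2, 5) - (1, 2, 3)\right) = \left(0,\ 0,\ \tfrac{4}{3}\right),$$

so only the third prediction, the one that is off, receives a correcting signal.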

Implementation

Let’s jump directly into the code:

import numpy as np


class MSELoss:
    def forward(self, y, y_pred):
        # Scalar loss: mean of squared differences between truth and prediction.
        assert len(y.shape) == 1 and len(y_pred.shape) == 1, "Not a 1D array."
        assert y.shape == y_pred.shape, "Dimension mismatch"
        return np.mean(np.power(y - y_pred, 2))

    def backward(self, y, y_pred):
        # Gradient of the loss with respect to y_pred: (2 / n) * (y_pred - y).
        assert len(y.shape) == 1 and len(y_pred.shape) == 1, "Not a 1D array."
        assert y.shape == y_pred.shape, "Dimension mismatch"
        n = y.shape[0]
        return (2.0 / n) * (y_pred - y)
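
As a quick sanity check, here is a minimal usage sketch (array values chosen arbitrarily) that also compares backward against a numerical finite-difference gradient:

loss = MSELoss()
y = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])

print(loss.forward(y, y_pred))   # 1.3333... (= 4/3)
print(loss.backward(y, y_pred))  # [0. 0. 1.3333]

# Numerically approximate the gradient by bumping each prediction by eps.
eps = 1e-6
numeric = np.zeros_like(y_pred)
for i in range(y_pred.shape[0]):
    bumped = y_pred.copy()
    bumped[i] += eps
    numeric[i] = (loss.forward(y, bumped) - loss.forward(y, y_pred)) / eps

print(np.allclose(numeric, loss.backward(y, y_pred), atol=1e-4))  # True

If the analytic and numeric gradients agree, the backward pass matches the formula derived above.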