SAN

Bayes classifier

Plugging the Gaussian density into Bayes' formula and taking the logarithm leads to the discriminant score. Apply softmax to go from the discriminant scores back to the probability P(Y = k | X = x).
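A minimal sketch of the two steps above, assuming a one-dimensional Gaussian per class with a shared variance (the LDA case). The function names and the toy numbers are made up for illustration. With a shared variance the terms that do not depend on the class cancel, so the softmax of the scores gives the posterior.

```python
import math

def discriminant_scores(x, means, variance, priors):
    # delta_k(x) = x * mu_k / sigma^2 - mu_k^2 / (2 * sigma^2) + log(pi_k)
    # (log of prior times Gaussian density, class-independent terms dropped)
    return [
        x * mu / variance - mu ** 2 / (2 * variance) + math.log(pi)
        for mu, pi in zip(means, priors)
    ]

def softmax(scores):
    # Subtract the maximum first for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy setup (assumed values): two classes, equal priors, unit variance.
means = [0.0, 2.0]
variance = 1.0
priors = [0.5, 0.5]

posterior_mid = softmax(discriminant_scores(1.0, means, variance, priors))
posterior_far = softmax(discriminant_scores(3.0, means, variance, priors))
```

At x = 1.0, halfway between the two means, both classes come out equally likely. At x = 3.0 the second class dominates.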

Evaluation of classifiers

Confusion matrix

  • easy
  • Bad for skewed and imbalanced distributions

Metrics: Accuracy, Specificity, Sensitivity (Sensitivity = Recall)

Sensitivity = TP / P

1 - Specificity = FP / N
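A small sketch of these formulas computed from confusion-matrix counts. The labels and predictions are toy data, and the helper name `confusion_counts` is just for illustration. It assumes binary labels with 1 = positive and 0 = negative.

```python
def confusion_counts(y_true, y_pred):
    # Count the four cells of the binary confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

# Toy data (assumed): 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
positives = tp + fn   # P
negatives = tn + fp   # N

sensitivity = tp / positives           # TP / P
false_positive_rate = fp / negatives   # FP / N = 1 - specificity
specificity = tn / negatives
accuracy = (tp + tn) / (positives + negatives)
```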

Parameters: threshold -

ROC

  • Receiver operating characteristic
  • Area under the curve close to 1 is good
  • Steep is good
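A rough sketch of how a ROC curve comes out of sweeping the threshold: each threshold gives one point (FP / N, TP / P), and the area under the resulting curve summarizes it. The scores and labels are toy data, the function names are illustrative, and the area uses the plain trapezoid rule.

```python
def roc_points(y_true, scores):
    # One (false positive rate, true positive rate) point per threshold,
    # sweeping from the highest score down to the lowest.
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    # Trapezoid rule over consecutive ROC points.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Toy data (assumed): one positive is ranked below one negative.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

points = roc_points(y_true, scores)
auc = area_under_curve(points)

# A classifier that ranks every positive above every negative.
perfect_auc = area_under_curve(roc_points([1, 0], [0.9, 0.1]))
```

The perfect ranking climbs straight up to the top-left corner, which is the "steep" shape, and its area is 1.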

Classifier comparison

Name        | Assumptions         | K | Good when
Logistic    |                     | 2 |
Naive Bayes |                     |   | High dimension
LDA         | Reasonably Gaussian |   |
QDA         |                     |   |
kNN         | None, robust        |   |

I’ve got that poisson

A ha! That poisson on my mind

Probability that something happens k times over a fixed period of time or region of space

  • Probability I’ll get fucked k times in a month (close to 0)

Only one parameter (the rate λ), nice
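A quick sketch of the Poisson probability mass function, P(K = k) = λ^k e^(-λ) / k!, with the single rate parameter called `lam` here. The rate of 2 events per period is an assumed toy value.

```python
import math

def poisson_pmf(k, lam):
    # Probability of exactly k events when the expected count is lam.
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam = 2.0  # assumed rate: 2 events per period on average

p_zero = poisson_pmf(0, lam)                                  # no events at all
total = sum(poisson_pmf(k, lam) for k in range(60))           # should be about 1
mean = sum(k * poisson_pmf(k, lam) for k in range(60))        # should be about lam
```

The one parameter is both the mean and the variance of the distribution.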

Generalized linear models

  • Linear predictor + Link function + particular distribution
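A minimal sketch of the three ingredients, using two common pairings as examples: a log link with the Poisson distribution, and a logit link with the Bernoulli distribution (logistic regression). The coefficients are made-up numbers, not fitted values, and the function names are illustrative.

```python
import math

def linear_predictor(x, coefficients, intercept):
    # eta = b0 + b1 * x1 + ... + bp * xp
    return intercept + sum(b * xi for b, xi in zip(coefficients, x))

def poisson_mean(eta):
    # Log link: log(mu) = eta, so mu = exp(eta). Always positive.
    return math.exp(eta)

def bernoulli_mean(eta):
    # Logit link: log(mu / (1 - mu)) = eta, so mu = 1 / (1 + exp(-eta)).
    return 1.0 / (1.0 + math.exp(-eta))

# Assumed toy coefficients and one observation.
coefficients = [0.5, -0.25]
intercept = 0.0
x = [1.0, 2.0]

eta = linear_predictor(x, coefficients, intercept)
expected_count = poisson_mean(eta)     # mean of a Poisson response
success_prob = bernoulli_mean(eta)     # mean of a Bernoulli response
```

The linear predictor is the same in both cases. Only the link and the response distribution change.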

Party killer = particular

Exponential family of distributions