I've been using F-scores when validating binary classifiers on imbalanced data sets. Recently I found out that using AUROC can make life a bit easier: with AUROC, I don't have to find an optimal threshold for every new classifier. The only time I optimize the threshold is with the final classifier.
- F-score: a score at a single, fixed threshold
- AUROC: a score aggregated over all thresholds
For a classifier with a given AUROC, there can be many different F-scores as the threshold varies.
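To make that concrete, here's a minimal sketch, assuming scikit-learn and a synthetic imbalanced data set (the 90/10 class split and the thresholds are just illustrative): one fitted classifier yields a single AUROC, but a different F-score at each threshold.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Illustrative imbalanced data: ~90% negatives, ~10% positives.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]  # predicted P(y = 1)

print("AUROC:", roc_auc_score(y_te, scores))  # one number, threshold-free
for t in (0.3, 0.5, 0.7):                     # many F-scores, one per threshold
    print(f"F1 @ threshold {t}:", f1_score(y_te, (scores >= t).astype(int)))
```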
So:
- first, find the classifier with the largest AUROC,
- and then, for that classifier only, find the threshold that yields the largest F-score (see the sketch below).
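A hedged sketch of this two-step procedure, again assuming scikit-learn; the two candidate models are arbitrary placeholders. For brevity the threshold is tuned on a single held-out split; in practice you'd keep a separate test set to report the final score.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "forest": RandomForestClassifier(random_state=0),
}

# Step 1: threshold-free model selection via cross-validated AUROC.
best_name = max(
    candidates,
    key=lambda n: cross_val_score(candidates[n], X_tr, y_tr,
                                  scoring="roc_auc", cv=5).mean(),
)
best = candidates[best_name].fit(X_tr, y_tr)

# Step 2: tune the threshold for the winning model only, on held-out data.
# precision_recall_curve evaluates every candidate threshold in one pass;
# F1 = 2 * P * R / (P + R).
prec, rec, thr = precision_recall_curve(y_val, best.predict_proba(X_val)[:, 1])
f1 = 2 * prec * rec / (prec + rec + 1e-12)  # small eps guards against 0/0
best_threshold = thr[np.argmax(f1[:-1])]    # last (P, R) point has no threshold
print(best_name, "best F1 threshold:", best_threshold)
```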
- Cross Validated: http://stats.stackexchange.com/questions/7207/roc-vs-precision-and-recall-curves