The confusion matrix in data mining is used to explain the Type I and Type II errors in your results. These errors are also referred to as false positives and false negatives, respectively.
A false positive is when something is predicted to occur but does not occur. A false negative is when something is predicted to not occur, but it does occur.
The common notation is:
- y for the actual values
- ŷ (y-hat) for predicted values
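As a minimal sketch of how the definitions and notation above fit together, the Python snippet below tallies the four cells of a confusion matrix from a list of actual values `y` and a list of predicted values (written `y_hat` in code). The function name `confusion_counts`, the 1/0 encoding (1 = the event occurred, 0 = it did not), and the sample data are all illustrative assumptions, not part of any particular library.

```python
def confusion_counts(y, y_hat):
    """Tally confusion-matrix cells for binary labels (1 = occurred, 0 = did not occur)."""
    if len(y) != len(y_hat):
        raise ValueError("y and y_hat must have the same length")
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for actual, predicted in zip(y, y_hat):
        if predicted == 1 and actual == 1:
            counts["TP"] += 1  # predicted to occur and did occur
        elif predicted == 1 and actual == 0:
            counts["FP"] += 1  # false positive (Type I error)
        elif predicted == 0 and actual == 0:
            counts["TN"] += 1  # predicted not to occur and did not occur
        else:
            counts["FN"] += 1  # false negative (Type II error), assuming labels are only 0 or 1
    return counts

# Made-up sample data, for illustration only
y     = [1, 0, 1, 1, 0, 0, 1, 0]
y_hat = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(y, y_hat)
print(counts)  # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
```

Each pair of actual and predicted values lands in exactly one cell, so the four counts always add up to the number of events.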
A confusion matrix in data mining can give a quick overview of how well the prediction model has performed. It is used, for example, to assess the accuracy of classification models such as Logistic Regression and K-Nearest Neighbor.
In the example above, the prediction model correctly predicted 35 events that did not occur and 50 events that did occur. The test set in this example has 100 events. From this, finding the accuracy or error rate is quite simple: 85 of the 100 predictions were correct, so the accuracy is 85% and the error rate is 15%.
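The arithmetic behind the accuracy and error rate can be sketched in a few lines of Python. The snippet uses only the figures given in the text (35 and 50 correct predictions out of 100 events); the variable names are illustrative. The text does not say how the remaining errors split between false positives and false negatives, so the code keeps them as a single total.

```python
# Figures taken from the example in the text
correct_predictions = 35 + 50  # the two groups of correctly predicted events
total_events = 100             # size of the test set

accuracy = correct_predictions / total_events
error_rate = 1 - accuracy      # share of events predicted incorrectly
incorrect_predictions = total_events - correct_predictions  # false positives + false negatives combined

print(f"accuracy = {accuracy:.0%}, error rate = {error_rate:.0%}")
# prints: accuracy = 85%, error rate = 15%
```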
So, don’t let the name confuse you!