The K-Nearest Neighbor (KNN) classifier algorithm is straightforward and easy to understand. Assume you have a dataset in which you have already identified two categories of data. For the sake of simplicity, let’s say each data point is described by two features, X1 and X2. The question is: which category does a new data point belong to? Does it best fit the first category, or the second?
This is where the K-Nearest Neighbor algorithm comes in. It will help us classify the new data point.
KNN Classifier Algorithm Steps
Typically, the number of neighbors chosen is 5, and the Euclidean distance formula is most often used. Other numbers of neighbors and other distance formulas can also be used; it’s up to the person building the model to decide.
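These defaults can be sketched in a few lines of Python. This is a minimal from-scratch illustration, not a production implementation; the data points and labels below are made up for the example.

```python
import math
from collections import Counter

def euclidean(p, q):
    # Straight-line distance between two points (X1, X2)
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def knn_classify(points, labels, new_point, k=5):
    # Sort the training points by distance to the new point,
    # take the k nearest, and let them vote by majority label.
    ranked = sorted(zip(points, labels),
                    key=lambda pair: euclidean(pair[0], new_point))
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Illustrative data: three "green" points and four "red" points
green = [(2.5, 3.0), (3.0, 2.5), (0.0, 0.0)]
red = [(3.6, 3.0), (3.0, 3.6), (3.5, 3.5), (6.0, 6.0)]
points = green + red
labels = ["green"] * 3 + ["red"] * 4

print(knn_classify(points, labels, (3.0, 3.0)))  # prints "red"
```

Swapping in a different `k` or a different distance function only requires changing the `k` argument or the `euclidean` helper.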
As you can see in our example, of the five nearest neighbors, the new data point is closest to two points in the green category and three points in the red category. Having exhausted the five neighbors we set for the algorithm, we classify the new data point in the red category.
While the K-Nearest Neighbor algorithm is based on a simple concept, it can produce surprisingly accurate predictions.
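In practice you would rarely hand-roll KNN; a library such as scikit-learn provides it directly. A sketch of the same two-feature example using `KNeighborsClassifier` (the data is illustrative, and scikit-learn is assumed to be installed):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Illustrative training data with two features, X1 and X2
X = np.array([[2.5, 3.0], [3.0, 2.5], [0.0, 0.0],              # green
              [3.6, 3.0], [3.0, 3.6], [3.5, 3.5], [6.0, 6.0]]) # red
y = np.array(["green"] * 3 + ["red"] * 4)

# n_neighbors=5 and the default Minkowski metric with p=2
# (i.e., Euclidean distance) match the typical choices above
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X, y)

print(model.predict([[3.0, 3.0]]))  # majority of the 5 nearest neighbors
```

Changing `n_neighbors` or the `metric` parameter lets you experiment with the choices discussed earlier without rewriting any distance logic.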