K-Nearest-Neighbor-Algorithm

K-Nearest Neighbors Algorithm in Machine Learning

K-Nearest Neighbors algorithm (KNN)

K-Nearest Neighbors algorithm (KNN) is a very important supervised machine learning algorithm and one should start from this algorithm. It is easy to understand compare to other algorithms and does not involve complex mathematical concepts.

In this post, I will explain k-Nearest Neighbors algorithm using Irish flowers data set.

Sources- https://amybeisel.medium.com/knn-k-nearest-neighbors-classifier-from-scratch-2087561010dc

 

K-Nearest Neighbours Unknown Label

From the above two, one figure and table  you can understand  Irish dataset which has four features sepal length, sepal width, petal length and petal width.

Iris-Flowers Dataset

In original dataset there are 150 instances which has four features  sepal length, sepal width, petal length and petal width and three features Iris-setosa, Iris-versicolor and Irish-virginica. For the sake of simplicity, I have only take only ten instances in which  nine are labeled  and one is not labeled. Nine instances will be used for training and one will be used for testing.

Working of K-Nearest Neighbors  Algorithm

From the data set you can see nine instances having labels and the tenth  has not a label. To train the model, distance will be calculated from 10 instance  to all instances, and the value of k should be chosen.

 

KNN Algorithm

If we take k=3, you see the minimum three distances are 0.424 (Iris-Setosa), 0.141(Iris-Setosa) and again 0.721(Iris-Setosa). So, if k=3 majority of Iris-Setosa instances are nearest to instance number ten.

So, instance 10 will have label Iris-Setosa because 3 Iris-Setosa instances are nearest to instance  number 10.

 

K Nearest Neighbors Algorithm

Python’s Code for K-Nearest Neighbors Algorithm


from sklearn.neighbors import KNeighborsClassifier
i1=[4.6,3.4,1.4,0.3]
i2=[5,3.4,1.5,0.2]
i3=[4.4,2.9,1.4,0.2]
i4=[7,3.2,4.7,1.4]
i5=[6.4,3.2,4.5,1.5]
i6=[6.9,3.1,4.9,1.5]
i7=[6.3,3.3,6,2.5]
i8=[5.8,2.7,5.1,1.9]
i9=[7.1,3,5.9,2.1]
X_train=[i1,i2,i3, i4,i5,i6, i7,i8,i9]
irst, irver,irvirg ="Iris-setosa", "Iris-versicolor", "Iris-virginica"
y_train=[irst,irst ,irst ,irver , irver,irver,irvirg,irvirg,irvirg]
modal= KNeighborsClassifier(n_neighbors =3, p=2)
modal.fit(X_train, y_train)
X_pred=[[5,3.3,1.4,0.2]]
Unknown= modal.predict(X_pred)
print("Predicted=",Unknown)

In the above code from
sklearn.neighbors import KNeighborsClassifier imports KNeighborsClassifier class which is used to create instance of KNN model. Nine lists i1,i2,…,i9 are created which are instances without labels.
Therefore, X_train nested list is created which is basically a table having all features except tenth. Furthermore, y_train has all labels corresponding to X_train instances. KNeighborsClassifier(n_neighbors =3, p=2) means number of neighbors are 3 and p=2 means Euclidean distance if the value of p=1 it is manhattan distance.
Furthermore, fit() function takes two arguments X_train, y_train and it will train the model. predict() function will predict the label. Finally, print function will print the label.

Watch the video :

Leave a Comment

Your email address will not be published. Required fields are marked *

©Postnetwork-All rights reserved.