Prerequisites
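numpy.argsort(a, axis=-1, kind='quicksort', order=None)
Returns the indices that would sort the array values in ascending order.
Parameters:
a: the array to sort
axis: the axis along which to sort
kind: the sorting algorithm; options include quicksort, mergesort, and heapsort

For a one-dimensional array:
>>> import numpy as np
>>> x = np.array([1, 4, 3, -1, 5, 9])
>>> x.argsort()
array([3, 0, 2, 1, 4, 5])

numpy.tile(array, (dim))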
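Builds a new array by repeating array along each axis the number of times given by dim, where dim is a tuple. For example, tiling a vector of length n with (m, 1) gives an m-by-n array in which every row is a copy of the vector.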
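A minimal sketch of how numpy.tile is typically used to prepare a distance computation. The names point and data and their values are made up for illustration. The last line shows that NumPy broadcasting gives the same differences without an explicit tile.

```python
import numpy as np

# Hypothetical query point and data set, for illustration only.
point = np.array([1.0, 2.0])
data = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 6.0]])

# Repeat the point once per data row: shape (2,) becomes (3, 2).
tiled = np.tile(point, (data.shape[0], 1))
diff_tiled = tiled - data

# Broadcasting stretches the point across the rows implicitly.
diff_broadcast = point - data

print(tiled.shape)
print(np.array_equal(diff_tiled, diff_broadcast))
```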
Outline of the k-nearest neighbors algorithm
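import operator
import numpy as np

# The original shows only the function body; the name and signature here are illustrative.
def knn_classify(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    # Euclidean distance from inX to every row of dataSet
    diffMat = np.tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat**2
    sqDistances = sqDiffMat.sum(axis=1)
    distances = sqDistances**0.5
    # Indices of the samples, nearest first
    sortedDistIndicies = distances.argsort()
    # Count the labels of the k nearest samples
    classCount = {}
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
    # Return the label with the most votes
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]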
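To see the flow above run end to end, here is a compact sketch on a toy data set. It is a variant, not the code above: it uses broadcasting instead of np.tile and collections.Counter instead of a hand-built dictionary. The function name knn_vote, the sample points, and the labels are all invented for illustration.

```python
from collections import Counter

import numpy as np


def knn_vote(query, samples, sample_labels, k):
    """Return the majority label among the k samples nearest to query."""
    # Euclidean distance from the query to every sample (via broadcasting).
    distances = np.sqrt(((samples - query) ** 2).sum(axis=1))
    # Indices of the k closest samples.
    nearest = distances.argsort()[:k]
    # Tally the labels of those samples and pick the most common one.
    votes = Counter(sample_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Made-up toy data: two points near (1, 1) labeled "A", two near (0, 0) labeled "B".
samples = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
sample_labels = ["A", "A", "B", "B"]

prediction = knn_vote(np.array([0.0, 0.2]), samples, sample_labels, k=3)
print(prediction)  # B
```

With k=3, the three nearest samples to (0, 0.2) carry the labels B, B, and A, so the vote goes to B.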
