"Pattern Recognition and Intelligent Computing": Euclidean Distance Classification Based on Class Centers
- January 16, 2020
- Notes
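Algorithm flow
- For each class, take the training samples X that belong to it
- Compute the class center (the mean of those samples)
- Use the Euclidean distance measure to compute the distance from the sample under test to each class center
- The class whose center is closest is assigned to the sample under test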
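The steps above can be sketched compactly with NumPy. This is only an illustrative sketch, not the implementation used later in these notes: the function name `nearest_center` and the toy arrays `x_toy` and `y_toy` are made up for this example.

```python
import numpy as np

def nearest_center(x_train, y_train, sample):
    """Return the label whose class center is closest to `sample`."""
    labels = np.unique(y_train)
    # One center per class: the mean of that class's training samples.
    centers = np.array([x_train[y_train == c].mean(axis=0) for c in labels])
    # Euclidean distance from the sample to every class center.
    dists = np.linalg.norm(centers - sample, axis=1)
    return labels[np.argmin(dists)]

# Toy data (hypothetical): two well-separated 2-D classes.
x_toy = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [4.0, 5.0]])
y_toy = np.array([0, 0, 1, 1])

pred = nearest_center(x_toy, y_toy, np.array([3.5, 4.0]))
print(pred)  # the sample lies next to the center of class 1
```

Here the class centers are (0, 0.5) and (4, 4.5), so a sample at (3.5, 4.0) falls to class 1.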
Algorithm implementation
Computing the distance
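import numpy as np

def euclid(x_train, y_train, sample):
    """
    :function: template matching based on class centers
    :param x_train: training set, M*N (M samples, N features)
    :param y_train: training labels, 1*M
    :param sample: sample to be classified
    :return: the predicted class label
    """
    disMin = np.inf
    label = 0
    # Unique class labels
    target = np.unique(y_train)
    for i in target:
        # Gather the indices of all training samples in class i
        trainId = [j for j, y in enumerate(y_train) if y == i]
        train = x_train[trainId, :]
        trainMean = np.mean(train, axis=0)
        # Squared Euclidean distance; skipping the square root
        # does not change which class center is closest
        dis = np.dot((sample - trainMean), (sample - trainMean).T)
        if disMin > dis:
            disMin = dis
            label = i
    return label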
Splitting the dataset
import random
import numpy as np

def train_test_split(x, y, ratio=3):
    """
    :function: split the dataset into a training set and a test set
    :param x: m*n array (m samples, n features)
    :param y: labels
    :param ratio: split ratio, train:test = ratio:1 (default 3:1)
    :return: x_train y_train x_test y_test
    """
    n_samples, n_train = x.shape[0], int(x.shape[0] * ratio / (1 + ratio))
    # Draw the training indices at random, without replacement
    train_id = random.sample(range(0, n_samples), n_train)
    x_train = x[train_id, :]
    y_train = y[train_id]
    # Everything not drawn for training becomes the test set
    x_test = np.delete(x, train_id, axis=0)
    y_test = np.delete(y, train_id, axis=0)
    return x_train, y_train, x_test, y_test
Test
from sklearn import datasets
from Include.chapter3 import function
import numpy as np

# Load the data
digits = datasets.load_digits()
x, y = digits.data, digits.target
# Split the dataset
x_train, y_train, x_test, y_test = function.train_test_split(x, y)
# Pick one test sample at random and classify it
testId = np.random.randint(0, x_test.shape[0])
sample = x_test[testId, :]
ans = function.euclid(x_train, y_train, sample)
print(ans == y_test[testId])
Result
True
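A check on one random sample only shows a single prediction. As a rough sketch of how accuracy over a whole test set could be measured, the block below classifies every test row at once. The helper name `predict_all` and the synthetic two-class data are assumptions made for this example; it does not use the digits set, so its accuracy says nothing about the result above.

```python
import numpy as np

def predict_all(x_train, y_train, x_test):
    """Classify every row of x_test by its nearest class center."""
    labels = np.unique(y_train)
    centers = np.array([x_train[y_train == c].mean(axis=0) for c in labels])
    # Squared distances of shape (n_test, n_classes), via broadcasting.
    d2 = ((x_test[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return labels[np.argmin(d2, axis=1)]

# Synthetic stand-in data (hypothetical): two Gaussian clusters in 4-D.
rng = np.random.default_rng(0)
x0 = rng.normal(loc=0.0, scale=1.0, size=(100, 4))
x1 = rng.normal(loc=5.0, scale=1.0, size=(100, 4))
x = np.vstack([x0, x1])
y = np.array([0] * 100 + [1] * 100)

# 3:1 train/test split by shuffled indices.
idx = rng.permutation(len(x))
train_idx, test_idx = idx[:150], idx[150:]

y_pred = predict_all(x[train_idx], y[train_idx], x[test_idx])
accuracy = np.mean(y_pred == y[test_idx])
print(f"accuracy: {accuracy:.3f}")
```

The same `predict_all` call should accept the `x_train`, `y_train`, and `x_test` arrays produced by the split in the test above, since it only relies on NumPy arrays of the same shapes.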