"Pattern Recognition and Intelligent Computing": PCA-Based Template Matching

  • January 16, 2020
  • Notes

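Algorithm steps:
  1. Collect all training samples from every class into a matrix X, and take the sample to be classified
  2. Compute the covariance matrix S
  3. Based on the eigenvalues of S, choose a suitable projection matrix C (its columns are the leading eigenvectors)
  4. Use the matrix C to reduce the dimensionality
  5. Apply template matching to perform multi-class classification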
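
The steps above can be written compactly as follows. This is only a sketch under assumed notation that does not appear in the original notes: $x_i$ is the i-th training sample as a column vector, $m$ the number of samples, $n$ the number of features, $p$ the fraction of variance to retain, and $z_i$ the projected training samples with labels $y_i$.

```latex
\begin{aligned}
\bar{x} &= \frac{1}{m}\sum_{i=1}^{m} x_i, \\
S &= \frac{1}{m-1}\sum_{i=1}^{m} (x_i-\bar{x})(x_i-\bar{x})^{\mathsf T}, \\
S u_j &= \lambda_j u_j, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge 0, \\
C &= [\,u_1, \dots, u_k\,], \qquad
k = \min\Bigl\{ r : \frac{\sum_{j=1}^{r}\lambda_j}{\sum_{j=1}^{n}\lambda_j} > p \Bigr\}, \\
z &= C^{\mathsf T}(x-\bar{x}), \\
\hat{y} &= y_{i^*}, \qquad i^* = \arg\min_{i} \lVert z - z_i \rVert^2 .
\end{aligned}
```

The factor $1/(m-1)$ only rescales the eigenvalues, so dropping it changes neither the eigenvectors nor the eigenvalue ratios used to pick $k$.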
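Implementation

PCA dimensionality reduction

    import numpy as np

    def pca(x, k=0, percent=0.9):
        """
        :function: principal component analysis
        :param x: data matrix of shape m*n; m is the number of samples, n the number of features
        :param k: number of dimensions to keep; if 0, it is chosen from percent
        :param percent: fraction of the total variance to retain (used when k is 0)
        :return: n*k matrix whose columns are the leading eigenvectors
        """
        m, n = x.shape
        mean = np.mean(x, axis=0).reshape(1, n)
        x_norm = (x - mean).T  # transpose to n*m so that features index the rows
        # n*n scatter matrix, proportional to the covariance matrix S (same eigenvectors)
        cov = np.dot(x_norm, x_norm.T)
        # eigh is used because cov is symmetric, so the results stay real
        eigval, eigvec = np.linalg.eigh(cov)
        index = np.argsort(-eigval)
        eigval_sort = eigval[index]
        eigvec_sort = eigvec[:, index]  # eigenvectors are the columns, so reorder columns
        if k > 0:
            return eigvec_sort[:, :k]
        eigval_ratio = eigval_sort / np.sum(eigval_sort)
        total = 0.0
        for i in range(eigval_ratio.shape[0]):
            total += eigval_ratio[i]
            if total > percent:
                return eigvec_sort[:, :i + 1]
        return eigvec_sort  # percent was never exceeded: keep every component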
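
NumPy's eigen-decomposition routines return the eigenvectors as the columns of the result, so after sorting the eigenvalues it is the columns that have to be reordered. The sketch below checks this on a small symmetric matrix; the matrix values and the names `A`, `order`, and `wrong` are made up for illustration.

```python
import numpy as np

# Small symmetric matrix standing in for a covariance matrix (illustrative values).
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])

# eigh is meant for symmetric matrices: real eigenvalues in ascending order,
# with the matching eigenvectors stored as the COLUMNS of eigvec.
eigval, eigvec = np.linalg.eigh(A)

# Sort from largest to smallest eigenvalue and reorder the columns to match.
order = np.argsort(-eigval)
eigval_sorted = eigval[order]
eigvec_sorted = eigvec[:, order]

# Each column v_j should satisfy A v_j = lambda_j v_j.
residual = A @ eigvec_sorted - eigvec_sorted * eigval_sorted
print(np.abs(residual).max())  # close to zero

# Reordering ROWS instead breaks the pairing between eigenvalues and eigenvectors.
wrong = eigvec[order]
print(np.allclose(A @ wrong, wrong * eigval_sorted))  # False
```

Indexing with `eigvec[index]` permutes the coordinates inside each eigenvector rather than the eigenvectors themselves, which is why the second check fails.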

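Template matching

    def neartemplet(x_train, y_train, sample):
        """
        :function: template matching (nearest template)
        :param x_train: training set, M*N; M is the number of samples, N the number of features
        :param y_train: training labels, length M
        :param sample: the sample to classify
        :return: the predicted class
        """
        n_train = x_train.shape[0]
        dis = []
        for i in range(n_train):
            # squared Euclidean distance between the sample and the i-th template
            dis.append(np.sum((sample - x_train[i, :]) ** 2))
        minIndx = np.argmin(dis)  # index of the closest template
        return y_train[minIndx]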
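
The distance loop can also be expressed with NumPy broadcasting. The following is a minimal sketch of the same nearest-template rule; the function name `nearest_template` and the toy data are hypothetical and not part of the original notes.

```python
import numpy as np

def nearest_template(x_train, y_train, sample):
    """Return the label of the training row closest to `sample` (squared Euclidean)."""
    x_train = np.asarray(x_train, dtype=float)
    sample = np.asarray(sample, dtype=float).reshape(1, -1)  # accept (n,) or (1, n)
    # Broadcasting subtracts the sample from every training row at once.
    dist = np.sum((x_train - sample) ** 2, axis=1)
    return y_train[np.argmin(dist)]

# Toy templates: two points per class in a 2-D feature space (made-up values).
templates = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.8]])
labels = np.array([0, 0, 1, 1])
print(nearest_template(templates, labels, np.array([4.5, 5.2])))    # 1
print(nearest_template(templates, labels, np.array([[0.1, 0.3]])))  # 0
```

Reshaping the sample to a single row lets the function take either a flat vector or a 1*n row, which is the shape a sample has after subtracting a 1*n mean.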

Splitting the dataset

    import random

    def train_test_split(x, y, ratio=3):
        """
        :function: split the dataset into a training set and a test set
        :param x: m*n data; m is the number of samples, n the number of features
        :param y: labels
        :param ratio: the train:test ratio is ratio:1 (3:1 by default)
        :return: x_train, y_train, x_test, y_test
        """
        n_samples = x.shape[0]
        n_train = int(n_samples * ratio / (1 + ratio))
        # draw the training indices without replacement
        train_id = random.sample(range(n_samples), n_train)
        x_train = x[train_id, :]
        y_train = y[train_id]
        # every row that was not drawn goes to the test set
        x_test = np.delete(x, train_id, axis=0)
        y_test = np.delete(y, train_id, axis=0)
        return x_train, y_train, x_test, y_test
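
Because the split is random, repeated runs classify different samples. If reproducible runs are wanted, one option is a seeded permutation. This is a sketch of that variant, not the author's function; the name `split_with_seed` and the `seed` parameter are assumptions introduced here.

```python
import numpy as np

def split_with_seed(x, y, ratio=3, seed=0):
    """Shuffle once with a seeded generator, then cut at ratio:1 (train:test)."""
    rng = np.random.default_rng(seed)
    n_samples = x.shape[0]
    n_train = int(n_samples * ratio / (1 + ratio))
    perm = rng.permutation(n_samples)
    train_id, test_id = perm[:n_train], perm[n_train:]
    return x[train_id], y[train_id], x[test_id], y[test_id]

# Toy data: 20 samples with 2 features; the labels double as sample ids.
x = np.arange(40, dtype=float).reshape(20, 2)
y = np.arange(20)
x_train, y_train, x_test, y_test = split_with_seed(x, y)
print(x_train.shape, x_test.shape)  # (15, 2) (5, 2)
```

Calling the function again with the same `seed` returns the same partition, which makes a reported result easier to reproduce.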

Test code

    from sklearn import datasets
    from Include.chapter3 import function
    import numpy as np

    # load the data
    digits = datasets.load_digits()
    x, y = digits.data, digits.target

    # split the dataset and pick one random test sample
    x_train, y_train, x_test, y_test = function.train_test_split(x, y)
    testId = np.random.randint(0, x_test.shape[0])
    sample = x_test[testId, :]

    eigVec = function.pca(x_train)
    # remove the mean, estimated from the training set only (the same mean pca uses)
    mean = np.mean(x_train, axis=0).reshape((1, 64))
    x_train = x_train - mean
    sample = sample - mean
    # reduce the dimensionality
    x_train = np.dot(x_train, eigVec)
    sample = np.dot(sample, eigVec)
    # template matching
    ans = function.neartemplet(x_train, y_train, sample)
    print(ans == y_test[testId])

Result
True
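
A single randomly chosen sample only yields True or False, so the accuracy over the whole test set is usually more informative. The sketch below runs the same pipeline (PCA fitted on the training set, then nearest-template matching) on every test sample. To depend only on NumPy it uses synthetic Gaussian clusters rather than the digits dataset, so it says nothing about the accuracy on digits; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 3 classes, 10 features, Gaussian clusters.
n_per_class, n_features = 60, 10
centers = rng.normal(scale=5.0, size=(3, n_features))
x = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat(np.arange(3), n_per_class)

# 3:1 train/test split.
perm = rng.permutation(x.shape[0])
n_train = int(x.shape[0] * 3 / 4)
train_id, test_id = perm[:n_train], perm[n_train:]
x_train, y_train = x[train_id], y[train_id]
x_test, y_test = x[test_id], y[test_id]

# PCA fitted on the training set only, keeping just over 90% of the variance.
mean = x_train.mean(axis=0)
eigval, eigvec = np.linalg.eigh(np.cov(x_train - mean, rowvar=False))
order = np.argsort(-eigval)
eigval, eigvec = eigval[order], eigvec[:, order]
cumulative = np.cumsum(eigval) / eigval.sum()
k = int(np.argmax(cumulative > 0.9)) + 1  # first index whose cumulative ratio exceeds 0.9
C = eigvec[:, :k]

# Project both sets with the training mean and the same matrix C.
z_train = (x_train - mean) @ C
z_test = (x_test - mean) @ C

# Nearest-template prediction for every test sample at once.
d2 = ((z_test[:, None, :] - z_train[None, :, :]) ** 2).sum(axis=2)
pred = y_train[np.argmin(d2, axis=1)]
accuracy = np.mean(pred == y_test)
print(f"kept {k} components, accuracy = {accuracy:.3f}")
```

Swapping the synthetic arrays for `digits.data` and `digits.target` would give the corresponding figure for the digits dataset.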