【連載21】Selective Search-3.11

2020 年 3 月 12 日
筆記

公眾號後台回復「python「，立刻領取100本機器學習必備Python電子書

目標檢測

目標檢測的發展歷程大致如下：

Selective Search

對於目標識別任務，比如判斷一張圖片中有沒有車、是什麼車，一般需要解決兩個問題：目標檢測、目標識別。而目標檢測任務中通常需要先通過某種方法做圖像分割，事先得到候選框；直觀的做法是：給定窗口，對整張圖片滑動掃描，結束後改變窗口大小重複上面步驟，缺點很明顯：重複勞動耗費資源、精度和質量不高等等。針對上面的問題，一種解決方案是借鑒啟發式搜索的方法，充分利用人類的先驗知識。J.R.R. Uijlings在《Selective Search for Object Recoginition》提出一種方法：基於數據驅動，與具體類別無關的多種策略融合的啟發式生成方法。圖片包含各種豐富信息，例如：大小、形狀、顏色、紋理、物體重疊關係等，如果只使用一種信息往往不能解決大部分問題，例如：

左邊的兩隻貓可以通過顏色區別而不是通過紋理，右面的變色龍卻只能通過紋理區別而不是顏色。

啟發式生成設計準則

所以概括來說：

能夠捕捉到各種尺度物體，大的、小的、邊界清楚的、邊界模糊的等等；多尺度的例子：

策略多樣性，採用多樣的策略集合共同作用；
計算快速，由於生成候選框只是檢測第一步，所以計算上它決不能成為瓶頸。

Selective Search

基於以上準則設計Selective Search算法：

採用層次分組算法解決尺度問題。引入圖像分割中的自下而上分組思想，由於整個過程是層次的，在將整個圖合併成一個大的區域的過程中會輸出不同尺度的多個子區域。整個過程如下： 1、利用《Efficient Graph-Based Image Segmentation》（基本思想：將圖像中每個像素表示為圖上的一個節點，用於連接不同節點的無向邊都有一個權重，這個權重表示兩個節點之間的不相似度，通過貪心算法利用最小生成樹做圖像分割）生成初始候選區域； 2、採用貪心算法合併區域，計算任意兩個領域的相似度，把達到閾值的合併，再計算新區域和其所有領域的相似度，循環迭代，直到整個圖變成了一個區域，算法如下：

多樣化策略三個方面：使用多種顏色空間、使用多種相似度計算方法、搜索起始區域不固定。 1、顏色空間有很多種：RGB、HSV、Lab等等，不是論文重點； 2、相似度衡量算法，結合了4重策略： ◆ 顏色相似度以RGB為例，使用L1-norm歸一化每個圖像通道的色彩直方圖（bins=25），每個區域被表示為25×3維向量:; 顏色相似度定義為：區域合併後對新的區域計算其色彩直方圖：新區域的大小為： ◆ 紋理相似度使用快速生成的類SIFT特徵，對每個顏色通道在8個方向上應用方差為1的高斯濾波器，對每個顏色通道的每個方向提取bins=10的直方圖，所以整個紋理向量維度為：3×8×10=240，表示為：; 紋理相似度定義為： ◆ 大小相似度該策略希望小的區域能儘早合併，讓合併操作比較平滑，防止出現某個大區域逐步吞併其他小區域的情況。相似度定義為：其中size(im)為圖像包含像素點數目。 ◆ 區域規則度相似度能夠框住合併後的兩個區域的矩形大小越小說明兩個區域的合併越規則，如：區域規則度相似度定義為：

最終相似度為所有策略加權和，文中採用等權方式：

使用Selective Search做目標識別

訓練過程包含：提取候選框、提取特徵、生成正負樣本、訓練模型，圖示如下：

早期圖像特徵提取往往是各種HOG特徵或BoW特徵，現在CNN特徵幾乎一統天下。檢測定位效果評價採用Average Best Overlap（ABO）和Mean Average Best Overlap（MABO）：

其中：為類別標註、為類別下的ground truth，為通過Selective Search生成的候選框。

)

代碼實踐

參見AlpacaDB。

selectivesearch.py

# -*- coding: utf-8 -*-  import skimage.io  import skimage.feature  import skimage.color  import skimage.transform  import skimage.util  import skimage.segmentation  import numpy  # "Selective Search for Object Recognition" by J.R.R. Uijlings et al.  #  #  - Modified version with LBP extractor for texture vectorization  def _generate_segments(im_orig, scale, sigma, min_size):      """          segment smallest regions by the algorithm of Felzenswalb and          Huttenlocher      """      # open the Image      im_mask = skimage.segmentation.felzenszwalb(          skimage.util.img_as_float(im_orig), scale=scale, sigma=sigma,          min_size=min_size)      # merge mask channel to the image as a 4th channel      im_orig = numpy.append(          im_orig, numpy.zeros(im_orig.shape[:2])[:, :, numpy.newaxis], axis=2)      im_orig[:, :, 3] = im_mask      return im_orig  def _sim_colour(r1, r2):      """          calculate the sum of histogram intersection of colour      """      return sum([min(a, b) for a, b in zip(r1["hist_c"], r2["hist_c"])])  def _sim_texture(r1, r2):      """          calculate the sum of histogram intersection of texture      """      return sum([min(a, b) for a, b in zip(r1["hist_t"], r2["hist_t"])])  def _sim_size(r1, r2, imsize):      """          calculate the size similarity over the image      """      return 1.0 - (r1["size"] + r2["size"]) / imsize  def _sim_fill(r1, r2, imsize):      """          calculate the fill similarity over the image      """      bbsize = (          (max(r1["max_x"], r2["max_x"]) - min(r1["min_x"], r2["min_x"]))          * (max(r1["max_y"], r2["max_y"]) - min(r1["min_y"], r2["min_y"]))      )      return 1.0 - (bbsize - r1["size"] - r2["size"]) / imsize  def _calc_sim(r1, r2, imsize):      return (_sim_colour(r1, r2) + _sim_texture(r1, r2)              + _sim_size(r1, r2, imsize) + _sim_fill(r1, r2, imsize))  def _calc_colour_hist(img):      """          calculate colour histogram for each region          the size of output histogram will be BINS * COLOUR_CHANNELS(3)          number of bins is 25 as same as [uijlings_ijcv2013_draft.pdf]          extract HSV      """      BINS = 25      hist = numpy.array([])      for colour_channel in (0, 1, 2):          # extracting one colour channel          c = img[:, colour_channel]          # calculate histogram for each colour and join to the result          hist = numpy.concatenate(              [hist] + [numpy.histogram(c, BINS, (0.0, 255.0))[0]])      # L1 normalize      hist = hist / len(img)      return hist  def _calc_texture_gradient(img):      """          calculate texture gradient for entire image          The original SelectiveSearch algorithm proposed Gaussian derivative          for 8 orientations, but we use LBP instead.          output will be [height(*)][width(*)]      """      ret = numpy.zeros((img.shape[0], img.shape[1], img.shape[2]))      for colour_channel in (0, 1, 2):          ret[:, :, colour_channel] = skimage.feature.local_binary_pattern(              img[:, :, colour_channel], 8, 1.0)      return ret  def _calc_texture_hist(img):      """          calculate texture histogram for each region          calculate the histogram of gradient for each colours          the size of output histogram will be              BINS * ORIENTATIONS * COLOUR_CHANNELS(3)      """      BINS = 10      hist = numpy.array([])      for colour_channel in (0, 1, 2):          # mask by the colour channel          fd = img[:, colour_channel]          # calculate histogram for each orientation and concatenate them all          # and join to the result          hist = numpy.concatenate(              [hist] + [numpy.histogram(fd, BINS, (0.0, 1.0))[0]])      # L1 Normalize      hist = hist / len(img)      return hist  def _extract_regions(img):      R = {}      # get hsv image      hsv = skimage.color.rgb2hsv(img[:, :, :3])      # pass 1: count pixel positions      for y, i in enumerate(img):          for x, (r, g, b, l) in enumerate(i):              # initialize a new region              if l not in R:                  R[l] = {                      "min_x": 0xffff, "min_y": 0xffff,                      "max_x": 0, "max_y": 0, "labels": [l]}              # bounding box              if R[l]["min_x"] > x:                  R[l]["min_x"] = x              if R[l]["min_y"] > y:                  R[l]["min_y"] = y              if R[l]["max_x"] < x:                  R[l]["max_x"] = x              if R[l]["max_y"] < y:                  R[l]["max_y"] = y      # pass 2: calculate texture gradient      tex_grad = _calc_texture_gradient(img)      # pass 3: calculate colour histogram of each region      for k, v in R.items():          # colour histogram          masked_pixels = hsv[:, :, :][img[:, :, 3] == k]          R[k]["size"] = len(masked_pixels / 4)          R[k]["hist_c"] = _calc_colour_hist(masked_pixels)          # texture histogram          R[k]["hist_t"] = _calc_texture_hist(tex_grad[:, :][img[:, :, 3] == k])      return R  def _extract_neighbours(regions):      def intersect(a, b):          if (a["min_x"] < b["min_x"] < a["max_x"]                  and a["min_y"] < b["min_y"] < a["max_y"]) or (              a["min_x"] < b["max_x"] < a["max_x"]                  and a["min_y"] < b["max_y"] < a["max_y"]) or (              a["min_x"] < b["min_x"] < a["max_x"]                  and a["min_y"] < b["max_y"] < a["max_y"]) or (              a["min_x"] < b["max_x"] < a["max_x"]                  and a["min_y"] < b["min_y"] < a["max_y"]):              return True          return False      R = regions.items()      neighbours = []      for cur, a in enumerate(R[:-1]):          for b in R[cur + 1:]:              if intersect(a[1], b[1]):                  neighbours.append((a, b))      return neighbours  def _merge_regions(r1, r2):      new_size = r1["size"] + r2["size"]      rt = {          "min_x": min(r1["min_x"], r2["min_x"]),          "min_y": min(r1["min_y"], r2["min_y"]),          "max_x": max(r1["max_x"], r2["max_x"]),          "max_y": max(r1["max_y"], r2["max_y"]),          "size": new_size,          "hist_c": (              r1["hist_c"] * r1["size"] + r2["hist_c"] * r2["size"]) / new_size,          "hist_t": (              r1["hist_t"] * r1["size"] + r2["hist_t"] * r2["size"]) / new_size,          "labels": r1["labels"] + r2["labels"]      }      return rt  def selective_search(          im_orig, scale=1.0, sigma=0.8, min_size=50):      '''Selective Search      Parameters      ----------          im_orig : ndarray              Input image          scale : int              Free parameter. Higher means larger clusters in felzenszwalb segmentation.          sigma : float              Width of Gaussian kernel for felzenszwalb segmentation.          min_size : int              Minimum component size for felzenszwalb segmentation.      Returns      -------          img : ndarray              image with region label              region label is stored in the 4th value of each pixel [r,g,b,(region)]          regions : array of dict              [                  {                      'rect': (left, top, right, bottom),                      'labels': [...]                  },                  ...              ]      '''      assert im_orig.shape[2] == 3, "3ch image is expected"      # load image and get smallest regions      # region label is stored in the 4th value of each pixel [r,g,b,(region)]      img = _generate_segments(im_orig, scale, sigma, min_size)      if img is None:          return None, {}      imsize = img.shape[0] * img.shape[1]      R = _extract_regions(img)      # extract neighbouring information      neighbours = _extract_neighbours(R)      # calculate initial similarities      S = {}      for (ai, ar), (bi, br) in neighbours:          S[(ai, bi)] = _calc_sim(ar, br, imsize)      # hierarchal search      while S != {}:          # get highest similarity          i, j = sorted(S.items(), cmp=lambda a, b: cmp(a[1], b[1]))[-1][0]          # merge corresponding regions          t = max(R.keys()) + 1.0          R[t] = _merge_regions(R[i], R[j])          # mark similarities for regions to be removed          key_to_delete = []          for k, v in S.items():              if (i in k) or (j in k):                  key_to_delete.append(k)          # remove old similarities of related regions          for k in key_to_delete:              del S[k]          # calculate similarity set with the new region          for k in filter(lambda a: a != (i, j), key_to_delete):              n = k[1] if k[0] in (i, j) else k[0]              S[(t, n)] = _calc_sim(R[t], R[n], imsize)      regions = []      for k, r in R.items():          regions.append({              'rect': (                  r['min_x'], r['min_y'],                  r['max_x'] - r['min_x'], r['max_y'] - r['min_y']),              'size': r['size'],              'labels': r['labels']          })      return img, regions

example.py

# -*- coding: utf-8 -*-  import matplotlib  matplotlib.use("Agg")  import matplotlib.pyplot as plt  import skimage.data  import skimage.io  from skimage.io import use_plugin,imread  import matplotlib.patches as mpatches  from matplotlib.pyplot import savefig  import selectivesearch  def main():      # loading astronaut image      #img = skimage.data.astronaut()      use_plugin('pil')      img = imread('car.jpg', as_grey=False)      # perform selective search      img_lbl, regions = selectivesearch.selective_search(          img, scale=500, sigma=0.9, min_size=10)      candidates = set()      for r in regions:          # excluding same rectangle (with different segments)          if r['rect'] in candidates:              continue          # excluding regions smaller than 2000 pixels          if r['size'] < 2000:              continue          # distorted rects          x, y, w, h = r['rect']          if w / h > 1.2 or h / w > 1.2:              continue          candidates.add(r['rect'])      # draw rectangles on the original image      plt.figure()      fig, ax = plt.subplots(ncols=1, nrows=1, figsize=(6, 6))      ax.imshow(img)      for x, y, w, h in candidates:          print x, y, w, h          rect = mpatches.Rectangle(              (x, y), w, h, fill=False, edgecolor='red', linewidth=1)          ax.add_patch(rect)      #plt.show()      savefig('MyFig.jpg')  if __name__ == "__main__":      main()

car.jpg原圖如下：

結果圖如下：

【連載21】Selective Search-3.11

目標檢測

Selective Search

啟發式生成設計準則

Selective Search

使用Selective Search做目標識別

代碼實踐

VirMach 便宜 VPS

QNews

【連載21】Selective Search-3.11

目標檢測

Selective Search

啟發式生成設計準則

Selective Search

使用Selective Search做目標識別

代碼實踐

分享此文：

Related Posts

靈感來襲，基於Redis的分佈式延遲隊列

簡單五子棋，沒有電腦AI

速讀原著-TCP/IP(TCP定時器示例)

Rust異步淺談

VirMach 便宜 VPS

QNews

熱門搜尋