
Compression of Convolutional Neural Networks for High-Performance Image Matching Tasks on Mobile Devices (cs.CV)

  • January 14, 2020
  • Notes

Original abstract: Deep neural networks have demonstrated state-of-the-art performance for feature-based image matching through the advent of new large and diverse datasets. However, there has been little work on evaluating the trade-offs between computational cost, model size, and matching accuracy for these models. This paper explicitly addresses these practical constraints by considering the state-of-the-art L2Net architecture. We observe significant redundancy in the L2Net architecture, which we exploit through the use of depthwise separable layers and an efficient Tucker decomposition. We demonstrate that a combination of these methods is more effective, but still sacrifices top-end accuracy. We therefore propose the Convolution-Depthwise-Pointwise (CDP) layer, which provides a means of interpolating between standard and depthwise separable convolutions. With this proposed layer, we are able to achieve up to an 8-fold reduction in the number of parameters and a 13-fold reduction in computational complexity on the L2Net architecture, while sacrificing less than 1% of the overall accuracy across the HPatches benchmarks. To further demonstrate the generalisation of this approach, we apply it to the SuperPoint model. We show that CDP layers improve accuracy while using significantly fewer parameters and floating-point operations for inference.
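As a rough illustration of where the parameter savings come from, the sketch below (not the paper's code) compares the parameter count of a standard k×k convolution against a depthwise separable one. The 128→128-channel, 3×3 layer shown is a hypothetical L2Net-like setting chosen for illustration; note the ratio lands near the up-to-8× reduction the abstract reports. The CDP layer then interpolates between these two extremes, trading parameters back for accuracy; its exact formulation is given in the paper.

```python
def conv_params(c_in, c_out, k):
    # Standard k x k convolution: every output channel filters every input channel.
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    # Depthwise stage (one k x k filter per input channel)
    # followed by a pointwise (1x1) stage mixing the channels.
    return c_in * k * k + c_in * c_out

# Hypothetical mid-network layer: 128 -> 128 channels, 3x3 kernels.
std = conv_params(128, 128, 3)                 # 147456
dsc = depthwise_separable_params(128, 128, 3)  # 1152 + 16384 = 17536
print(f"standard: {std}, depthwise separable: {dsc}, ratio: {std / dsc:.1f}x")
```

The same accounting applied to multiply-accumulate operations (each parameter is reused at every spatial position) explains why the FLOP reduction can exceed the parameter reduction once other layers are factored in.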

Original title: Compression of convolutional neural networks for high performance image matching tasks on mobile devices

Original authors: Roy Miles, Krystian Mikolajczyk

Original link: https://arxiv.org/abs/2001.03102