Compression of Convolutional Neural Networks for High Performance Image Matching Tasks on Mobile Devices (cs.CV)
- January 14, 2020
- Notes
Deep neural networks have demonstrated state-of-the-art performance in feature-based image matching, enabled by the advent of new large and diverse datasets. However, there has been little work on evaluating the trade-offs between computational cost, model size, and matching accuracy for these models. This paper explicitly addresses these practical constraints by considering the state-of-the-art L2Net architecture. We observe significant redundancy in the L2Net architecture, which we exploit through depthwise separable layers and an efficient Tucker decomposition. We demonstrate that a combination of these methods is more effective, but still sacrifices top-end accuracy. We therefore propose the Convolution-Depthwise-Pointwise (CDP) layer, which provides a means of interpolating between the standard convolution and the depthwise separable convolution. With this proposed layer, we achieve up to an 8x reduction in parameters and a 13x reduction in computational complexity on the L2Net architecture, while sacrificing less than 1% of overall accuracy across the HPatches benchmarks. To further demonstrate that this approach generalises, we apply it to the SuperPoint model and show that CDP layers improve accuracy while using significantly fewer parameters and floating-point operations for inference.
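As a concrete illustration of where the savings come from: a standard 3x3 convolution from 32 to 64 channels needs 32·64·9 ≈ 18.4K weights, whereas its depthwise separable factorisation (a per-channel 3x3 depthwise convolution followed by a 1x1 pointwise convolution) needs only 32·9 + 32·64 ≈ 2.3K, roughly 8 times fewer. The sketch below is a minimal PyTorch interpretation of a layer that interpolates between these two extremes by splitting the input channels between a standard branch and a depthwise separable branch; the class name `CDPLayer` and the mixing ratio `alpha` are my own labels, and the paper's exact CDP formulation may differ in detail.

```python
import torch
import torch.nn as nn

class CDPLayer(nn.Module):
    """Hypothetical sketch of a Convolution-Depthwise-Pointwise (CDP) layer.

    alpha = 1.0 recovers a standard convolution; alpha = 0.0 recovers a
    depthwise separable convolution; values in between interpolate by
    splitting the input channels across the two branches.
    """

    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1, alpha=0.25):
        super().__init__()
        self.split = int(round(alpha * in_ch))  # channels routed to the standard branch
        rest = in_ch - self.split
        # Standard k x k convolution over the first `split` input channels.
        self.standard = (
            nn.Conv2d(self.split, out_ch, kernel_size, padding=padding)
            if self.split > 0 else None
        )
        # Depthwise k x k convolution (one filter per channel) over the rest,
        # followed by a 1 x 1 pointwise convolution to mix channels.
        self.depthwise = (
            nn.Conv2d(rest, rest, kernel_size, padding=padding, groups=rest)
            if rest > 0 else None
        )
        self.pointwise = nn.Conv2d(rest, out_ch, kernel_size=1) if rest > 0 else None

    def forward(self, x):
        outs = []
        if self.standard is not None:
            outs.append(self.standard(x[:, : self.split]))
        if self.depthwise is not None:
            outs.append(self.pointwise(self.depthwise(x[:, self.split :])))
        # Both branches produce (N, out_ch, H, W); summing fuses their outputs.
        return sum(outs)


# Example: 32 -> 64 channels, a quarter of the inputs get the full convolution.
layer = CDPLayer(32, 64, alpha=0.25)
y = layer(torch.randn(1, 32, 28, 28))
print(y.shape)  # torch.Size([1, 64, 28, 28])
```

At `alpha = 1.0` the layer reduces to a standard convolution and at `alpha = 0.0` to a plain depthwise separable layer, which matches the interpolation behaviour described in the abstract.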
Original title: Compression of convolutional neural networks for high performance image matching tasks on mobile devices
Original authors: Roy Miles, Krystian Mikolajczyk
Original link: https://arxiv.org/abs/2001.03102