提出的方法•The representation that preserves absolute location information of a point cloud in each bin (如图2所示)• Efficient bin encoding function• Two-step search algorithm
3
算法流程
3.1 Scan Context的创建
(1) 与Shape Context的渊源
Scan Context这个算法其实一开始是由Shape Context [2] 所启发的,而Shape Context是把点云的 local Keypoint 附近的点云形状 encode 进一个图像中。Scan Context的不同在于,它不仅仅是count the number of points,而是采用了 maximum height of points in each bin(简单来说,就是取每一个bin中的所有point的z轴最高点的value作为这个bin的value)。
(2) 为什么选择Maximum height?
a. The reason for using the height is to efficiently summarize the vertical shape of surrounding structures.b. In addition, the maximum height says which part of the surrounding structures is visible from the sensor.c. This egocentric visibility has been a well-known concept in the urban design literature for analyzing an identity of a place
(3) Partition a 3D scan
首先,对每一次Scan进行分割:• Nr: number of rings (黄色圈圈)• Ns: number of sectors (浅蓝色/绿色? 的格子)• Lmax: 雷达每一个射线的最远距离• Radial Gap between rings = • Sector弧度 = • 文章中: Nr=20, Ns=60
(4) 给每个Bin进行赋值: Bin Encoding
公式解读:• 就是指the set of points belonging to the bin where the ith ring and jth sector overlapped。• z(⋅) 是指 中一个point P 的Z坐标。• 直接使用最大z坐标值 z(p),作为这个bin的value。
(5) Scan Context Matrix
A scan context I is finally represented as a Nr × Ns matrix as:
3.2 Similarity Score的计算
假设我们得到了一对Scan Context的矩阵,我们要计算他们俩()之间的相似度,文章中采用了columnwise (按列) 的距离计算。 :Query Point Cloud (简言之,我们当前用来query的点云) :Candidate Point Cloud (咱们的“数据库”中储存的用来匹配的candidate点云) :Column j of Query Point Cloud (列向量) :Column j of Candidate Point Cloud (列向量)
在3.2节中我们提到的公式(6)进行最短距离计算时,要先找到最佳旋转 n∗ ,计算量很大,所以在本文中提出了一种” Two-phase Search “,并提出了 Ring key 这个Descriptor(描述子)来进行匹配搜索:Ring key is a rotation-invariant descriptor, which is extracted from a scan context. Each row of a scan context, r, is encoded into a single real value via ring encoding function . The first element of the vector k is from the nearest circle from a sensor, and following elements are from the next rings in order as illustrated in Fig. 4
Figure 4. Ring key示意图 [1]由内而外,一圈一圈的ring key通过对Scan Context Matrix的每一行row r 进行ψ ( ⋅ )的encoding就变成了一个N r 维度的Vector k:
The ring encoding function ψ \psi ψ is a occupancy ratio using L0 norm:
小红薯: 大师兄,这里的r0是什么意思呢?大师兄: 这是L0 norm(范数)的意思,其实L0 norm并不是一个真正的norm,它就是the total number of non-zero elements in a vector 。 比如,(2,0,1,0,9)这个vector的 L0 norm就是3,因为有3个非零数。大师兄: 这样一来,咱们统计每一圈的row中有多少个非零数值,那这就和rotation没啥关系啦(也就是原文中所说的rotation invariance)! 这样就能够达到快速的search。
(2) KD-Tree
• 在得到ring key向量 k 之后,文章用了 k 构建KD Tree。
• 用ring key of the query到这个KD Tree中搜索K个最相似的scan indexex(K是个heuristic number)
Figure 5. SC在ICP初始化中的应用 [1]4.2 ScanContext在全局重定位中的应用在深蓝学院第四章作业中,我们应用了ScanContext在全局重定位中的效果。如果初始化不在原点,且没有全局重定位,效果如下:如果加入ScanContext进行全局重定位,效果如下图:4.3 Future Works 在文章最后,作者提到可以使用更好的bin encoding function (eg., a bin’s segmantic information)来提升性能,目前咱们只是用了一个很简单的max Z(p)来找Z轴高度上的最高点。对于有梦想的读者,也期待你的贡献!参考文献[1] G. Kim and A. Kim, “Scan Context: Egocentric Spatial Descriptor for Place Recognition Within 3D Point Cloud Map,” 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, 2018, pp. 4802-4809, doi: 10.1109/IROS.2018.8593953.[2] S. Belongie, J. Malik, and J. Puzicha, “Shape matching and object recognition using shape contexts,” IEEE Trans. Pattern Analysis and Machine Intell., vol. 24, no. 4, pp. 509–522, 2002.