
RecSys19 | FiBiNET: Combining Feature Importance and Bilinear Feature Interaction for CTR Prediction

  • November 21, 2019
  • Notes

Author: Weichen Shen (沈伟臣)

▌Introduction

I've been busy with work lately and haven't read many new papers, so today I'm sharing some notes I wrote a while ago~

This post introduces a paper by the Sina Weibo machine learning team published at RecSys19: FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction.

The paper points out that much of the current work on CTR prediction via feature combination computes cross features using the inner product or the Hadamard product of feature vectors, which ignores how important the features themselves are. It proposes to dynamically learn feature importance with a Squeeze-and-Excitation network (SENET) structure, and to model cross features better with a bilinear function.

Below is a brief introduction to the model, along with the core code and a runnable demo; please refer to the paper for the details.

▌Model Structure

1. Overall Structure

As the figure shows, compared with the deep-learning-based CTR prediction models we are already familiar with, the main additions are two components: the SENET Layer and the Bilinear-Interaction Layer. Each is briefly described below.
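The end-to-end flow can be sketched with toy stand-ins (a minimal illustration with assumed shapes; the `senet` gate and the random `logit` weights here are placeholders for learned parameters, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

f, k = 4, 8                                  # number of fields, embedding size
E = rng.standard_normal((f, k))              # field embeddings

# Toy stand-ins: a fixed gate instead of SENET's learned FC layers, and the
# plain Hadamard product instead of the bilinear function.
senet = lambda E: E * sigmoid(E.mean(axis=1, keepdims=True))
bilinear = lambda E: [E[i] * E[j] for i in range(f) for j in range(i + 1, f)]

p = bilinear(E)                              # interactions of original embeddings
q = bilinear(senet(E))                       # interactions of SENET embeddings
c = np.concatenate(p + q)                    # combination layer: one long vector
logit = float(c @ rng.standard_normal(c.shape[0]))  # stand-in for DNN + linear part
print(sigmoid(logit))                        # CTR estimate in (0, 1)
```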

2. SENET Layer

Specifically, the SENET Layer works in three steps:

① Squeeze

② Excitation

③ Re-Weight
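The three steps above can be sketched in NumPy as follows (a minimal illustration; the field count `f`, embedding size `k`, reduction ratio `r`, and the random matrices `W1`/`W2` are assumptions standing in for learned parameters):

```python
import numpy as np

def senet_reweight(E, r=2, seed=0):
    """E: (f, k) field embeddings. Returns SENET-reweighted embeddings."""
    f, k = E.shape
    rng = np.random.default_rng(seed)
    # ① Squeeze: pool each field embedding into a single summary statistic
    z = E.mean(axis=1)                       # (f,)
    # ② Excitation: two FC layers learn a weight per field
    W1 = rng.standard_normal((f, f // r))    # random weights for illustration
    W2 = rng.standard_normal((f // r, f))
    a = np.maximum(z @ W1, 0)                # ReLU
    a = np.maximum(a @ W2, 0)                # (f,) field importance weights
    # ③ Re-Weight: rescale every field embedding by its learned weight
    return E * a[:, None]                    # (f, k)

V = senet_reweight(np.ones((4, 8)))
print(V.shape)  # (4, 8)
```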

3. Bilinear-Interaction

The cross features can be computed in any of the following three ways:

① Field-All Type

② Field-Each Type

③ Field-Interaction Type
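The three variants differ only in how the bilinear weight matrix W is shared across field pairs; each pair (i, j) yields an interaction vector (v_i W) ⊙ v_j. A short NumPy sketch makes this concrete (illustrative only; `bilinear_interaction` and its random weights are assumptions, not the library's implementation):

```python
import itertools
import numpy as np

def bilinear_interaction(E, mode="all", seed=0):
    """E: (f, k) field embeddings. Returns one interaction vector per field pair."""
    f, k = E.shape
    rng = np.random.default_rng(seed)
    pairs = list(itertools.combinations(range(f), 2))
    if mode == "all":            # Field-All: a single W shared by every pair
        W = rng.standard_normal((k, k))
        return [(E[i] @ W) * E[j] for i, j in pairs]
    if mode == "each":           # Field-Each: one W_i per field i
        Ws = [rng.standard_normal((k, k)) for _ in range(f)]
        return [(E[i] @ Ws[i]) * E[j] for i, j in pairs]
    if mode == "interaction":    # Field-Interaction: one W_ij per field pair
        Ws = [rng.standard_normal((k, k)) for _ in pairs]
        return [(E[i] @ W) * E[j] for (i, j), W in zip(pairs, Ws)]
    raise ValueError(mode)
```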

4. Combination Layer

The combination layer concatenates the interaction vectors computed from the original embeddings and from the SENET-reweighted embeddings into a single vector, which is then fed to the downstream prediction layers.

▌Experimental Results

The paper runs extensive comparison experiments on the two public datasets Criteo and Avazu. Only the comparison of FiBiNET against other models is shown here; please refer to the paper for the other experimental details~

1. Shallow FiBiNET

2. Deep FiBiNET

3. Core Code

Only the code for the core computation is shown here; for the preprocessing and parameter-construction code, please refer to the repository:

https://github.com/shenweichen/DeepCTR/blob/master/deepctr/layers/interaction.py

4. SENET Layer

# ① Squeeze: pool each field embedding into a summary statistic
Z = tf.reduce_mean(inputs, axis=-1)
# ② Excitation: two fully connected layers learn the field weights
A_1 = tf.nn.relu(self.tensordot([Z, self.W_1]))
A_2 = tf.nn.relu(self.tensordot([A_1, self.W_2]))
# ③ Re-Weight: rescale the original embeddings by the learned weights
V = tf.multiply(inputs, tf.expand_dims(A_2, axis=2))

5. Bilinear Interaction Layer

if self.type == "all":
    # Field-All: a single W shared by every field pair
    p = [tf.multiply(tf.tensordot(v_i, self.W, axes=(-1, 0)), v_j)
         for v_i, v_j in itertools.combinations(inputs, 2)]
elif self.type == "each":
    # Field-Each: one W_i per field i
    p = [tf.multiply(tf.tensordot(inputs[i], self.W_list[i], axes=(-1, 0)), inputs[j])
         for i, j in itertools.combinations(range(len(inputs)), 2)]
elif self.type == "interaction":
    # Field-Interaction: one W_ij per field pair
    p = [tf.multiply(tf.tensordot(v[0], w, axes=(-1, 0)), v[1])
         for v, w in zip(itertools.combinations(inputs, 2), self.W_list)]

6. Running the Demo

First make sure your Python version is 2.7, 3.4, 3.5, or 3.6, then run pip install deepctr[cpu] or pip install deepctr[gpu], and download the demo data:

https://github.com/shenweichen/DeepCTR/blob/master/examples/criteo_sample.txt

Then just run the code below!

import pandas as pd
from sklearn.metrics import log_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

from deepctr.models import FiBiNET
from deepctr.inputs import SparseFeat, DenseFeat, get_fixlen_feature_names

if __name__ == "__main__":
    data = pd.read_csv('./criteo_sample.txt')

    sparse_features = ['C' + str(i) for i in range(1, 27)]
    dense_features = ['I' + str(i) for i in range(1, 14)]

    data[sparse_features] = data[sparse_features].fillna('-1', )
    data[dense_features] = data[dense_features].fillna(0, )
    target = ['label']

    # 1.Label Encoding for sparse features,and do simple Transformation for dense features
    for feat in sparse_features:
        lbe = LabelEncoder()
        data[feat] = lbe.fit_transform(data[feat])
    mms = MinMaxScaler(feature_range=(0, 1))
    data[dense_features] = mms.fit_transform(data[dense_features])

    # 2.count #unique features for each sparse field,and record dense feature field name
    fixlen_feature_columns = [SparseFeat(feat, data[feat].nunique())
                              for feat in sparse_features] + [DenseFeat(feat, 1, )
                              for feat in dense_features]

    dnn_feature_columns = fixlen_feature_columns
    linear_feature_columns = fixlen_feature_columns

    fixlen_feature_names = get_fixlen_feature_names(linear_feature_columns + dnn_feature_columns)

    # 3.generate input data for model
    train, test = train_test_split(data, test_size=0.2)
    train_model_input = [train[name] for name in fixlen_feature_names]
    test_model_input = [test[name] for name in fixlen_feature_names]

    # 4.Define Model,train,predict and evaluate
    model = FiBiNET(linear_feature_columns, dnn_feature_columns, task='binary')
    model.compile("adam", "binary_crossentropy",
                  metrics=['binary_crossentropy'], )

    history = model.fit(train_model_input, train[target].values,
                        batch_size=256, epochs=10, verbose=2, validation_split=0.2, )
    pred_ans = model.predict(test_model_input, batch_size=256)
    print("test LogLoss", round(log_loss(test[target].values, pred_ans), 4))
    print("test AUC", round(roc_auc_score(test[target].values, pred_ans), 4))

▌References

1. FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction

https://arxiv.org/pdf/1905.09433.pdf

2. Squeeze-and-Excitation Networks

http://openaccess.thecvf.com/content_cvpr_2018/papers/Hu_Squeeze-and-Excitation_Networks_CVPR_2018_paper.pdf