RecSys19 | FiBiNET: Combining Feature Importance and Bilinear Feature Interaction for CTR Prediction

  • November 21, 2019
  • Notes

Article author: Shen Weichen

▌Introduction

I've been busy with work lately and haven't read many new papers, so today I'm sharing some notes I wrote a while ago.

This post introduces a piece of work from the Sina Weibo machine learning team, published at RecSys19: FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction.

The paper observes that many current feature-combination approaches to CTR prediction compute cross features with the inner product or Hadamard product of feature vectors, which ignores how important each feature is in its own right. It proposes a Squeeze-and-Excitation network (SENET) structure to dynamically learn feature importance, together with a bilinear function to better model cross features.

Below is a brief introduction to the model, along with the core code and a runnable demo; please refer to the paper for details.

▌Model Structure

1. Overall Structure

As the figure shows, compared with the deep-learning-based CTR models we are familiar with, the main additions are two components: the SENET Layer and the Bilinear-Interaction Layer. Both are briefly described below.

2. SENET Layer

Concretely, it consists of three steps:

① Squeeze

② Excitation

③ Re-Weight
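The three steps above can be sketched in plain NumPy. This is an illustrative sketch, not the paper's implementation: the function name and the weight shapes `W1`/`W2` are assumptions, with the excitation bottleneck set by a reduction ratio as in the SENET paper.

```python
import numpy as np

def senet_reweight(E, W1, W2):
    """Sketch of the three SENET steps on field embeddings.

    E:  (f, k) matrix of f field embeddings of dimension k.
    W1: (f, r) and W2: (r, f) are the excitation weights,
    where r = f / reduction_ratio. Names are illustrative.
    """
    relu = lambda x: np.maximum(x, 0)
    Z = E.mean(axis=1)            # Squeeze: one summary statistic per field, shape (f,)
    A = relu(relu(Z @ W1) @ W2)   # Excitation: two FC layers -> one weight per field, shape (f,)
    V = E * A[:, None]            # Re-Weight: rescale each field's embedding by its weight
    return V

# Toy example: 4 fields, embedding size 3, reduction ratio 2
rng = np.random.default_rng(0)
E = rng.normal(size=(4, 3))
V = senet_reweight(E, rng.normal(size=(4, 2)), rng.normal(size=(2, 4)))
print(V.shape)  # (4, 3): same shape as the input, fields rescaled by importance
```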

3. Bilinear-Interaction

The cross features can be computed in one of three ways:

① Field-All Type

② Field-Each Type

③ Field-Interaction Type
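A minimal sketch of the three variants (names and shapes are illustrative): each one computes, for every field pair (i, j), the element-wise product of (v_i · W) with v_j; the variants differ only in how the matrix W is shared.

```python
import itertools
import numpy as np

def bilinear_interactions(E, bilinear_type, rng):
    """Sketch of the three bilinear types on f field embeddings E of shape (f, k).

    For each field pair (i, j) the interaction is (E[i] @ W) * E[j],
    a Hadamard product; only the source of W differs between types.
    """
    f, k = E.shape
    pairs = list(itertools.combinations(range(f), 2))
    if bilinear_type == "all":            # Field-All: one k x k matrix shared by every pair
        W = rng.normal(size=(k, k))
        return [(E[i] @ W) * E[j] for i, j in pairs]
    if bilinear_type == "each":           # Field-Each: one matrix per field i
        Ws = [rng.normal(size=(k, k)) for _ in range(f)]
        return [(E[i] @ Ws[i]) * E[j] for i, j in pairs]
    if bilinear_type == "interaction":    # Field-Interaction: one matrix per field pair
        Ws = [rng.normal(size=(k, k)) for _ in pairs]
        return [(E[i] @ W) * E[j] for (i, j), W in zip(pairs, Ws)]

rng = np.random.default_rng(0)
E = rng.normal(size=(4, 3))               # 4 fields, embedding size 3
p = bilinear_interactions(E, "interaction", rng)
print(len(p))  # 6 = C(4, 2) interaction vectors, each of dimension 3
```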

4. Combination Layer
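As a rough sketch, assuming (as described in the paper) that this layer simply flattens and concatenates the interaction vectors produced from the two paths, i.e. from the original embeddings and from the SENET-reweighted embeddings, before passing them to the DNN:

```python
import numpy as np

# Hypothetical placeholder vectors: 6 pairwise interaction vectors of
# dimension 3 from each path (values are dummies for shape illustration).
p = [np.ones(3) for _ in range(6)]   # bilinear outputs, original-embedding path
q = [np.zeros(3) for _ in range(6)]  # bilinear outputs, SENET-reweighted path

# Combination: flatten and concatenate both paths into one vector for the DNN.
c = np.concatenate(p + q)
print(c.shape)  # (36,)
```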

▌Experimental Results

The paper runs extensive comparison experiments on the public Criteo and Avazu datasets. Only the comparison of FiBiNET against other models is shown here; please see the paper for the remaining experimental details.

1. Shallow FiBiNET

2. Deep FiBiNET

3. Core Code

Here I only paste the code for the main computation; for the preprocessing and parameter-construction code, please refer to:

https://github.com/shenweichen/DeepCTR/blob/master/deepctr/layers/interaction.py

4. SENET Layer

```python
Z = tf.reduce_mean(inputs, axis=-1)                   # Squeeze: mean pooling over each field embedding
A_1 = tf.nn.relu(self.tensordot([Z, self.W_1]))       # Excitation: first FC layer
A_2 = tf.nn.relu(self.tensordot([A_1, self.W_2]))     # Excitation: second FC layer -> field weights
V = tf.multiply(inputs, tf.expand_dims(A_2, axis=2))  # Re-Weight: rescale the field embeddings
```

5. Bilinear Interaction Layer

```python
if self.type == "all":
    # Field-All: one matrix W shared by every field pair
    p = [tf.multiply(tf.tensordot(v_i, self.W, axes=(-1, 0)), v_j)
         for v_i, v_j in itertools.combinations(inputs, 2)]
elif self.type == "each":
    # Field-Each: one matrix W_i per field
    p = [tf.multiply(tf.tensordot(inputs[i], self.W_list[i], axes=(-1, 0)), inputs[j])
         for i, j in itertools.combinations(range(len(inputs)), 2)]
elif self.type == "interaction":
    # Field-Interaction: one matrix per field pair
    p = [tf.multiply(tf.tensordot(v[0], w, axes=(-1, 0)), v[1])
         for v, w in zip(itertools.combinations(inputs, 2), self.W_list)]
```

6. Running the Demo

First make sure your Python version is 2.7, 3.4, 3.5 or 3.6, then run `pip install deepctr[cpu]` or `pip install deepctr[gpu]`, and download the demo data:

https://github.com/shenweichen/DeepCTR/blob/master/examples/criteo_sample.txt

Then simply run the code below!

```python
import pandas as pd
from sklearn.metrics import log_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

from deepctr.models import FiBiNET
from deepctr.inputs import SparseFeat, DenseFeat, get_fixlen_feature_names

if __name__ == "__main__":
    data = pd.read_csv('./criteo_sample.txt')

    sparse_features = ['C' + str(i) for i in range(1, 27)]
    dense_features = ['I' + str(i) for i in range(1, 14)]

    data[sparse_features] = data[sparse_features].fillna('-1', )
    data[dense_features] = data[dense_features].fillna(0, )
    target = ['label']

    # 1. Label encoding for sparse features, and simple transformation for dense features
    for feat in sparse_features:
        lbe = LabelEncoder()
        data[feat] = lbe.fit_transform(data[feat])
    mms = MinMaxScaler(feature_range=(0, 1))
    data[dense_features] = mms.fit_transform(data[dense_features])

    # 2. Count #unique values for each sparse field, and record dense feature field names
    fixlen_feature_columns = [SparseFeat(feat, data[feat].nunique())
                              for feat in sparse_features] + \
                             [DenseFeat(feat, 1, )
                              for feat in dense_features]

    dnn_feature_columns = fixlen_feature_columns
    linear_feature_columns = fixlen_feature_columns

    fixlen_feature_names = get_fixlen_feature_names(linear_feature_columns + dnn_feature_columns)

    # 3. Generate input data for the model
    train, test = train_test_split(data, test_size=0.2)
    train_model_input = [train[name] for name in fixlen_feature_names]
    test_model_input = [test[name] for name in fixlen_feature_names]

    # 4. Define the model, then train, predict and evaluate
    model = FiBiNET(linear_feature_columns, dnn_feature_columns, task='binary')
    model.compile("adam", "binary_crossentropy",
                  metrics=['binary_crossentropy'], )

    history = model.fit(train_model_input, train[target].values,
                        batch_size=256, epochs=10, verbose=2, validation_split=0.2, )
    pred_ans = model.predict(test_model_input, batch_size=256)
    print("test LogLoss", round(log_loss(test[target].values, pred_ans), 4))
    print("test AUC", round(roc_auc_score(test[target].values, pred_ans), 4))
```

▌References

1. FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction

https://arxiv.org/pdf/1905.09433.pdf

2. Squeeze-and-Excitation Networks

http://openaccess.thecvf.com/content_cvpr_2018/papers/Hu_Squeeze-and-Excitation_Networks_CVPR_2018_paper.pdf