一次Impala upsert kudu執行緩慢問題排查總結

  • 2020 年 3 月 10 日
  • 筆記

問題背景

BI同學會用ImpalaKudu表上跑一些ETL任務,最近,BI同學反饋一個Kudu表的ETL任務突然變慢,執行時間從原來的不到1分鐘到現在的7分鐘。

解決過程

下文中提到的軟件環境為:

  • Impala 3.2.0-cdh6.2.0 RELEASE
  • Kudu 1.9.0-cdh6.2.0

我們主要從SQL語句執行的操作了解該SQL的複雜度,並閱讀該SQLprofile信息一步步進行排查,找出產生該問題的原因。以下是排查步驟:

1、該ETL任務的SQL語句執行的是一個UPSERT...SELECT操作,大體結構如下:

UPSERT INTO TABLE rtl_ods_test.a  SELECT ...  FROM rtl_ods_test.test1  LEFT JOIN ...

2、接着我們從執行該SQLimpalad節點獲取SQLprofile信息,profile信息的Summary部分如下:

Operator                 #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail  --------------------------------------------------------------------------------------------------------------------------------------------  F00:KUDU WRITER               1      6m57s      6m57s                        20.06 MB       20.00 MB  47:HASH JOIN                  1   55.813ms   55.813ms    1.02M          -1   45.76 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--72:EXCHANGE                1  179.577us  179.577us    7.71K          -1    1.00 MB      152.95 KB  BROADCAST  |  F24:EXCHANGE SENDER        1    4.155ms    4.155ms                         2.55 KB              0  |  24:SCAN KUDU               1    7.886ms    7.886ms    7.71K          -1  948.00 KB        1.88 MB  rtl_ods_test.test24 t24  46:HASH JOIN                  1   67.005ms   67.005ms    1.02M          -1   44.93 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--71:EXCHANGE                1  842.112us  842.112us   39.38K          -1   10.38 MB      431.86 KB  BROADCAST  |  F23:EXCHANGE SENDER        1   29.824ms   29.824ms                        920.00 B              0  |  70:AGGREGATE               1   62.423ms   62.423ms   39.38K          -1   45.96 MB      128.00 MB  FINALIZE  |  69:EXCHANGE                1  873.411us  873.411us   39.38K          -1    1.02 MB      431.96 KB  HASH(hi.order_id)  |  F22:EXCHANGE SENDER        1   30.023ms   30.023ms                        920.00 B              0  |  23:AGGREGATE               1  632.151ms  632.151ms   39.38K          -1   46.57 MB      128.00 MB  STREAMING  |  22:SCAN KUDU               1    6.023ms    6.023ms  319.27K          -1   16.95 MB        1.50 MB  rtl_ods_test.test23 hi  45:HASH JOIN                  1   70.527ms   70.527ms    1.02M          -1   44.93 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--68:EXCHANGE                1   77.392us   77.392us    3.51K          -1  480.00 KB      101.97 KB  BROADCAST  |  F21:EXCHANGE SENDER        1    2.941ms    2.941ms                         3.88 KB              0  |  21:SCAN KUDU               1    4.198ms    4.198ms    3.51K          -1  321.00 KB      768.00 KB  rtl_ods_test.test22 t22  44:HASH JOIN                  1   63.624ms   63.624ms    1.02M          -1   46.91 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--67:EXCHANGE                1  783.586us  783.586us   44.05K          -1    3.04 MB       86.98 KB  BROADCAST  |  F20:EXCHANGE SENDER        1   11.177ms   11.177ms                         4.73 KB              0  |  20:SCAN KUDU               1    9.691ms    9.691ms   44.05K          -1    1.22 MB        1.12 MB  rtl_ods_test.test21 t21  43:HASH JOIN                  1   40.688ms   40.688ms  654.69K          -1   44.90 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--66:EXCHANGE                1  886.154us  886.154us   37.54K          -1    1.73 MB       80.98 KB  BROADCAST  |  F19:EXCHANGE SENDER        1    5.768ms    5.768ms                         5.12 KB              0  |  19:SCAN KUDU               1    6.936ms    6.936ms   37.54K          -1    2.11 MB      768.00 KB  rtl_ods_test.test20 t20  42:HASH JOIN                  1   35.599ms   35.599ms  527.37K          -1   44.89 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--65:EXCHANGE                1  648.607us  648.607us   37.54K          -1    1.73 MB       80.98 KB  BROADCAST  |  F18:EXCHANGE SENDER        1    6.722ms    6.722ms                         5.12 KB              0  |  18:SCAN KUDU               1    6.763ms    6.763ms   37.54K          -1    2.27 MB      768.00 KB  rtl_ods_test.test19 t19  41:HASH JOIN                  1   74.309ms   74.309ms  527.37K          -1   44.91 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--64:EXCHANGE                1   16.129us   16.129us        5          -1   16.00 KB      173.95 KB  BROADCAST  |  F17:EXCHANGE SENDER        1  125.453us  125.453us                         2.24 KB              0  |  17:SCAN KUDU               1    4.313ms    4.313ms        5          -1  179.00 KB        2.25 MB  rtl_ods_test.test18 t18  40:HASH JOIN                  1   18.218ms   18.218ms  266.00K          -1   44.05 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--63:EXCHANGE                1   12.178us   12.178us       51          -1   16.00 KB       92.97 KB  BROADCAST  |  F16:EXCHANGE SENDER        1   85.665us   85.665us                         4.41 KB              0  |  16:SCAN KUDU               1    7.677ms    7.677ms       51          -1   95.00 KB        1.12 MB  rtl_ods_test.test17 t17  39:HASH JOIN                  1   10.809ms   10.809ms  132.23K          -1   46.05 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--62:EXCHANGE                1  939.796us  939.796us   44.05K          -1    3.04 MB       89.97 KB  BROADCAST  |  F15:EXCHANGE SENDER        1   10.317ms   10.317ms                         4.57 KB              0  |  15:SCAN KUDU               1    7.449ms    7.449ms   44.05K          -1    2.34 MB        1.50 MB  rtl_ods_test.test16 t16  38:HASH JOIN                  1    8.885ms    8.885ms   73.13K          -1   44.04 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--61:EXCHANGE                1   73.579us   73.579us    2.07K          -1  160.00 KB       83.98 KB  BROADCAST  |  F14:EXCHANGE SENDER        1  783.596us  783.596us                         4.92 KB              0  |  14:SCAN KUDU               1    2.621ms    2.621ms    2.07K          -1  172.00 KB        1.12 MB  rtl_ods_test.test15 t15  37:HASH JOIN                  1    9.465ms    9.465ms   73.13K          -1   43.20 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--60:EXCHANGE                1   65.940us   65.940us    2.07K          -1  160.00 KB       83.98 KB  BROADCAST  |  F13:EXCHANGE SENDER        1  851.380us  851.380us                         4.92 KB              0  |  13:SCAN KUDU               1    4.485ms    4.485ms    2.07K          -1  172.00 KB        1.12 MB  rtl_ods_test.test14 t14  36:HASH JOIN                  1   11.477ms   11.477ms   73.13K          -1   42.37 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--59:EXCHANGE                1  999.398us  999.398us   39.38K          -1    5.12 MB      170.95 KB  BROADCAST  |  F12:EXCHANGE SENDER        1   12.604ms   12.604ms                         2.24 KB              0  |  12:SCAN KUDU               1   13.496ms   13.496ms   39.38K          -1    1.40 MB        1.50 MB  rtl_ods_test.test13 t13  35:HASH JOIN                  1   10.903ms   10.903ms   73.13K          -1   41.54 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--58:EXCHANGE                1  974.475us  974.475us   39.38K          -1    5.12 MB      140.96 KB  BROADCAST  |  F11:EXCHANGE SENDER        1   11.882ms   11.882ms                         2.78 KB              0  |  11:SCAN KUDU               1   12.115ms   12.115ms   39.38K          -1    1.57 MB        3.00 MB  rtl_ods_test.test12 t12  34:HASH JOIN                  1   11.990ms   11.990ms   73.13K          -1   41.53 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--57:EXCHANGE                1  634.765us  634.765us   31.51K          -1    4.12 MB      176.95 KB  BROADCAST  |  F10:EXCHANGE SENDER        1    6.817ms    6.817ms                         2.24 KB              0  |  10:SCAN KUDU               1   18.167ms   18.167ms   31.51K          -1    1.22 MB        2.25 MB  rtl_ods_test.test11 t11  33:HASH JOIN                  1   14.638ms   14.638ms   73.13K          -1  326.70 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--56:EXCHANGE                1   85.997ms   85.997ms    4.28M          -1   14.01 MB       83.98 KB  BROADCAST  |  F09:EXCHANGE SENDER        1  909.434ms  909.434ms                         4.92 KB              0  |  09:SCAN KUDU               1   26.361ms   26.361ms    4.28M          -1    5.12 MB        1.12 MB  rtl_ods_test.test10 t10  32:HASH JOIN                  1  782.440ms  782.440ms   73.13K          -1  428.69 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--55:EXCHANGE                1  163.681ms  163.681ms    8.23M          -1   16.06 MB       59.98 KB  BROADCAST  |  F08:EXCHANGE SENDER        1  723.689ms  723.689ms                         7.52 KB              0  |  08:SCAN KUDU               1   23.223ms   23.223ms    8.23M          -1    1.51 MB      768.00 KB  rtl_ods_test.test9 t9  31:HASH JOIN                  1   11.533ms   11.533ms   60.13K          -1   39.86 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--54:EXCHANGE                1  663.209us  663.209us   37.54K          -1    1.73 MB       80.98 KB  BROADCAST  |  F07:EXCHANGE SENDER        1    7.003ms    7.003ms                         5.12 KB              0  |  07:SCAN KUDU               1    6.692ms    6.692ms   37.54K          -1    2.35 MB      768.00 KB  rtl_ods_test.test8 t8  30:HASH JOIN                  1   13.723ms   13.723ms   39.38K          -1   39.02 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--53:EXCHANGE                1  109.517us  109.517us    3.51K          -1  256.00 KB       89.97 KB  BROADCAST  |  F06:EXCHANGE SENDER        1  994.329us  994.329us                         4.57 KB              0  |  06:SCAN KUDU               1    3.084ms    3.084ms    3.51K          -1  276.00 KB        1.50 MB  rtl_ods_test.test7 t7  29:HASH JOIN                  1    6.471ms    6.471ms   39.38K          -1   38.19 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--52:EXCHANGE                1  112.333us  112.333us    3.51K          -1  256.00 KB       89.97 KB  BROADCAST  |  F05:EXCHANGE SENDER        1    1.071ms    1.071ms                         4.57 KB              0  |  05:SCAN KUDU               1    3.184ms    3.184ms    3.51K          -1  276.00 KB        1.50 MB  rtl_ods_test.test6 t6  28:HASH JOIN                  1    9.355ms    9.355ms   39.38K          -1   37.36 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--51:EXCHANGE                1  812.178us  812.178us   31.34K          -1    7.87 MB      329.90 KB  BROADCAST  |  F04:EXCHANGE SENDER        1    9.485ms    9.485ms                         1.15 KB              0  |  04:SCAN KUDU               1   26.494ms   26.494ms   31.34K          -1    2.00 MB        7.12 MB  rtl_ods_test.test5 t5  27:HASH JOIN                  1    8.991ms    8.991ms   39.38K          -1   36.52 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--50:EXCHANGE                1  941.754us  941.754us   39.38K          -1    5.12 MB      266.92 KB  BROADCAST  |  F03:EXCHANGE SENDER        1    6.270ms    6.270ms                         1.47 KB              0  |  03:SCAN KUDU               1   16.042ms   16.042ms   39.38K          -1  950.00 KB        5.62 MB  rtl_ods_test.test4 t4  26:HASH JOIN                  1    8.908ms    8.908ms   39.38K          -1   35.69 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--49:EXCHANGE                1  783.992us  783.992us   39.38K          -1    5.12 MB      269.92 KB  BROADCAST  |  F02:EXCHANGE SENDER        1   14.110ms   14.110ms                         1.39 KB              0  |  02:SCAN KUDU               1    7.860ms    7.860ms   39.38K          -1    2.59 MB        3.75 MB  rtl_ods_test.test3 t3  25:HASH JOIN                  1   44.403ms   44.403ms   39.38K          -1   34.86 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST  |--48:EXCHANGE                1  980.101us  980.101us   39.38K          -1    5.12 MB      155.95 KB  BROADCAST  |  F01:EXCHANGE SENDER        1    7.181ms    7.181ms                         2.55 KB              0  |  01:SCAN KUDU               1    9.338ms    9.338ms   39.38K          -1    1.39 MB        2.25 MB  rtl_ods_test.test2 t2  00:SCAN KUDU                  1    6.483ms    6.483ms   39.38K          -1   16.66 MB       14.62 MB  rtl_ods_test.test1 t1

Summary部分,我們看到該SQLSCAN KUDUHASH JOIN操作都非常快,時間主要花費在了往Kudu寫數據的操作上,F00:KUDU WRITER的平均時間Avg Time和最大時間Max Time都為6m57s

3、F00Impala profile中一個Fragment的編號,接着我們通過搜索F00關鍵字找到Summary下面執行明細部分的Fragment

Fragment F00:        Instance 424107027ccceb5c:1f03c57400000018 (host=kudu07:22000):(Total: 7m5s, non-child: 0.000ns, % non-child: 0.00%)          Last report received time: 2020-02-10 15:07:54.309          Hdfs split stats (<volume id>:<# splits>/<split lengths>):          Fragment Instance Lifecycle Event Timeline: 7m5s             - Prepare Finished: 6.511ms (6.511ms)             - Open Finished: 7s632ms (7s626ms)             - First Batch Produced: 7s632ms (206.874us)             - First Batch Sent: 7s669ms (36.252ms)             - ExecInternal Finished: 7m5s (6m57s)           - MemoryUsage (8s000ms): 366.28 MB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.47 GB, 1.46 GB, 1.46 GB           - ThreadUsage (8s000ms): 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1           - AverageThreadTokens: 1.73           - BloomFilterBytes: 0           - ExchangeScanRatio: 0.00           - PeakMemoryUsage: 1.47 GB (1583111129)           - PeakReservation: 1.43 GB (1530920960)           - PeakUsedReservation: 0           - PerHostPeakMemUsage: 1.47 GB (1583111129)           - RowsProduced: 1.02M (1024494)           - TotalNetworkReceiveTime: 1s388ms           - TotalNetworkSendTime: 0.000ns           - TotalStorageWaitTime: 154.348ms           - TotalThreadsInvoluntaryContextSwitches: 995 (995)           - TotalThreadsTotalWallClockTime: 12m15s             - TotalThreadsSysTime: 3s400ms             - TotalThreadsUserTime: 47s997ms           - TotalThreadsVoluntaryContextSwitches: 12.47K (12471)          Buffer pool:             - AllocTime: 0.000ns             - CumulativeAllocationBytes: 0             - CumulativeAllocations: 0 (0)             - PeakReservation: 0             - PeakUnpinnedBytes: 0             - PeakUsedReservation: 0             - ReadIoBytes: 0             - ReadIoOps: 0 (0)             - ReadIoWaitTime: 0.000ns             - ReservationLimit: 0             - SystemAllocTime: 0.000ns             - WriteIoBytes: 0             - WriteIoOps: 0 (0)             - WriteIoWaitTime: 0.000ns          Fragment Instance Lifecycle Timings:             - ExecTime: 6m57s               - ExecTreeExecTime: 514.258ms             - OpenTime: 7s626ms               - ExecTreeOpenTime: 1s763ms             - PrepareTime: 6.340ms               - ExecTreePrepareTime: 3.879ms          KuduTableSink:(Total: 6m57s, non-child: 6m57s, % non-child: 100.00%)             - KuduApplyTimer: 6m27s             - NumRowErrors: 0 (0)             - PeakMemoryUsage: 20.06 MB (21037056)             - RowsProcessedRate: 2.46 K/sec             - TotalNumRows: 1.02M (1024494)          HASH_JOIN_NODE (id=47):(Total: 2s282ms, non-child: 55.813ms, % non-child: 2.23%)            ExecOption: Probe Side Codegen Enabled, Join Build-Side Prepared Asynchronously            Node Lifecycle Event Timeline: 7m5s               - Open Started: 5s867ms (5s867ms)               - Open Finished: 7s631ms (1s763ms)               - First Batch Requested: 7s632ms (1.524ms)               - First Batch Returned: 7s632ms (202.665us)               - Last Batch Returned: 7m5s (6m57s)               - Closed: 7m5s (178.375ms)             - BuildRows: 7.71K (7708)             - BuildTime: 39.969ms             - NumHashTableBuildsSkipped: 0 (0)             - PeakMemoryUsage: 45.76 MB (47980800)             - ProbeRows: 1.02M (1024494)             - ProbeRowsPartitioned: 0 (0)             - ProbeTime: 51.260ms             - RowsReturned: 1.02M (1024494)             - RowsReturnedRate: 448.83 K/sec

從這裡我們可以看到,這個FragmentHASH_JOIN操作共返回結果1024494條(RowsReturned: 1.02M (1024494)),接着Impala開始向Kudu寫入數據,向Kudu寫數據的profile信息如下:

KuduTableSink:(Total: 6m57s, non-child: 6m57s, % non-child: 100.00%)     - KuduApplyTimer: 6m27s     - NumRowErrors: 0 (0)     - PeakMemoryUsage: 20.06 MB (21037056)     - RowsProcessedRate: 2.46 K/sec     - TotalNumRows: 1.02M (1024494)

KuduTableSink中的信息表明,向Kudu寫入數據總花費時間6m57s(Total: 6m57s),內存使用最大值20.06 MB(PeakMemoryUsage: 20.06 MB),每秒處理數據2.46 K/sec(RowsProcessedRate: 2.46 K/sec),總寫入數據為1024494條(TotalNumRows: 1.02M (1024494))

Kudu是通過主鍵來判斷記錄是否重複的,它實現upsert操作的原理主要是通過判斷插入的記錄是否存在主鍵衝突來決定是插入還是更新,當出現主鍵衝突時進行更新操作,若無衝突則進行插入操作。我們將這些信息反饋給BI同學,他們查了SQL後發現這條SQL返回的結果大部分都是重複的,最終導致將結果集upsertKudu時,花費了很長時間。他們對同步任務的相關表調整後,該ETL任務的執行時間恢復正常。

KuduTableSink源碼分析

上面提到的KuduTableSinkImpalaKudu寫數據的一個類,該類的聲明如下:

class KuduTableSink : public DataSink

KuduTableSink類實現了DataSink接口,DataSink接口是Impala為不同sink操作定義的一個抽象接口,比如向HDFS表寫數據操作、 通過網絡發送數據操作、為join操作構建hash tables操作等等。該接口的實現關係如下圖:

KuduTableSink類的成員信息如下:

KuduTableSink定義了我們在上面的Fragment中看到的metric信息:

  • kudu_apply_timer_:花費在Kudu操作上的時間,在正常情況下,Apply()應該是可以忽略不計,因為它在啟用AUTO_FLUSH_BACKGROUND情況下是異步的,在Apply()中花費的大量時間可能表明,Kudu不能像sink寫行那樣快地緩衝和發送行
  • total_rows_:處理的總行數,包括寫入Kudu的行數和有錯誤的行數
  • num_row_errors_:有錯誤的行數
  • rows_processed_rate_sink消費和處理行的速率

除這些metric之外,KuduTableSink定義了操作Kudu的方法,這些方法主要遵循DataSink接口定義的方法實現標準,這些方法的作用如下:

  • virtual Status Prepare(RuntimeState* state, MemTracker* parent_mem_tracker):該方法在Send()、Open()、 Close()方法執行之前調用,主要為KuduTableSink準備上下文運行環境
  • virtual Status Open(RuntimeState* state):該方法在Send()方法之前調用,主要完成連接Kudu操作並創建KuduSession對象
  • virtual Status Send(RuntimeState* state, RowBatch* batch):該方法會被調用多次,主要完成往Kudu表寫數據操作
  • virtual Status FlushFinal(RuntimeState* state):該方法在close()方法之前調用,負責清空緩存中剩餘的數據,將這些數據刷新到Kudu
  • virtual void Close(RuntimeState* state):關閉KuduSession,釋放資源

Kudu提交數據有三種策略,Impala使用的是異步刷新模式向Kudu提交數據,KuduTableSinkOpen()方法中設置KuduSessionFlushModeAUTO_FLUSH_BACKGROUND

KUDU_RETURN_IF_ERROR(session_->SetFlushMode(kudu::client::KuduSession::AUTO_FLUSH_BACKGROUND), "Unable to set flush mode");

在這裡簡要說下三種Kudu提交數據策略的含義:

  • AUTO_FLUSH_SYNC:同步刷新模式。調用KuduSession.apply()方法後,客戶端會等數據刷新到服務器後再返回,這種情況就不能批量插入數據,調用KuduSession.flush()方法不會起任何作用,因為此時緩衝區數據已經被刷新到了服務器
  • AUTO_FLUSH_BACKGROUND:異步刷新模式。調用KuduSession.apply()方法後,客戶端會立即返回,但是寫入將在後台發送,可能與來自同一會話的其他寫入一起進行批處理。如果沒有足夠的緩衝空間,KuduSession.apply()會阻塞直到緩衝空間可用。因為寫入操作是在後台進行的,因此任何錯誤都將存儲在一個會話本地(session-local)緩衝區中,調用countPendingErrors()或者getPendingErrors()可以獲取錯誤相關的信息。注意:這種模式可能會導致數據寫入的時候亂序,這是因為在這種模式下,多個寫操作可以並行發送到服務器
  • MANUAL_FLUSH:手動刷新模式。調用KuduSession.apply()方法後,客戶端會立即返回,但是寫入請求不會被立即發送,需要我們手動調用KuduSession.flush()來發送寫入請求。如果緩衝區超過了配置的大小,會返回錯誤

除刷新方式設置外,還有以下參數會影響客戶端的寫入行為:

  • kudu_mutation_buffer_sizeKudu客戶端緩存操作數據的位元組數,KuduTableSink中定義的默認值為10MB,該參數通過KuduSessionSetMutationBufferSpace()方法設置。可以在impalad的啟動項中自定義kudu_mutation_buffer_size的大小
  • kudu_error_buffer_sizeKuduSession操作異常的buffer大小,KuduTableSink中定義的默認值為10MB,該參數通過KuduSessionSetErrorBufferSpace()方法設置。可以在impalad的啟動項中自定義kudu_error_buffer_size的大小
  • 觸發flush操作的緩存閾值:僅在AUTO_FLUSH_BACKGROUND刷新模式下生效。KuduTableSink中定義的默認值為70%。當緩存大小達到70%的時候,Kudu客戶端開始將緩存的數據發送給相應的tablet服務。Kudu客戶端定義的閾值為50%。該閾值通過KuduSessionSetMutationBufferFlushWatermark()方法設置
  • 每個KuduSession對象的最大緩存數:KuduTableSink將其設置為0,表示無限制。該參數通過KuduSessionSetMutationBufferMaxNum()方法設置

以上參數在KuduTableSinkOpen()方法中的代碼為:

  session_ = client_->NewSession();    session_->SetTimeoutMillis(FLAGS_kudu_operation_timeout_ms);      // KuduSession Set* methods here and below return a status for API compatibility.    // As long as the Kudu client is statically linked, these shouldn't fail and thus these    // calls could also DCHECK status is OK for debug builds (while still returning errors    // for release).    KUDU_RETURN_IF_ERROR(session_->SetFlushMode(        kudu::client::KuduSession::AUTO_FLUSH_BACKGROUND), "Unable to set flush mode");      const int32_t buf_size = FLAGS_kudu_mutation_buffer_size;    if (buf_size < 1024 * 1024) {      return Status(strings::Substitute(          "Invalid kudu_mutation_buffer_size: '$0'. Must be greater than 1MB.", buf_size));    }    KUDU_RETURN_IF_ERROR(session_->SetMutationBufferSpace(buf_size),        "Couldn't set mutation buffer size");      // Configure client memory used for buffering.    // Internally, the Kudu client keeps one or more buffers for writing operations. When a    // single buffer is flushed, it is locked (that space cannot be reused) until all    // operations within it complete, so it is important to have a number of buffers. In    // our testing, we found that allowing a total of 10MB of buffer space to provide good    // results; this is the default.  Then, because of some existing 8MB limits in Kudu, we    // want to have that total space broken up into 7MB buffers (INDIVIDUAL_BUFFER_SIZE).    // The mutation flush watermark is set to flush every INDIVIDUAL_BUFFER_SIZE.    // TODO: simplify/remove this logic when Kudu simplifies the API (KUDU-1808).    int num_buffers = FLAGS_kudu_mutation_buffer_size / INDIVIDUAL_BUFFER_SIZE;    if (num_buffers == 0) num_buffers = 1;    KUDU_RETURN_IF_ERROR(session_->SetMutationBufferFlushWatermark(1.0 / num_buffers),        "Couldn't set mutation buffer watermark");      // No limit on the buffer count since the settings above imply a max number of buffers.    // Note that the Kudu client API has a few too many knobs for configuring the size and    // number of these buffers; there are a few ways to accomplish similar behaviors.    KUDU_RETURN_IF_ERROR(session_->SetMutationBufferMaxNum(0),        "Couldn't set mutation buffer count");      KUDU_RETURN_IF_ERROR(session_->SetErrorBufferSpace(error_buffer_size),        "Failed to set error buffer space");

最後,在Send()FlushFinal()方法執行過程中,會調用CheckForErrors()方法檢查寫入是否有錯誤發生,並記錄錯誤信息和錯誤的行數(num_row_errors_)。

參考資料

  • 源碼鏈接:https://github.com/apache/impala/blob/3.2.0/be/src/exec/kudu-table-sink.cc
  • SessionConfiguration.FlushMode官方文檔
  • Kudu C++ client API