5.7.17版本mysqlbinlog實時拉取的二進制日誌不完整的原因分析
問題描述:
同事使用mysqlbinlog工具的–read-from-remote-server –raw選項,從遠程實例實時拉取二進制日誌時,發現得到的二進制日誌文件大小與遠程實例上的源文件大小不相同,並且使用mysqlbinlog解析時會報錯。
測試環境版本信息如下:
MySQL版本:5.7.17 log MySQL Community Server (GPL) 通用tar包安裝
Mysqlbinlog版本:5.7.17 自帶版本,mysqlbinlog Ver 3.4 for linux-glibc2.5 at x86_64
操作系統版本:CentOS Linux release 7.6.1810 (Core)
分析過程:
1、對比拉取到的二進制日誌文件、源文件的大小,發現拉取到的二進制日誌文件比源文件小,並且該文件大小:7315456/4096,剛好可以整除,源庫上進行多次更新後,多次觀察,發現該數值都可以被4096整除。
因為Linux塊大小是4096 Bytes,所以,我先做個假設:mysqlbinlog工具以4096位元組為單位進行寫入
[root@192_168_129_128 tmp]# ll /usr/local/mysql-5.7.17-linux-glibc2.5-x86_64/data/mybinlog.000006 ; ll mybinlog.000006 -rw-r----- 1 mysql mysql 7318723 May 12 16:55 /usr/local/mysql-5.7.17-linux-glibc2.5-x86_64/data/mybinlog.000006 -rw-r----- 1 root root 7315456 May 12 16:55 mybinlog.000006
2、我在8.0.19版本中測試,問題卻不能復現。
3、寫一個簡單的shell腳本,將mysqlbinlog拉取進程放到後台執行,獲取到上一步的pid之後,使用pstack命令和strace命令,分別查看該進程的函數調用和系統調用
#!/bin/bash nohup mysqlbinlog --read-from-remote-server --host=192.168.129.128 --user=dba --password=123456 --raw --to-last-log binlog.000002 --stop-never & pid=`ps -ef | grep mysqlbinlog | grep -v 'grep' | awk '{print $2}'` pstack $pid > pstack.log strace -f -t -p $pid -o strace.log
pstack命令的輸出結果如下,雖然沒有直接的線索,但是後續可以利用這些函數棧名,去源碼中找線索:
#0 0x00007faf58f60a2d in recv () from /lib64/libpthread.so.0 #1 0x000000000045e9f9 in inline_mysql_socket_recv (flags=0, n=5826, buf=0x1d959d1, mysql_socket=..., src_line=123, src_file=0x550918 "/export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/vio/viosocket.c") at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/include/mysql/psi/mysql_socket.h:823 #2 vio_read (vio=0x1d90e60, buf=0x1d959d1 "27399199-10542799900-96491182343-85303866742-80196460272-89060617578-51177070778-10421134155;80308169113-21291571753-18876715410-91134905277-85771492482\360j\003", size=5826) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/vio/viosocket.c:123 #3 0x0000000000434873 in net_read_raw_loop (count=5826, net=0x1d8e550) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/sql/net_serv.cc:672 #4 net_read_packet (net=0x1d8e550, complen=0x7fff6d023f88) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/sql/net_serv.cc:859 #5 0x0000000000434a7c in my_net_read (net=0x1d8e550) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/sql/net_serv.cc:899 #6 0x000000000043985b in cli_safe_read_with_ok (mysql=0x1d8e550, parse_ok=0 '\000', is_data_packet=0x0) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/sql-common/client.c:1040 #7 0x0000000000426b36 in dump_remote_log_entries (logname=0x7fff6d0277e9 "binlog.000002", print_event_info=0x7fff6d0250f0) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/client/mysqlbinlog.cc:2664 #8 dump_single_log (logname=0x7fff6d0277e9 "binlog.000002", print_event_info=0x7fff6d0250f0) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/client/mysqlbinlog.cc:2334 #9 dump_multiple_logs (argc=1, argv=<optimized out>) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/client/mysqlbinlog.cc:2374 #10 0x0000000000427821 in main (argc=1, argv=0x1d58468) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/client/mysqlbinlog.cc:3436
strace命令的輸出結果如下:
....................省略若干行.................... 20528 11:24:25 recvfrom(3, "6813690-49945547265-44609719076-"..., 6605, 0, NULL, NULL) = 6605 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3424256 20528 11:24:25 write(4, "553290165-79173089006-0778194955"..., 4096) = 4096 20528 11:24:25 write(4, "2\360E\t\0\0p\3\0\0w71247594556-464035229"..., 4096) = 4096 20528 11:24:25 recvfrom(3, " \0\0\211\0K\200\266^\20/\201\0\0\37\0\0\0\320d4\0\0\0\21\1\0\0\0\0\0\0"..., 16384, 0, NULL, NULL) = 1523 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 lseek(4, 0, SEEK_CUR) = 3432448 20528 11:24:25 recvfrom(3,
strace到這裡就沒有後續的輸出了。
其中,recvfrom系統調用,是經socket接收數據;lseek是一個用於改變讀寫一個文件時讀寫指針位置的一個系統調用;write把buf中nbyte寫入文件描述符handle所指的文檔,成功時返回寫的位元組數,錯誤時返回-1.
分析strace日誌中的多條write記錄,發現每次write的寫入位元組數都是4096,看起來與前面的假設吻合。
但是,每達到4096位元組才寫入磁盤,那麼拉取到的binlog幾乎不可能與源庫一致。而mysqlbinlog的help顯示,–read-from-remote-server參數,實質上是調用MySQL Server的BINLOG-DUMP-NON-GTIDS接口,遠程實例應該是實時發送binlog events才對。
-R, --read-from-remote-server Read binary logs from a MySQL server. This is an alias for read-from-remote-master=BINLOG-DUMP-NON-GTIDS.
好吧,從pstack日誌中得到的函數名,我們去源碼中逐個查看,最終與本問題相關的是client/mysqlbinlog.cc中的dump_remote_log_entries()函數,如下:
/** Requests binlog dump from a remote server and prints the events it receives. @param[in,out] print_event_info Parameters and context state determining how to print. @param[in] logname Name of input binlog. @retval ERROR_STOP An error occurred - the program should terminate. @retval OK_CONTINUE No error, the program should continue. @retval OK_STOP No error, but the end of the specified range of events to process has been reached and the program should terminate. */ static Exit_status dump_remote_log_entries(PRINT_EVENT_INFO *print_event_info, const char* logname) { ....................省略若干行.................... for (;;) { const char *error_msg= NULL; Log_event *ev= NULL; Log_event_type type= binary_log::UNKNOWN_EVENT; len= cli_safe_read(mysql, NULL); if (len == packet_error) { error("Got error reading packet from server: %s", mysql_error(mysql)); DBUG_RETURN(ERROR_STOP); } if (len < 8 && net->read_pos[0] == 254) break; // end of data DBUG_PRINT("info",( "len: %lu net->read_pos[5]: %d\n", len, net->read_pos[5])); /* In raw mode We only need the full event details if it is a ROTATE_EVENT or FORMAT_DESCRIPTION_EVENT */ ....................省略若干行.................... if (raw_mode || (type != binary_log::LOAD_EVENT)) { ....................省略若干行.................... if (raw_mode) { DBUG_EXECUTE_IF("simulate_result_file_write_error", DBUG_SET("+d,simulate_fwrite_error");); if (my_fwrite(result_file, net->read_pos + 1 , len - 1, MYF(MY_NABP))) /*可以看到,cli_safe_read()方法讀取到的binlog event,都用調用my_fwrite函數進行寫入,my_fwrite是對fwrite()系統調用的封裝 這裡並沒有以4096位元組為單位寫入,而是讀多少就寫入多少 這就無法解釋了,代碼邏輯顯示每次拿到數據之後,都會寫入磁盤,為什麼實際上卻不是呢?*/ { error("Could not write into log file '%s'", log_file_name); retval= ERROR_STOP; } if (ev) reset_temp_buf_and_delete(ev); } ....................省略若干行.................... if (retval != OK_CONTINUE) DBUG_RETURN(retval); } else { ....................省略若干行.................... old_off+= len-1; } DBUG_RETURN(OK_CONTINUE); }
請看我用中文注釋在上述源碼中的分析。
代碼邏輯沒有問題,那就有可能是BUG了。。。
結論:
通過上述分析,懷疑遇到了BUG,於是去官方的Release Notes中查找,最終在//dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-19.html 中找到如下說明:
Replication:
mysqlbinlog, if invoked with the –raw option, does not flush the output file until the process terminates. But if also invoked with the –stop-never option,
the process never terminates, thus nothing is ever written to the output file. Now the output is flushed after each event. (Bug #24609402)
但是這個BUG在BUG system中找不到,應該是需要有權限的MOS賬號才能查看。