作者:马金友, 一名给 MySQL 找 bug 的初级 DBA。
爱可生开源社区出品,原创内容未经授权不得随意使用,转载请联系小编并注明来源。
本文约 4000 字,预计阅读需要 15 分钟。
在这篇文章中我会分享一个在 MySQL 8.0.35 版本中修复的一个宕机 bug,以及怎样通过错误日志、corefile 和 gdb 发现的这个 bug。
1背景介绍
一个客户的数据库(MySQL 8.0.27)随机性的崩溃。
通过错误日志我们可以看到是: 由一个 SELECT 查询导致的 Assertion failure。
query
Query (7f20ec0f4a78): SELECT a FROM t WHERE ('session' MEMBER OF (a) )
assertion failure
2023-12-15T22:57:12.408886Z 8 [ERROR] [MY-013183] [InnoDB] Assertion failure: row0mysql.cc:165:len < 256 * 256 thread 139781058020928
表结构
CREATE TABLE `t` (
`id` int,
`a` json DEFAULT NULL,
KEY `a` ((cast(`a` as char(255) array)))
) ENGINE=InnoDB
PARTITION BY HASH(id)
PARTITIONS 1;
2错误日志的分析
错误日志
2023-12-15T22:57:12.408886Z 8 [ERROR] [MY-013183] [InnoDB] Assertion failure: row0mysql.cc:165:len < 256 * 256 thread 139781058020928
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
22:57:12 UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x7f20ec000ed0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f21504f3d00 thread_stack 0x100000
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x2e) [0x1f7f4ae]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(handle_fatal_signal+0x323) [0x1029073]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f2167fee520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c) [0x7f21680429fc]
/lib/x86_64-linux-gnu/libc.so.6(raise+0x16) [0x7f2167fee476]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3) [0x7f2167fd47f3]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld() [0xd7f2bb]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld() [0x21702c8]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(row_sel_field_store_in_mysql_format_func(unsigned char*, mysql_row_templ_t const*, dict_index_t const*, unsigned char const*, unsigned long)+0x1a0) [0x219bf30]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(row_sel_store_mysql_rec(unsigned char*, row_prebuilt_t*, unsigned char const*, dtuple_t const*, bool, dict_index_t const*, dict_index_t const*, unsigned long const*, bool, lob::undo_vers_t*, mem_block_info_t*&)+0x1bc) [0x21a089c]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long)+0x37ed) [0x21a86bd]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(ha_innobase::index_read(unsigned char*, unsigned char const*, unsigned int, ha_rkey_function)+0x331) [0x20609c1]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(ha_innopart::index_read_map_in_part(unsigned int, unsigned char*, unsigned char const*, unsigned long, ha_rkey_function)+0x57) [0x207a7a7]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(Partition_helper::handle_unordered_scan_next_partition(unsigned char*)+0x11b) [0xdfef5b]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function)+0x2e1) [0x11310e1]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(RefIterator<false>::Read()+0xc1) [0xea4031]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(FilterIterator::Read()+0x14) [0x13ab924]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(Query_expression::ExecuteIteratorQuery(THD*)+0x3c3) [0xfaed93]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(Query_expression::execute(THD*)+0x2c) [0xfaef7c]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(Sql_cmd_dml::execute(THD*)+0x2c5) [0xf3f855]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(mysql_execute_command(THD*, bool)+0xac8) [0xee1e08]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(dispatch_sql_command(THD*, Parser_state*)+0x403) [0xee68f3]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x2089) [0xee8e49]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(do_command(THD*)+0x174) [0xee9d04]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld() [0x1019fa8]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld() [0x249dcec]
/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3) [0x7f2168040ac3]
/lib/x86_64-linux-gnu/libc.so.6(+0x126a40) [0x7f21680d2a40]
ubuntu
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f20ec0f4a78): SELECT a FROM t WHERE ('session' MEMBER OF (a) )
Connection ID (thread ID): 8
Status: NOT_KILLED
① 源码中的 Assertion failure
断言在文件 row0mysql.cc
中第 165 行。
2023-12-15T22:57:12.408886Z 8 [ERROR] [MY-013183] [InnoDB] Assertion failure: row0mysql.cc:165:len < 256 * 256 thread 139781058020928
当我们打开文件 ./innobase/row/row0mysql.cc
,我们可以找到对应的 ut_a(len < 256 * 256);
在函数 row_mysql_store_true_var_len
中。
/** Stores a >= 5.0.3 format true VARCHAR length to dest, in the MySQL row
format.
@return pointer to the data, we skip the 1 or 2 bytes at the start
that are used to store the len */
byte *row_mysql_store_true_var_len(
byte *dest, /*!< in: where to store */
ulint len, /*!< in: length, must fit in two bytes */
ulint lenlen) /*!< in: storage length of len: either 1 or 2 bytes */
{
if (lenlen == 2) {
ut_a(len < 256 * 256);
mach_write_to_2_little_endian(dest, len);
return (dest + 2);
}
ut_a(lenlen == 1);
ut_a(len < 256);
mach_write_to_1(dest, len);
return (dest + 1);
}
疑问:InnoDB 为什么会判断这个 len 要小于 256 * 256 呢?
通过注释我们可以看到,这是和 VARCHAR 长度相关。如果您对 MySQL 相对熟悉的话,应该能联想到 VARCHAR 的最大长度(当然我在分析问题的时候没有想到, 同事提示说应该是长度的问题)。
官网[1] 上有一句话:
The length can be specified as a value from 0 to 65,535. The effective maximum length of a VARCHAR is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used
重点是值的范围是 0 - 65,535,而 256 * 256 是 65,536。
这里可以理解为什么有了这个断言:因为 InnoDB 需要去确认所有的长度必须小于 65,536!如果不是,那么 InnoDB 只能自杀,进而导致宕机。
② 断言对应的宏 ut_a
当 EXPR
在 ut_a
宏中不是 TRUE,那么就调用 ut_dbg_assertion_failed
去自杀。
/** Abort execution if EXPR does not evaluate to nonzero.
@param EXPR assertion expression that should hold */
#define ut_a(EXPR) \
do { \
if (UNIV_UNLIKELY(!(ulint)(EXPR))) { \
ut_dbg_assertion_failed(#EXPR, __FILE__, (ulint)__LINE__); \
} \
} while (0)
在我们这个案例中,len < 256 * 256
为 FALSE。
③ 调用 ut_dbg_assertion_failed
在 ut_dbg_assertion_failed
,InnoDB 会打印一些宕机信息(也就是我们在错误日志中看到的)。在最后的阶段,调用 abort()
去结束进程。
/** Report a failed assertion.
@param[in] expr The failed assertion
@param[in] file Source file containing the assertion
@param[in] line Line number of the assertion */
[[noreturn]] void ut_dbg_assertion_failed(const char *expr, const char *file,
ulint line) {
#if !defined(UNIV_HOTBACKUP) && !defined(UNIV_NO_ERR_MSGS)
ib::error(ER_IB_MSG_1273)
<< "Assertion failure: " << innobase_basename(file) << ":" << line
<< ((expr != nullptr) ? ":" : "") << ((expr != nullptr) ? expr : "")
<< " thread " << to_string(std::this_thread::get_id());
flush_error_log_messages();
#else /* !UNIV_HOTBACKUP && !defined(UNIV_NO_ERR_MSGS) */
auto filename = base_name(file);
if (filename == nullptr) {
filename = "null";
}
fprintf(stderr,
"InnoDB: Assertion failure: %s:" ULINTPF
"%s%s\n"
"InnoDB: thread %s",
filename, line, expr != nullptr ? ":" : "",
expr != nullptr ? expr : "",
to_string(std::this_thread::get_id()).c_str());
#endif /* !UNIV_HOTBACKUP */
fputs(
"InnoDB: We intentionally generate a memory trap.\n"
"InnoDB: Submit a detailed bug report"
" to http://bugs.mysql.com.\n"
"InnoDB: If you get repeated assertion failures"
" or crashes, even\n"
"InnoDB: immediately after the mysqld startup, there may be\n"
"InnoDB: corruption in the InnoDB tablespace. Please refer to\n"
"InnoDB: " REFMAN
"forcing-innodb-recovery.html\n"
"InnoDB: about forcing recovery.\n",
stderr);
fflush(stderr);
fflush(stdout);
/* Call any registered callback function. */
if (assert_callback) {
assert_callback();
}
abort();
}
④ 调用 abort
这是在 man 手册中看到的关于 abort()
的介绍:
NAME
abort - cause abnormal process termination
SYNOPSIS
#include <stdlib.h>
void abort(void);
DESCRIPTION
The abort() function first unblocks the SIGABRT signal, and then raises that signal for the calling process (as though raise(3) was called). This results
in the abnormal termination of the process unless the SIGABRT signal is caught and the signal handler does not return (see longjmp(3)).
If the SIGABRT signal is ignored, or caught by a handler that returns, the abort() function will still terminate the process. It does this by restoring
the default disposition for SIGABRT and then raising the signal for a second time.
SIGABRT
信号对应的值为 6
。
shell> kill -l SIGABRT
6
⑤ signal 6
这也是为什么我们在错误日志中会看到 signal 6。
22:57:12 UTC - mysqld got signal 6 ;
⑥ backtrace
通过查询这个函数名 row_mysql_store_true_var_len
,我们看到有两个地方调用它。
row_sel_field_store_in_mysql_format_func
innobase_col_to_mysql
shell> grep -nr row_mysql_store_true_var_len .
./innobase/row/row0mysql.cc:159:byte *row_mysql_store_true_var_len(
./innobase/row/row0sel.cc:2605: row_mysql_store_true_var_len(dest, len, templ->mysql_length_bytes);
./innobase/handler/handler0alter.cc:1935: row_mysql_store_true_var_len(dest, len, flen - field->key_length());
./innobase/include/row0mysql.h:102:byte *row_mysql_store_true_var_len(
我们可以看到 row_sel_field_store_in_mysql_format_func
在错误日志中:
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3) [0x7f2167fd47f3]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld() [0xd7f2bb]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld() [0x21702c8]
/home/ubuntu/opt/mysql/8.0.27/bin/mysqld(row_sel_field_store_in_mysql_format_func(unsigned char*, mysql_row_templ_t const*, dict_index_t const*, unsigned char const*, unsigned long)+0x1a0) [0x219bf30]
这个函数调用了一个 relative address (0x21702c8)
,命令 addr2line
可以转换这个 relative address (0x21702c8)
到对应的函数名。
shell> addr2line -fie ~/opt/mysql/8.0.27/bin/mysqld 0x21702c8
_Z28row_mysql_store_true_var_lenPhmm
??:?
函数的调用关系为:
row_sel_field_store_in_mysql_format_func
-> row_mysql_store_true_var_len
-> ut_ad (ut_dbg_assertion_failed)
-> abort
3corefile 的分析
因为我们目前通过错误日志拿到的信息是有限的 (无法获取 len 的具体值),我们建议客户打开 corefile。
这里补充一个 MariaDB 关于怎样开启 corefile 的文档[2]。
① gdb 分析 corefile
客户随后将 corefile 发给了我们。
shell> gdb -ex "set print pretty on" -ex "set pagination off" -ex "set debuginfod enabled on" -ex "set print thread-events off" -ex "file /home/ubuntu/opt/mysql/8.0.27/bin/mysqld" -ex "core /home/ubuntu/corefiles/core.8027"
Core was generated by `/home/ubuntu/opt/mysql/8.0.27/bin/mysqld --defaults-file=/home/ubuntu/sandbox'.
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (no_tid=3, signo=6, threadid=140132373239360) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
[Current thread is 1 (Thread 0x7f731c541640 (LWP 1534610))]
客户的 MySQL 是通过二进制包安装的,所以 mysqld 中自带了
debuginfo
。
② 具体的 backtrace
(gdb) bt
#0 __pthread_kill_implementation (no_tid=3, signo=6, threadid=140132373239360) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_esrch (threadid=140132373239360, signo=signo@entry=6) at ./nptl/pthread_kill.c:108
#2 0x0000000001f7e9e7 in my_write_core (sig=sig@entry=6) at ../../mysql-8.0.27/mysys/stacktrace.cc:294
#3 0x000000000102909d in handle_fatal_signal (sig=6) at ../../mysql-8.0.27/sql/signal_handler.cc:171
#4 <signal handler called>
#5 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140132373239360) at ./nptl/pthread_kill.c:44
#6 __pthread_kill_internal (signo=6, threadid=140132373239360) at ./nptl/pthread_kill.c:78
#7 __GI___pthread_kill (threadid=140132373239360, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#8 0x00007f7347bee476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#9 0x00007f7347bd47f3 in __GI_abort () at ./stdlib/abort.c:79
#10 0x0000000000d7f2bb in ut_dbg_assertion_failed (expr=expr@entry=0x279263f "len < 256 * 256", file=file@entry=0x2f55fa0 "../../../mysql-8.0.27/storage/innobase/row/row0mysql.cc", line=line@entry=165) at ../../../mysql-8.0.27/storage/innobase/ut/ut0dbg.cc:99
#11 0x00000000021702c8 in row_mysql_store_true_var_len (dest=<optimised out>, len=len@entry=4294967292, lenlen=<optimised out>) at ../../../mysql-8.0.27/storage/innobase/row/row0mysql.cc:165
#12 0x000000000219bf30 in row_sel_field_store_in_mysql_format_func (dest=<optimised out>, templ=0x7f72c80ecae0, index=<optimised out>, data=0x7f72c8191be8 "\020\034\031\310r\177", len=4294967292) at ../../../mysql-8.0.27/storage/innobase/row/row0sel.cc:2605
#13 0x00000000021a089c in row_sel_store_mysql_rec (mysql_rec=0x7f72c80da818 "\370\001", prebuilt=0x7f72c8194938, rec=0x7f72c817c39f "", vrow=0x7f72c8191b28, rec_clust=true, rec_index=0x7f72c8148878, prebuilt_index=0x7f72c80d93a8, offsets=0x7f731c53d940, clust_templ_for_sec=false, lob_undo=0x7f72c8194b58, blob_heap=@0x7f72c8194a98: 0x7f72c8aa5c30) at ../../../mysql-8.0.27/storage/innobase/row/row0sel.cc:3013
#14 0x00000000021a86bd in row_search_mvcc (buf=buf@entry=0x7f72c80da818 "\370\001", mode=mode@entry=PAGE_CUR_GE, prebuilt=0x7f72c8194938, match_mode=match_mode@entry=1, direction=direction@entry=0) at ../../../mysql-8.0.27/storage/innobase/include/row0mysql.h:875
#15 0x00000000020609c1 in ha_innobase::index_read (this=0x7f72c8189938, buf=0x7f72c80da818 "\370\001", key_ptr=<optimised out>, key_len=<optimised out>, find_flag=<optimised out>) at ../../../mysql-8.0.27/storage/innobase/handler/ha_innodb.cc:10069
#16 0x000000000207a7a7 in handler::index_read_map (find_flag=HA_READ_KEY_EXACT, keypart_map=1, key=0x7f72c816f6c0 "", buf=0x7f72c80da818 "\370\001", this=0x7f72c8189938) at ../../../mysql-8.0.27/sql/handler.h:5078
#17 ha_innopart::index_read_map_in_part (this=0x7f72c8189938, part=0, record=0x7f72c80da818 "\370\001", key=0x7f72c816f6c0 "", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at ../../../mysql-8.0.27/storage/innobase/handler/ha_innopart.cc:1894
#18 0x0000000000dfef5b in Partition_helper::handle_unordered_scan_next_partition (this=0x7f72c818afc8, buf=0x7f72c80da818 "\370\001") at ../../mysql-8.0.27/sql/partitioning/partition_handler.cc:2364
#19 0x00000000011310e1 in handler::ha_index_read_map (this=0x7f72c8189938, buf=0x7f72c80da818 "\370\001", key=0x7f72c816f6c0 "", keypart_map=1, find_flag=find_flag@entry=HA_READ_KEY_EXACT) at ../../mysql-8.0.27/sql/handler.cc:3253
#20 0x0000000000ea4031 in RefIterator<false>::Read (this=0x7f72c8170398) at ../../mysql-8.0.27/sql/row_iterator.h:208
#21 0x00000000013ab924 in FilterIterator::Read (this=0x7f72c81703d8) at ../../mysql-8.0.27/sql/composite_iterators.cc:72
#22 FilterIterator::Read (this=0x7f72c81703d8) at ../../mysql-8.0.27/sql/composite_iterators.cc:70
#23 0x0000000000faed93 in Query_expression::ExecuteIteratorQuery (this=0x7f72c818dd18, thd=0x7f72c8000ed0) at ../../mysql-8.0.27/sql/sql_union.cc:1300
#24 0x0000000000faef7c in Query_expression::execute (this=<optimised out>, thd=0x7f72c8000ed0) at ../../mysql-8.0.27/sql/sql_union.cc:1353
#25 0x0000000000f3f855 in Sql_cmd_dml::execute (this=0x7f72c8190390, thd=0x7f72c8000ed0) at ../../mysql-8.0.27/sql/sql_select.cc:579
#26 0x0000000000ee1e08 in mysql_execute_command (thd=0x7f72c8000ed0, first_level=<optimised out>) at ../../mysql-8.0.27/sql/sql_parse.cc:4505
#27 0x0000000000ee68f3 in dispatch_sql_command (thd=thd@entry=0x7f72c8000ed0, parser_state=parser_state@entry=0x7f731c53ff10) at ../../mysql-8.0.27/sql/sql_parse.cc:5135
#28 0x0000000000ee8e49 in dispatch_command (thd=0x7f72c8000ed0, com_data=<optimised out>, command=COM_QUERY) at ../../mysql-8.0.27/sql/sql_parse.cc:1922
#29 0x0000000000ee9d04 in do_command (thd=thd@entry=0x7f72c8000ed0) at ../../mysql-8.0.27/sql/sql_parse.cc:1343
#30 0x0000000001019fa8 in handle_connection (arg=arg@entry=0x703d4b0) at ../../mysql-8.0.27/sql/conn_handler/connection_handler_per_thread.cc:302
#31 0x000000000249dcec in pfs_spawn_thread (arg=0x71008f0) at ../../../mysql-8.0.27/storage/perfschema/pfs.cc:2946
#32 0x00007f7347c40ac3 in start_thread (arg=<optimised out>) at ./nptl/pthread_create.c:442
#33 0x00007f7347cd2a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
我们可以找到 row_mysql_store_true_var_len
在 frame 11:
(gdb) frame 11
#11 0x00000000021702c8 in row_mysql_store_true_var_len (dest=<optimised out>, len=len@entry=4294967292, lenlen=<optimised out>) at ../../../mysql-8.0.27/storage/innoba
se/row/row0mysql.cc:165
165 ../../../mysql-8.0.27/storage/innobase/row/row0mysql.cc: No such file or directory.
③ 正确显示源码
下载源码
shell> git clone https://github.com/mysql/mysql-server.git
shell> cd mysql-server/
shell> git tag --list | grep 8.0.27
mysql-8.0.27
mysql-cluster-8.0.27
shell> git checkout mysql-8.0.27
我们需要切换到 MySQL 8.0.27 这个分支中。
在 gdb 中映射目录
(gdb) set substitute-path ../../../mysql-8.0.27 /home/ubuntu/mysql-server
④ 分析 corefile
len 的值是多少?
在 row_mysql_store_true_var_len
中,len 是 4294967292
,这明显的大于了 限制范围 65,535!
(gdb) frame 11
#11 0x00000000021702c8 in row_mysql_store_true_var_len (dest=<optimised out>, len=len@entry=4294967292, lenlen=<optimised out>) at ../../../mysql-8.0.27/storage/innobase/row/row0mysql.cc:165
165 ut_a(len < 256 * 256);
(gdb) p len
$2 = 4294967292
(gdb) p len < 256 * 256
$3 = false
len 来自哪里?
在 frame 13 中我们看到 len
来自于 dfield->len
,但是 dfield
已经被优化掉了。
(gdb) frame 13
#13 0x00000000021a089c in row_sel_store_mysql_rec (mysql_rec=0x7f72c80da818 "\370\001", prebuilt=0x7f72c8194938, rec=0x7f72c817c39f "", vrow=0x7f72c8191b28, rec_clust=true, rec_index=0x7f72c8148878, prebuilt_index=0x7f72c80d93a8, offsets=0x7f731c53d940, clust_templ_for_sec=false, lob_undo=0x7f72c8194b58, blob_heap=@0x7f72c8194a98: 0x7f72c8aa5c30) at ../../../mysql-8.0.27/storage/innobase/row/row0sel.cc:3013
(gdb) list
3008 (byte)templ->mysql_null_bit_mask;
3009 memcpy(mysql_rec + templ->mysql_col_offset,
3010 (const byte *)prebuilt->default_rec + templ->mysql_col_offset,
3011 templ->mysql_col_len);
3012 } else {
3013 row_sel_field_store_in_mysql_format(
3014 mysql_rec + templ->mysql_col_offset, templ, rec_index,
3015 templ->clust_rec_field_no, (const byte *)dfield->data, dfield->len,
3016 ULINT_UNDEFINED);
3017 if (templ->mysql_null_bit_mask) {
(gdb) p dfield
$7 = <optimised out>
dfield 又在哪里?
原来在 frame 13 的 row_sel_store_mysql_rec
中,我们找到了。
const auto dfield = dtuple_get_nth_v_field(vrow, col->v_pos);
这个 dfield
是在 dtuple_get_nth_v_field
中通过 vrow
和 col->v_pos
计算出来的.。我们能找到 vrow
,但同样 col
被优化掉了。
(gdb) p vrow
$8 = (const dtuple_t *) 0x7f72c8191b28
(gdb) p col
$9 = <optimised out>
col 是多少?
上面的 col
是 the dict_table_get_nth_v_col
通过 rec_index->table
和 templ->clust_rec_field_no
计算得出。
dict_v_col_t *col;
col =
dict_table_get_nth_v_col(rec_index->table, templ->clust_rec_field_no);
我们看下 dict_table_get_nth_v_col
中的具体内容:
/** Gets the nth virtual column of a table.
@param[in] table table
@param[in] pos position of virtual column
@return pointer to virtual column object */
static inline dict_v_col_t *dict_table_get_nth_v_col(const dict_table_t *table,
ulint pos) {
ut_ad(table);
ut_ad(pos < table->n_v_def);
ut_ad(table->magic_n == DICT_TABLE_MAGIC_N);
return (static_cast<dict_v_col_t *>(table->v_cols) + pos);
}
#endif /* UNIV_DEBUG */
col
可以计算得出:
(gdb) p rec_index->table->v_cols + 0
$17 = (dict_v_col_t *) 0x7f72c8148140
算出 dfield
有了 col
我们就可以算出 dfield
:
const auto dfield = dtuple_get_nth_v_field(vrow, col->v_pos);
dtuple_get_nth_v_field
的定义:
/** Gets nth virtual field of a tuple.
@param[in] tuple tuple
@oaran[in] n the nth field to get
@return nth field */
static inline dfield_t *dtuple_get_nth_v_field(const dtuple_t *tuple, ulint n) {
ut_ad(tuple);
ut_ad(n < tuple->n_v_fields);
return (static_cast<dfield_t *>(tuple->v_fields + n));
}
#endif /* UNIV_DEBUG */
dfield
是 0x7f72c8191bd0
。
(gdb) p vrow->v_fields + (rec_index->table->v_cols + 0)->v_pos
$18 = (dfield_t *) 0x7f72c8191bd0
(gdb) p *(vrow->v_fields + (rec_index->table->v_cols + 0)->v_pos)
$19 = {
data = 0x7f72c8191be8,
ext = 0,
spatial_status = 0,
len = 4294967292,
type = {
prtype = 20279311,
mtype = 12,
len = 1020,
mbminmaxlen = 21
}
}
who set the len
to 4294967292
最终要的一部分就要分析为什么 lento
4294967292。
通过搜索 dtuple_get_nth_v_field
, 我们看到相关文件如下:
shell> grep -nr 'dfield = dtuple_get_nth_v_field' .
./row/row0mysql.cc:578: dfield = dtuple_get_nth_v_field(row, n_v_col);
./row/row0row.cc:117: dfield = dtuple_get_nth_v_field(entry, i - entry_len);
./row/row0sel.cc:2988: const auto dfield = dtuple_get_nth_v_field(vrow, col->v_pos);
./row/row0sel.cc:4242: dfield_t *dfield = dtuple_get_nth_v_field(*vrow, vcol->v_pos);
./row/row0upd.cc:917: dfield = dtuple_get_nth_v_field(entry, i);
./row/row0upd.cc:1224: dfield_t *dfield = dtuple_get_nth_v_field(row, vcol->v_pos);
./row/row0upd.cc:1270: dfield = dtuple_get_nth_v_field(row, col_no);
./row/row0upd.cc:1324: dfield = dtuple_get_nth_v_field(row, vcol->v_pos);
./row/row0upd.cc:1508: dfield = dtuple_get_nth_v_field(row, vcol->v_pos);
./row/row0upd.cc:1856: dfield_t *dfield = dtuple_get_nth_v_field(node->row, col_no);
./include/data0data.ic:457: dfield_t *dfield = dtuple_get_nth_v_field(vrow, i);
./include/data0data.ic:470: dfield_t *dfield = dtuple_get_nth_v_field(vrow, i);
./dict/dict0dict.cc:2892: dfield_t *dfield = dtuple_get_nth_v_field(tuple, i);
./trx/trx0rec.cc:2001: dfield = dtuple_get_nth_v_field(*row, vcol->v_pos);
./trx/trx0rec.cc:2729: dfield = dtuple_get_nth_v_field(row, col_no);
通过分析所有的调用 dtuple_get_nth_v_field
,我们最终可以看到一个函数 trx_undo_rec_get_partial_row
在文件 trx/trx0rec.cc
的第 2001 行。
if (is_virtual) {
ptr = trx_undo_read_v_idx(index->table, ptr, first_v_col, &is_undo_log,
&field_no);
first_v_col = false;
if (field_no != ULINT_UNDEFINED) {
vcol = dict_table_get_nth_v_col(index->table, field_no);
col = &vcol->m_col;
col_no = dict_col_get_no(col);
dfield = dtuple_get_nth_v_field(*row, vcol->v_pos);
vcol->m_col.copy_type(dfield_get_type(dfield));
}
}
if ((vcol != nullptr && vcol->m_col.is_multi_value()) ||
trx_undo_rec_is_multi_value(ptr)) {
ut_ad(is_virtual);
ut_ad(vcol != nullptr || field_no == ULINT_UNDEFINED);
ut_ad(dfield != nullptr || field_no == ULINT_UNDEFINED);
ptr = trx_undo_rec_get_multi_value(ptr, dfield, heap);
continue;
} else {
ptr = trx_undo_rec_get_col_val(ptr, &field, &len, &orig_len);
}
trx_undo_rec_get_multi_value
调用 Multi_value_logger::read
去修改了 dfield
。
/** Read from an undo log record of a multi-value virtual column.
@param[in] ptr pointer to remaining part of the undo record
@param[in,out] field stored field, nullptr if the col is no longer
indexed or existing, in the latter case,
this function will only skip the log
@param[in,out] heap memory heap
@return remaining part of undo log record after reading these values */
const byte *trx_undo_rec_get_multi_value(const byte *ptr, dfield_t *field,
mem_heap_t *heap) {
if (field == nullptr) {
return (ptr + Multi_value_logger::read_log_len(ptr));
}
return (Multi_value_logger::read(ptr, field, heap));
}
Multi_value_logger::read
方法最终修改了 defield->len
为 UNIV_MULTI_VALUE_ARRAY_MARKER
。
const byte *Multi_value_logger::read(const byte *ptr, dfield_t *field,
mem_heap_t *heap) {
ut_ad(is_multi_value_log(ptr));
++ptr;
uint32_t total_len = mach_read_from_2(ptr);
const byte *old_ptr = ptr;
ptr += s_log_length_for_null_or_empty;
if (total_len == s_multi_value_null) {
dfield_set_null(field);
return (ptr);
} else if (total_len == s_multi_value_no_index_value) {
dfield_set_data(field, nullptr, UNIV_NO_INDEX_VALUE);
return (ptr);
}
uint32_t num = mach_read_next_compressed(&ptr);
field->data = mem_heap_alloc(heap, sizeof(multi_value_data));
field->len = UNIV_MULTI_VALUE_ARRAY_MARKER;
multi_value_data *multi_val = static_cast<multi_value_data *>(field->data);
multi_val->num_v = num;
multi_val->alloc(num, false, heap);
for (uint32_t i = 0; i < num; ++i) {
uint32_t len = mach_read_next_compressed(&ptr);
multi_val->datap[i] = ptr;
multi_val->data_len[i] = len;
ptr += len;
}
if (ptr < old_ptr + total_len) {
multi_val->alloc_bitset(heap);
multi_val->bitset->copy(ptr, UT_BITS_IN_BYTES(num));
ptr += UT_BITS_IN_BYTES(num);
}
ut_ad(ptr == old_ptr + total_len);
return (ptr);
}
我们可以看到 UNIV_MULTI_VALUE_ARRAY_MARKER
是 4294967292
。
/** The follwoing number as the length marker of a logical field, which
is only used for multi-value field data, means the data itself of the
field is actually an array. */
const uint32_t UNIV_MULTI_VALUE_ARRAY_MARKER = UNIV_NO_INDEX_VALUE - 1;
(gdb) p UNIV_MULTI_VALUE_ARRAY_MARKER
$23 = 4294967292
4故障复现
到目前为止,我们已经清楚了这个 len 为什么是 4294967292
。下面就要去复现这个 bug 导致的故障。
函数 trx_undo_rec_get_multi_value
会从 Undo Log record 中构建多字段虚拟列 Multi-Value Virtual Column。
我们需要想办法让 MySQL 从 Undo Log 中读取数据。
创建表结构并加入数据。
CREATE TABLE `t` (
`id` int,
`a` json DEFAULT NULL,
KEY `a` ((cast(`a` as char(255) array)))
) ENGINE=InnoDB
PARTITION BY HASH(id)
PARTITIONS 1;
insert into t (id, a) values
( 1,
JSON_ARRAY("session","1")
);
在 session1 中保持一个 REPEATABLE-READ 的事务。
--- session 1 ---
set transaction_isolation = 'REPEATABLE-READ';
begin;
select count(*) from t;
在 session 2 中更新数据。
update t set a = JSON_ARRAY('session', '2') where id = 1;
回到 session 1 中查询数据。那么 InnoDB 必定利用 Undo 保证事务的一致性。
SELECT a FROM t WHERE ('session' MEMBER OF (a) );
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/tmp/mysql_sandbox8028.sock' (111)
ERROR:
Can't connect to the server
成功复现!
5反馈 Bug
这个问题已经在 2023 年 7 月份报告给了 Oracle 官方。Oracle 官方回复他们已经修复了这个问题。
Fixed in the upcoming 8.0.35, 8.2.0 under the heading of Bug#35039937 Multi-value virtual columns should not be materialized
在 2023 年 10 月份,MySQL 8.0.35 修复了这个 bug[3]。
参考资料
官方对于 len 的解释: https://dev.mysql.com/doc/refman/8.0/en/char.html
[2]MariaDB 开启 corefile: https://mariadb.com/kb/en/enabling-core-dumps
[3]Bug 的修复: https://github.com/mysql/mysql-server/commit/05c1f53be7cb8d9b527eb28e809284b0722c4106
本文关键字:#MySQL# #虚拟列# #源码#
✨ Github:https://github.com/actiontech/sqle
📚 文档:https://actiontech.github.io/sqle-docs/
💻 官网:https://opensource.actionsky.com/sqle/
👥 微信群:请添加小助手加入 ActionOpenSource
🔗 商业支持:https://www.actionsky.com/sqle
扫描下方的微信扫描小程序码,进行在线咨询预约:
此外,您也可以直接联系我们的商业支持团队获取更多信息,联系方式如下:
400-820-6580 / 13916131869 / 18930110869