Following "eBPF Talk: Ten months to fix yet another tailcall bug", here is one more win: another
tailcall-related bug has been fixed:
bpf: Fix updating attached freplace prog to prog_array map[1]
Impact of the bug
This bug causes a kernel panic:
[309049.036402] BUG: kernel NULL pointer dereference, address: 0000000000000004
[309049.036419] #PF: supervisor read access in kernel mode
[309049.036426] #PF: error_code(0x0000) - not-present page
[309049.036432] PGD 0 P4D 0
[309049.036437] Oops: 0000 [#1] PREEMPT SMP NOPTI
[309049.036444] CPU: 2 PID: 788148 Comm: test_progs Not tainted 6.8.0-31-generic #31-Ubuntu
[309049.036465] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
[309049.036477] RIP: 0010:bpf_prog_map_compatible+0x2a/0x140
[309049.036488] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 44 8b 6e 04 48 89 f3 41 83 fd 1c 75 0c 48 8b 46 38 48 8b 40 70 <44> 8b 68 04 f6 43 03 01 75 1c 48 8b 43 38 44 0f b6 a0 89 00 00 00
[309049.036505] RSP: 0018:ffffb2e080fd7ce0 EFLAGS: 00010246
[309049.036513] RAX: 0000000000000000 RBX: ffffb2e0807c1000 RCX: 0000000000000000
[309049.036521] RDX: 0000000000000000 RSI: ffffb2e0807c1000 RDI: ffff990290259e00
[309049.036528] RBP: ffffb2e080fd7d08 R08: 0000000000000000 R09: 0000000000000000
[309049.036536] R10: 0000000000000000 R11: 0000000000000000 R12: ffff990290259e00
[309049.036543] R13: 000000000000001c R14: ffff990290259e00 R15: ffff99028e29c400
[309049.036551] FS: 00007b82cbc28140(0000) GS:ffff9903b3f00000(0000) knlGS:0000000000000000
[309049.036559] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[309049.036566] CR2: 0000000000000004 CR3: 0000000101286002 CR4: 00000000003706f0
[309049.036573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[309049.036581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[309049.036588] Call Trace:
[309049.036592] <TASK>
[309049.036597] ? show_regs+0x6d/0x80
[309049.036604] ? __die+0x24/0x80
[309049.036619] ? page_fault_oops+0x99/0x1b0
[309049.036628] ? do_user_addr_fault+0x2ee/0x6b0
[309049.036634] ? exc_page_fault+0x83/0x1b0
[309049.036641] ? asm_exc_page_fault+0x27/0x30
[309049.036649] ? bpf_prog_map_compatible+0x2a/0x140
[309049.036656] prog_fd_array_get_ptr+0x2c/0x70
[309049.036664] bpf_fd_array_map_update_elem+0x37/0x130
[309049.036671] bpf_map_update_value+0x1d3/0x260
[309049.036677] map_update_elem+0x1fa/0x360
[309049.036683] __sys_bpf+0x54c/0xa10
[309049.036689] __x64_sys_bpf+0x1a/0x30
[309049.036694] x64_sys_call+0x1936/0x25c0
[309049.036700] do_syscall_64+0x7f/0x180
[309049.036706] ? do_syscall_64+0x8c/0x180
[309049.036712] ? do_syscall_64+0x8c/0x180
[309049.036717] ? irqentry_exit+0x43/0x50
[309049.036723] ? common_interrupt+0x54/0xb0
[309049.036729] entry_SYSCALL_64_after_hwframe+0x73/0x7b
Root cause analysis
The bug is caused by the combination of the following two commits:
bpf: Move prog->aux->linked_prog and trampoline into bpf_link on attach[2], since the 5.17 kernel.
bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT[3], since the 5.18 kernel.
In the commit "bpf: Move prog->aux->linked_prog and trampoline into bpf_link on attach",
once a freplace prog has been attached, prog->aux->dst_prog is set to NULL.
However, in the commit "bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT",
if prog->type == BPF_PROG_TYPE_EXT, then prog->aux->dst_prog->type is treated as the type of
the current prog, and dst_prog is dereferenced without a NULL check.
As a result, adding a freplace prog to a prog_array map after it has been attached dereferences the NULL dst_prog and triggers the panic shown above.
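The interaction of the two commits can be sketched with a small user-space model. This is only an illustration under stated assumptions: the structs are cut-down stand-ins rather than the real kernel layout, and the `model_*` names are hypothetical. Instead of actually dereferencing NULL, the pre-fix logic returns false at the point where the kernel would fault. This also lines up with the oops above: R13 holds 0x1c, which is 28, the value of BPF_PROG_TYPE_EXT, while RAX is 0 and the faulting address in CR2 is 0x4, consistent with reading the type field through a NULL dst_prog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (not the real layout). */
enum bpf_prog_type {
	BPF_PROG_TYPE_XDP = 6,
	BPF_PROG_TYPE_EXT = 28,
};

struct bpf_prog;

struct bpf_prog_aux {
	struct bpf_prog *dst_prog; /* target program, cleared on attach */
};

struct bpf_prog {
	enum bpf_prog_type type;
	struct bpf_prog_aux *aux;
};

/* Hypothetical helper: models the effect of the first commit on attach. */
static void model_freplace_attach(struct bpf_prog *prog)
{
	prog->aux->dst_prog = NULL;
}

/*
 * Hypothetical helper: pre-fix resolution as described in the text.
 * Returns false where the kernel would dereference a NULL dst_prog.
 */
static bool model_resolve_prog_type_old(const struct bpf_prog *prog,
					enum bpf_prog_type *out)
{
	if (prog->type == BPF_PROG_TYPE_EXT) {
		if (!prog->aux->dst_prog)
			return false; /* kernel: NULL pointer dereference */
		*out = prog->aux->dst_prog->type;
		return true;
	}
	*out = prog->type;
	return true;
}
```

Before `model_freplace_attach()` runs, an EXT program resolves to its target's type; afterwards the same call reports the crash condition.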
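Fixing the bug
The fix is not difficult: change resolve_prog_type() so that it reads the saved target type
instead of going through dst_prog, as follows: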
static inline enum bpf_prog_type resolve_prog_type(const struct bpf_prog *prog)
{
	return (prog->type == BPF_PROG_TYPE_EXT && prog->aux->saved_dst_prog_type) ?
		prog->aux->saved_dst_prog_type : prog->type;
}
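To see why this version no longer depends on dst_prog, here is a user-space sketch of the same expression. It assumes simplified stand-in structs, a hypothetical `model_resolve_prog_type_fixed()` name, and that saved_dst_prog_type was recorded earlier (for example while the program was loaded); a value of 0 (BPF_PROG_TYPE_UNSPEC) is treated as "not recorded".

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (not the real layout). */
enum bpf_prog_type {
	BPF_PROG_TYPE_UNSPEC = 0,
	BPF_PROG_TYPE_XDP = 6,
	BPF_PROG_TYPE_EXT = 28,
};

struct bpf_prog;

struct bpf_prog_aux {
	struct bpf_prog *dst_prog;              /* may be NULL after attach */
	enum bpf_prog_type saved_dst_prog_type; /* 0 means "not recorded" */
};

struct bpf_prog {
	enum bpf_prog_type type;
	struct bpf_prog_aux *aux;
};

/* Hypothetical name; mirrors the fixed expression and never reads dst_prog. */
static enum bpf_prog_type
model_resolve_prog_type_fixed(const struct bpf_prog *prog)
{
	return (prog->type == BPF_PROG_TYPE_EXT &&
		prog->aux->saved_dst_prog_type) ?
		prog->aux->saved_dst_prog_type : prog->type;
}
```

With dst_prog set to NULL, an EXT program whose saved type is BPF_PROG_TYPE_XDP still resolves to XDP, and an EXT program with no saved type falls back to its own type.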
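Summary
The patch that fixes this bug has been merged into the 6.11 kernel.
In other words, if you use freplace+tailcall on common kernel versions such as 6.1/6.6/6.8/6.10, you need to work around this bug: add the freplace program to the prog_array map first, and only then attach it.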
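The ordering constraint of the workaround can be sketched as a tiny state model. Everything here is hypothetical and for illustration only: `struct model_freplace` and the `model_*` functions are not kernel or libbpf APIs, and the -1 return stands in for the point where an unpatched kernel would panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one freplace program on an unpatched kernel. */
struct model_freplace {
	bool attached;      /* set by attach; dst_prog is NULL from then on */
	bool in_prog_array; /* set once the program sits in a prog_array slot */
};

/* Returns -1 where the unpatched kernel would hit the NULL dereference. */
static int model_prog_array_update(struct model_freplace *p)
{
	if (p->attached)
		return -1;
	p->in_prog_array = true;
	return 0;
}

static int model_attach(struct model_freplace *p)
{
	p->attached = true;
	return 0;
}

/* Workaround order from the text: map update first, attach second. */
static int model_setup_safe(struct model_freplace *p)
{
	int err = model_prog_array_update(p);

	if (err)
		return err;
	return model_attach(p);
}
```

In a real loader, the same order would mean performing the prog_array map update for the freplace program before creating its freplace link.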
[1]bpf: Fix updating attached freplace prog to prog_array map: https://lore.kernel.org/bpf/20240728114612.48486-1-leon.hwang@linux.dev/
[2]bpf: Move prog->aux->linked_prog and trampoline into bpf_link on attach: https://github.com/kernel-patches/bpf/commit/3aac1ead5eb6b76f
[3]bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT: https://github.com/kernel-patches/bpf/commit/4a9c7bbe2ed4