CVE-2021-46339: Assertion 'lit_is_valid_cesu8_string (string_p, string_size)' failed at jerryscript/jerry-core/ecma/base/ecma-helpers-string.c(ecma_new_ecma_string_from_utf8):371. · Issue #4935 · jerryscript-project

The assertion 'lit_is_valid_cesu8_string (string_p, string_size)' fails at /base/ecma-helpers-string.c(ecma_new_ecma_string_from_utf8) in JerryScript 3.0.0.

2 years ago

@FlydragonTy

JerryScript revision

Commit: a6ab5e9

Version: v3.0.0

Build platform

Ubuntu 18.04.5 LTS (Linux 4.19.128-microsoft-standard x86_64)

Ubuntu 18.04.5 LTS (Linux 5.4.0-44-generic x86_64)

Build steps

python ./tools/build.py --clean --debug --compile-flag=-fsanitize=address --compile-flag=-m32 --compile-flag=-g --strip=off --lto=off --logging=on --line-info=on --error-message=on --system-allocator=on --stack-limit=20

Test case

poc-as.txt

Execution steps & Output

$ ./jerryscript/build/bin/jerry poc.js

ICE: Assertion 'lit_is_valid_cesu8_string (string_p, string_size)' failed at jerryscript/jerry-core/ecma/base/ecma-helpers-string.c(ecma_new_ecma_string_from_utf8):371.
Error: ERR_FAILED_INTERNAL_ASSERTION
[1] abort jerry poc.js

Credits: Found by OWL337 team.

@ossy-szeged

@rerobika I think it is not a bug, but a feature. “𞸋” is encoded in UTF-8 as 0xF09EB88B, which is invalid in CESU-8. But of course we could raise a user-friendly error message instead of an assertion.
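For reference, a minimal Python sketch of the difference between the two encodings for this character. It is not JerryScript code; the helper name `utf8_to_cesu8` is made up for illustration. It shows that U+1EE0B takes 4 bytes in UTF-8 but 6 bytes in CESU-8, where each half of the UTF-16 surrogate pair is encoded as its own 3-byte sequence.

```python
def utf8_to_cesu8(text: str) -> bytes:
    """Encode text as CESU-8 (hypothetical helper, for illustration only).

    BMP characters keep their normal UTF-8 form; characters above U+FFFF
    are split into a UTF-16 surrogate pair and each surrogate is written
    as a separate 3-byte sequence.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            out += ch.encode("utf-8")
            continue
        cp -= 0x10000
        high = 0xD800 | (cp >> 10)    # high (lead) surrogate
        low = 0xDC00 | (cp & 0x3FF)   # low (trail) surrogate
        for unit in (high, low):
            out += bytes([
                0xE0 | (unit >> 12),
                0x80 | ((unit >> 6) & 0x3F),
                0x80 | (unit & 0x3F),
            ])
    return bytes(out)


ch = "\U0001EE0B"  # the character quoted in the comment above
print(ch.encode("utf-8").hex())   # f09eb88b
print(utf8_to_cesu8(ch).hex())    # eda0bbedb88b
```

The 4-byte form is what a source file contains; the 6-byte form is what a CESU-8 validity check would accept for the same character.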

@dbatyai

The issue is not with the “𞸋” character; all non-BMP characters are converted to CESU-8 encoding during parsing.
The problem is that the first character is in the Basic Multilingual Plane and should be encoded using 3 bytes, but it is encoded using 4 bytes in the input. This breaks the conversion logic, which always expects the CESU-8 equivalent of a 4-byte sequence to be 6 bytes long.
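A small Python sketch of the kind of input described here, under stated assumptions: the byte sequence below is an illustrative overlong encoding of U+20AC, not the actual bytes from the PoC, and the function names are hypothetical rather than JerryScript's. It shows that a 4-byte lead byte can carry a code point below U+10000, which has no surrogate pair and so no 6-byte CESU-8 form.

```python
def decode_four_byte(seq: bytes) -> int:
    """Decode a 4-byte UTF-8 style sequence with no range validation."""
    if len(seq) != 4 or (seq[0] & 0xF8) != 0xF0:
        raise ValueError("expected a 4-byte sequence starting with 0xF0..0xF7")
    return (
        ((seq[0] & 0x07) << 18)
        | ((seq[1] & 0x3F) << 12)
        | ((seq[2] & 0x3F) << 6)
        | (seq[3] & 0x3F)
    )


def is_overlong_four_byte(seq: bytes) -> bool:
    """True if the 4-byte sequence encodes a BMP code point (< U+10000)."""
    return decode_four_byte(seq) < 0x10000


# Illustrative input: U+20AC padded out to 4 bytes (its valid form is E2 82 AC).
overlong = bytes([0xF0, 0x82, 0x82, 0xAC])

print(hex(decode_four_byte(overlong)))                        # 0x20ac
print(is_overlong_four_byte(overlong))                        # True
print(is_overlong_four_byte("\U0001EE0B".encode("utf-8")))    # False
```

A converter that sees the 0xF0 lead byte and unconditionally emits a surrogate pair would compute `0x20AC - 0x10000`, which is negative, so the resulting 6 bytes cannot be valid CESU-8. Rejecting such input up front, as Python's strict UTF-8 decoder does, is one way to avoid reaching the assertion.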

@ossy-szeged

Additional info: a simple /*𝔽*/ string fails with the same error if we build with tools/build.py --debug --function-to-string=on