Security
Headlines
HeadlinesLatestCVEs

Headline

CVE-2012-2135: Issue 14579: CVE-2012-2135: Vulnerability in the utf-16 decoder after error handling

The utf-16 decoder in Python 3.1 through 3.3 does not update the aligned_end variable after calling the unicode_decode_call_errorhandler function, which allows remote attackers to obtain sensitive information (process memory) or cause a denial of service (memory corruption and crash) via unspecified vectors.

CVE
#vulnerability#debian#dos#git#buffer_overflow#auth

process

Status:

closed

Resolution:

fixed

Dependencies:

Superseder:

Assigned To:

Nosy List:

Arfrever, Henri.Salo, Huzaifa.Sidhpurwala, asvetlov, benjamin.peterson, ezio.melotti, georg.brandl, loewis, pitrou, python-dev, serhiy.storchaka, vstinner

Priority:

normal

Keywords:

patch

Created on 2012-04-14 18:46 by serhiy.storchaka, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files

File name

Uploaded

Description

Edit

utf16_update_after_error.patch

serhiy.storchaka, 2012-04-14 18:46

Patch for Python 3.3

review

utf16_update_after_error-3.2.patch

serhiy.storchaka, 2012-04-14 18:54

Patch for Python 3.2

review

utf16crasher.py

serhiy.storchaka, 2012-04-19 20:26

utf16_error_handling-3.2.patch

serhiy.storchaka, 2012-04-20 18:39

Patch for Python 3.2. Сorrecting all bugs.

review

utf16_update_after_error-3.2.patch

serhiy.storchaka, 2012-04-20 18:41

Patch for Python 3.2. Сorrecting only critical bug.

review

utf16_error_handling-3.2_3.patch

serhiy.storchaka, 2012-04-24 21:30

review

utf16_error_handling-3.2_4.patch

serhiy.storchaka, 2012-04-25 17:44

Patch for Python 3.2

review

utf16_error_handling-2.7.patch

serhiy.storchaka, 2012-04-25 17:44

Patch for Python 2.7

review

utf16_error_handling_tests-3.3.patch

serhiy.storchaka, 2012-07-18 05:29

Patch for Python 3.3

review

Messages (26)

msg158272 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-04-14 18:46

In the utf-16 decoder after calling unicode_decode_call_errorhandler aligned_end is not updated. This may potentially cause data leaks, memory damage, and crash. The bug introduced by implementation of the issue #4868. In a similar situation in the utf-8 decoder aligned_end is updated.

msg158743 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-04-19 20:26

There is the crasher and leaker. When Python is not crashing, there is garbage (i.e. leakage of data) at the end of the decoded string. Indeed, I see an English text in some versions of Python.

There are many other errors in utf-16 decoder (see, for example, b’\xD8\x00\xDC’.decode(‘utf-16be’)). I’m now finishing work on a new decoder, and after that take the bug fixing in 3.2.

msg158758 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-04-19 21:29

Here is the bugs in the utf-16 decoder:

  1. `aligned_end` is not updated after calling error handler.

  2. Possible silent reading of one byte over the bytes array limit when decoding of a surrogate pair. b’\xD8\x00\xDC’.decode(‘utf-16be’)

  3. Error handlers receive data without last byte.

  4. After handling truncate data error it is impossible to continue decoding (unlike all the other decoders).

msg158760 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-04-19 21:35

The proposed patch will fix only the first of these bugs. The patch in issue #14624 fixes all bugs for Python 3.3. For Python 3.2 soon I will make a patch.

msg158810 - (view)

Author: Martin v. Löwis (loewis) *

Date: 2012-04-20 09:53

[moving from Rietveld back to Roundup]

On 2012/04/20 11:15:48, storchaka wrote:

The `aligned_end` may point outside unicode object, if the unicode object was reallocated.

How so? The aligned_end *never* points into the unicode object:

q = (unsigned char *)s; e = q + size - 1; aligned_end = (const unsigned char *) ((size_t) e & ~LONG_PTR_MASK);

So aligned_end points into s, not into the unicode object. So this adjustment is necessary because the *input* may change in the callback, not because the output may change. So the comment in decode_utf8_errors seems just as wrong.

Why this is relevant to this issue, is unclear to me, though: the ignore handler doesn’t modify the input object.

msg158818 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-04-20 11:43

> So this adjustment is necessary because the *input* may change in the callback,

not because the output may change. So the comment in decode_utf8_errors seems just as wrong.

You’re right, and my eyes in a lather. Now I saw it.

What you have to offer any comment? If someone would correct a comment for decode_utf8_errors, I just copied it.

Why this is relevant to this issue, is unclear to me, though: the ignore handler doesn’t modify the input object.

I first got the crash using a custom handler, and then I saw that “ignore” handler is enough. Even if the “ignore” handler does not have to change the input object, other handlers can do it and this is the reason for the crash remains.

msg158821 - (view)

Author: Martin v. Löwis (loewis) *

Date: 2012-04-20 11:55

> You’re right, and my eyes in a lather. Now I saw it.

What you have to offer any comment? If someone would correct a comment for decode_utf8_errors, I just copied it.

“might have changed the input object”

Why this is relevant to this issue, is unclear to me, though: the ignore handler doesn’t modify the input object.

I first got the crash using a custom handler, and then I saw that “ignore” handler is enough. Even if the “ignore” handler does not have to change the input object, other handlers can do it and this is the reason for the crash remains.

I agree that the change is necessary. It just does not explain why it fixes this issue.

msg158869 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-04-20 18:39

Here is a minimal patch that corrects all bugs for 3.2. As a side effect, decoding is accelerated by 4-8%.

msg159135 - (view)

Author: Martin v. Löwis (loewis) *

Date: 2012-04-24 12:07

Now I see the problem: make_decode_exception creates a new bytes object in any case, regardless of whether the error handler will update it or not. Therefore, decoding will continue in this new bytes object.

I think the same issue also applies to the ASCII decoder in 3.3.

msg159138 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-04-24 12:35

> I think the same issue also applies to the ASCII decoder in 3.3.

No, the ASCII decoder is not affected by this vulnerability. In a loop, in which unicode_decode_call_errorhandler is called, do not use any cached and not-updatable data.

msg159212 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-04-24 21:30

Here is a patch, which took into account the Martin suggestions.

msg159257 - (view)

Author: Kurt Seifried ([email protected])

Date: 2012-04-25 06:38

Please use CVE-2012-2135 for this issue as per http://www.openwall.com/lists/oss-security/2012/04/25/3

msg159258 - (view)

Author: Huzaifa Sidhpurwala (Huzaifa.Sidhpurwala)

Date: 2012-04-25 06:46

I have not tried the patch yet, but modifying the reproducer yields a different crash. This one seems to be a heap-based buffer overflow which is slightly more serious.

In the reproducer, you just need to replace ascii() with str().

Again works on python3 only.

msg159266 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-04-25 10:30

I now write tests and I have a question. Should b’\xd8\x00\x41’.decode('utf-16be’, ‘replace’) to give ‘\xfffd’ or '\xfffd\xfffd’?

msg159317 - (view)

Author: Henri Salo (Henri.Salo)

Date: 2012-04-25 16:43

Debian bug-report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=670389 Found in versions python3-defaults/3.2.3~rc1-2, python3-defaults/3.1.3-12+squeeze1

msg159320 - (view)

Author: Henri Salo (Henri.Salo)

Date: 2012-04-25 16:59

I tested versions 3.1.1, 3.1.2, 3.1.3, 3.1.4 and 3.1.5 and only 3.1.3 crashed with Segmentation fault:

Program received signal SIGSEGV, Segmentation fault. 0x00000000004c483a in PyObject_Call (func=0x7ffff7e4d3b0, arg=0x7ffff70fd410, kw=0x0) at Objects/abstract.c:2156 2156 if ((call = func->ob_type->tp_call) != NULL) {

(gdb) bt #0 0x00000000004c483a in PyObject_Call (func=0x7ffff7e4d3b0, arg=0x7ffff70fd410, kw=0x0) at Objects/abstract.c:2156 #1 0x000000000045c437 in do_call (f=0x8929b0, throwflag=<value optimized out>) at Python/ceval.c:3982 #2 call_function (f=0x8929b0, throwflag=<value optimized out>) at Python/ceval.c:3785 #3 PyEval_EvalFrameEx (f=0x8929b0, throwflag=<value optimized out>) at Python/ceval.c:2548 #4 0x000000000045e675 in PyEval_EvalCodeEx (co=0x7ffff7159e30, globals=<value optimized out>, locals=<value optimized out>, args=0x0, argcount=1, kws=<value optimized out>, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:3198 #5 0x000000000045e77b in PyEval_EvalCode (co=0x7ffff7e4d3b0, globals=0x7ffff70fd410, locals=0x0) at Python/ceval.c:668 #6 0x00000000004800b2 in run_mod (fp=<value optimized out>, filename=<value optimized out>, flags=0x7fffffffe390) at Python/pythonrun.c:1711 #7 PyRun_InteractiveOneFlags (fp=<value optimized out>, filename=<value optimized out>, flags=0x7fffffffe390) at Python/pythonrun.c:1104 #8 0x00000000004803ce in PyRun_InteractiveLoopFlags (fp=0x7ffff75346a0, filename=0x5312a1 "<stdin>", flags=0x7fffffffe390) at Python/pythonrun.c:1006 #9 0x0000000000480bab in PyRun_AnyFileExFlags (fp=0x7ffff75346a0, filename=0x5312a1 "<stdin>", closeit=0, flags=0x7fffffffe390) at Python/pythonrun.c:975 #10 0x0000000000496422 in Py_Main (argc=<value optimized out>, argv=<value optimized out>) at Modules/main.c:607 #11 0x0000000000416e6e in main (argc=<value optimized out>, argv=<value optimized out>) at ./Modules/python.c:152

msg159325 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-04-25 17:44

I thought it was one error, and not two.

The updated patch adds tests and fixes minor mistake. 2.7 is not affected by main security issue, but it contains one of mentioned bugs (read 1 byte outside of the input array). A patch for 2.7 fixes this bug and also includes tests.

msg159362 - (view)

Author: STINNER Victor (vstinner) *

Date: 2012-04-26 11:36

I ran tests of utf16_error_handling-3.2_4.patch on Python 3.1. Two tests are failing:

  • b’\x00\xd8’.decode(‘utf-16le’, ‘replace’)=’\ufffd\ufffd’ != ‘\ufffd’
  • b’\xd8\x00’.decode(‘utf-16be’, ‘replace’)=’\ufffd\ufffd’ != ‘\ufffd’

I don’t think that the test is correct: UTF-16 should resynchronize as early as possible (ignore the first invalid byte and restart at the following byte), so ‘\ufffd\ufffd’ is the correct answer.

Another examples:

  • b’\xd8\x00\x41’.decode(‘utf-16be’, ‘replace’) should return ‘�A’ (\ufffdA’)
  • with UTF-8 decoder: (b’\xC3’ + '\xe9’.encode(‘utf-8’)).decode('utf-8’, ‘replace’) returns ‘\ufffd\xe9’

msg159363 - (view)

Author: Antoine Pitrou (pitrou) *

Date: 2012-04-26 11:54

> I ran tests of utf16_error_handling-3.2_4.patch on Python 3.1. Two tests are failing:

  • b’\x00\xd8’.decode(‘utf-16le’, ‘replace’)=’\ufffd\ufffd’ != ‘\ufffd’
  • b’\xd8\x00’.decode(‘utf-16be’, ‘replace’)=’\ufffd\ufffd’ != ‘\ufffd’

I don’t think that the test is correct: UTF-16 should resynchronize as early as possible (ignore the first invalid byte and restart at the following byte), so ‘\ufffd\ufffd’ is the correct answer.

UTF-16 units are 16-bit words, not bytes, so ‘\uffffd’ sounds correct to me. You resynchronize on the word boundary: the invalid word is skipped.

  • with UTF-8 decoder: (b’\xC3’ + '\xe9’.encode(‘utf-8’)).decode('utf-8’, ‘replace’) returns ‘\ufffd \xe9’

That’s because UTF-8 operates on bytes: the invalid byte is skipped.

msg159411 - (view)

Author: Martin v. Löwis (loewis) *

Date: 2012-04-26 18:46

> UTF-16 units are 16-bit words, not bytes, so ‘\uffffd’ sounds correct to

me. You resynchronize on the word boundary: the invalid word is skipped.

I agree. The only odd case is when the number of bytes is not even (pun intended). In that case, anybody can guess which of the bytes is extra. The most natural (IMO) assumption is that the data is truncated, so it would be the last byte which is extra.

msg165743 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-07-18 05:29

Please, can anyone do a final review and commit?

Here are three patches for three Python versions:

2.7: utf16_error_handling-2.7.patch. Fix for one minor bug (overreading) and tests.

3.2: utf16_error_handling-3.2_4.patch. Fix for one critical security bug (CVE-2012-2135) and several minor bugs, tests.

3.3: utf16_error_handling-3.3.patch. Only tests.

msg165984 - (view)

Author: Antoine Pitrou (pitrou) *

Date: 2012-07-20 22:41

There are spurious print() calls in the 2.7 patch.

msg165985 - (view)

Author: Roundup Robot (python-dev)

Date: 2012-07-20 22:50

New changeset 034ff986019d by Antoine Pitrou in branch '3.2’: Issue #14579: Fix CVE-2012-2135: vulnerability in the utf-16 decoder after error handling. http://hg.python.org/cpython/rev/034ff986019d

New changeset 118fe0ee6921 by Antoine Pitrou in branch 'default’: Port additional tests from #14579 (the issue is already fixed). http://hg.python.org/cpython/rev/118fe0ee6921

msg165986 - (view)

Author: Roundup Robot (python-dev)

Date: 2012-07-20 22:54

New changeset 4cadf91aaddd by Antoine Pitrou in branch '2.7’: Issue #14579: Fix error handling bug in the utf-16 decoder. http://hg.python.org/cpython/rev/4cadf91aaddd

msg165987 - (view)

Author: Antoine Pitrou (pitrou) *

Date: 2012-07-20 22:54

Thanks for the patches, Serhiy! They’re now pushed.

msg165994 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) *

Date: 2012-07-21 06:16

> There are spurious print() calls in the 2.7 patch.

Oh, my inattentiveness. Thank you for pushing, Antoine. And thank Martin for review.

History

Date

User

Action

Args

2022-04-11 14:57:29

admin

set

github: 58784

2012-07-21 06:16:09

serhiy.storchaka

set

messages: + msg165994

2012-07-20 22:54:34

pitrou

set

status: open -> closed
resolution: fixed
messages: + msg165987

stage: commit review -> resolved

2012-07-20 22:54:03

python-dev

set

messages: + msg165986

2012-07-20 22:50:12

python-dev

set

nosy: + python-dev
messages: + msg165985

2012-07-20 22:45:57

pitrou

set

stage: test needed -> commit review

2012-07-20 22:41:05

pitrou

set

messages: + msg165984

2012-07-18 05:33:32

serhiy.storchaka

set

components: + Interpreter Core, Unicode
versions: + Python 2.7, - Python 3.1

2012-07-18 05:29:23

serhiy.storchaka

set

files: + utf16_error_handling_tests-3.3.patch

messages: + msg165743

2012-04-27 06:25:20

georg.brandl

set

nosy: + georg.brandl

2012-04-26 18:46:13

loewis

set

messages: + msg159411

2012-04-26 11:54:12

pitrou

set

messages: + msg159363

2012-04-26 11:36:25

vstinner

set

messages: + msg159362

2012-04-25 18:33:01

loewis

set

title: Vulnerability in the utf-16 decoder after error handling -> CVE-2012-2135: Vulnerability in the utf-16 decoder after error handling

2012-04-25 17:44:49

serhiy.storchaka

set

files: + utf16_error_handling-3.2_4.patch, utf16_error_handling-2.7.patch

messages: + msg159325

2012-04-25 16:59:07

Henri.Salo

set

messages: + msg159320
versions: + Python 3.1

2012-04-25 16:58:03

pitrou

set

nosy: + benjamin.peterson

2012-04-25 16:43:33

Henri.Salo

set

nosy: + Henri.Salo
messages: + msg159317

2012-04-25 15:39:50

[email protected]

set

nosy: - [email protected]

2012-04-25 10:30:31

serhiy.storchaka

set

messages: + msg159266

2012-04-25 06:47:00

Huzaifa.Sidhpurwala

set

nosy: + Huzaifa.Sidhpurwala
messages: + msg159258

2012-04-25 06:38:31

[email protected]

set

nosy: + [email protected]
messages: + msg159257

2012-04-24 21:33:36

serhiy.storchaka

set

title: Possible vulnerability in the utf-16 decoder after error handling -> Vulnerability in the utf-16 decoder after error handling

2012-04-24 21:30:43

serhiy.storchaka

set

files: + utf16_error_handling-3.2_3.patch

messages: + msg159212
title: Vulnerability in the utf-16 decoder after error handling -> Possible vulnerability in the utf-16 decoder after error handling

2012-04-24 12:35:06

serhiy.storchaka

set

messages: + msg159138

2012-04-24 12:07:21

loewis

set

messages: + msg159135

2012-04-20 21:38:37

asvetlov

set

nosy: + asvetlov

2012-04-20 18:41:07

serhiy.storchaka

set

files: + utf16_update_after_error-3.2.patch

2012-04-20 18:40:30

serhiy.storchaka

set

files: - utf16_error_handling-3.2.patch

2012-04-20 18:39:06

serhiy.storchaka

set

files: + utf16_error_handling-3.2.patch

messages: + msg158869

2012-04-20 18:36:49

serhiy.storchaka

set

files: + utf16_error_handling-3.2.patch

2012-04-20 11:55:57

loewis

set

messages: + msg158821

2012-04-20 11:43:34

serhiy.storchaka

set

messages: + msg158818

2012-04-20 09:53:22

loewis

set

nosy: + loewis
messages: + msg158810

2012-04-20 06:38:39

Arfrever

set

nosy: + Arfrever

2012-04-19 21:35:57

serhiy.storchaka

set

messages: + msg158760

2012-04-19 21:29:38

serhiy.storchaka

set

messages: + msg158758
title: Possible vulnerability in the utf-16 decoder after error handling -> Vulnerability in the utf-16 decoder after error handling

2012-04-19 20:26:08

serhiy.storchaka

set

files: + utf16crasher.py

messages: + msg158743

2012-04-15 21:58:07

vstinner

set

nosy: + vstinner

2012-04-14 18:54:16

serhiy.storchaka

set

files: + utf16_update_after_error-3.2.patch

2012-04-14 18:47:26

ezio.melotti

set

nosy: + pitrou, ezio.melotti

stage: test needed

2012-04-14 18:46:02

serhiy.storchaka

create

CVE: Latest News

CVE-2023-50976: Transactions API Authorization by oleiman · Pull Request #14969 · redpanda-data/redpanda
CVE-2023-6905
CVE-2023-6903
CVE-2023-6904
CVE-2023-3907