Security
Headlines
HeadlinesLatestCVEs

Headline

Microsoft Windows UTF-8 Buffer Overruns

When Microsoft released UTF-8 support for the -A interfaces of the Windows API, it appears to have introduced buffer overrun conditions.

Packet Storm
#windows#microsoft

Hi @ll,

almost 4 years ago, with Windows 10 1903, after more than a year
beta-testing in insider previews, Microsoft finally released UTF-8
support for the -A interfaces of the Windows API.

  1. https://docs.microsoft.com/en-us/windows/uwp/design/globalizing/use-utf8-code-page#activeCodePage

    | If the ANSI code page is configured for UTF-8, -A APIs typically
    | operate in UTF-8. This model has the benefit of supporting
    | existing code built with -A APIs without any code changes.

    The last claim is but a bloody DANGEROUS lie!
    As shown hereafter, it must read instead:
    “This model has the malefit of causing buffer overruns in existing
    code!”

  2. For 30 years, the documentation of the -A interfaces for file and
    directory management of the Win32 API states:
    “The maximum path name length is 260 characters.”

    See https://msdn.microsoft.com/en-us/library/aa363855.aspx
    “CreateDirectoryA function” for example:

    | For the ANSI version of this function, there is a default string
    | size limit for paths of 248 characters (MAX_PATH - enough room
    | for a 8.3 filename). …

    | The 255 character limit per path segment still applies.

    This constitutes a contractual GUARANTEE for the product behaviour!

  3. The documentation for the file systems supported by Windows says
    too “The maximum length of a file name segment is 255 characters.”
    See https://msdn.microsoft.com/en-us/library/ee681827.aspx
    “File System Functionality Comparison”

  4. With these 2 contractually GUARANTEED preconditions, the following
    code is safe, i.e. not susceptible to a buffer overrun:
    CreateDirectoryA() fails as soon as szPath exceeds the documented
    limit which is less than the buffer size of 260 characters.

    CHAR szANSI[] = "€"; // or one of the following other 122 characters
    // from ANSI code page 1252:
    // ‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂ
    // ÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ
    CHAR szPath[MAX_PATH] = "";
    do
    strcat(szPath, szANSI);
    while (CreateDirectoryA(szPath));

  5. With UTF-8 support enabled, the same code now suffers from a
    buffer overrun:

    CHAR szUTF8[] = u"€"; // or “\xE2\x82\xAC”
    CHAR szPath[MAX_PATH] = "";
    do
    strcat(szPath, szUTF8);
    while (CreateDirectoryA(szPath), NULL);

    STRIKE 1!

  6. Given the 2 guarantees from 1) and 2), the following code is
    also safe and not susceptible to a buffer overrun: see
    https://msdn.microsoft.com/en-us/library/aa365740.aspx
    “WIN32_FIND_DATAA structure” and “FindFirstFile function”
    https://msdn.microsoft.com/en-us/library/aa364418.aspx

    wfd.cFileName can NEVER receive a file/directory name longer
    than 255 characters, so the concatenation of C: or .\ (as
    well as C:\ and …\ too) and wfd.cFileName NEVER overruns a
    buffer of MAX_PATH!

    #define PATTERN "C:" // or "C:\" or ".\" or "…\"

    WIN32_FIND_DATAA wfd;
    CHAR szPath[MAX_PATH] = PATTERN;
    HANDLE hFind = FindFirstFileA(szPath, &wfd);
    if (hFind != INVALID_HANDLE_VALUE) {
    do {
    strcat(szPath + strlen(PATTERN) - 1, wfd.cFileName);
    GetFileAttributesA(szPath); // do something with the
    … // found file system objects
    } while (FindNextFile(hFind, &wfd));
    FindClose(hFind);
    }

  7. With UTF-8 support enabled and a file or directory named
    €€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€
    (or other 86 characters from the above 123) present in the
    CWD of drive C:, FindFirstFileA() (and FindNextFileA() too)
    return a string of 86 * 3 = 258 characters in wfd.cFileName
    which causes a buffer overrun in previously safe code!

    STRIKE 2!

  8. The following code enumerates ALL file system objects in a
    (root) directory or network share:

    WIN32_FIND_DATAA wfd;
    HANDLE hFind = FindFirstFileA("\\host\share\*", &wfd);
    if (hFind != INVALID_HANDLE_VALUE) {
    do { // code to process wfd.cFileName omitted
    } while (FindNextFile(hFind, &wfd));
    FindClose(hFind);
    }

  9. With UTF-8 support enabled and a directory or file named
    €€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€€
    (i.e. 87 or more of the 123 characters from above) present,
    both FindFirstFile() and FindNextFile() FAIL with the
    previously impossible, NEVER encountered Win32 error code
    234 alias ERROR_MORE_DATA: wfd.cFileName is to short for
    UTF-8 encoded file/directory segment names!

    STRIKE 3!

stay tuned, and far away from UTF-8 in Windows
Stefan Kanthak

PS: for the full story of Microsoft’s EPIC failures with UTF-8
in Windows, see
https://skanthak.homepage.t-online.de/quirks.html#quirk33
https://skanthak.homepage.t-online.de/quirks.html#quirk32
https://skanthak.homepage.t-online.de/quirks.html#quirk31

Packet Storm: Latest News

Grav CMS 1.7.44 Server-Side Template Injection