Headline
CVE-2017-2896: TALOS-2017-0403 || Cisco Talos Intelligence Group
Summary
An exploitable out-of-bounds write vulnerability exists in the xls_mergedCells function of libxls 1.4. A specially crafted XLS file can cause memory corruption resulting in remote code execution. An attacker can send a malicious XLS file to trigger this vulnerability.
Tested Versions
libxls 1.4
readxl package 1.0.0 for R (tested using Microsoft R 4.3.1)
Product URLs
http://libxls.sourceforge.net/
CVSSv3 Score
8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
CWE
CWE-787: Out-of-bounds Write
Details
libxls is a C library supported on Windows, Mac and Linux which can read Microsoft Excel File Format (XLS) files. The library is used by the readxl package that can be installed in the R programming language. An out-of-bounds write appears in the xls_mergedCells function. Let’s take a look at the vulnerable code:
Line 606 void xls_mergedCells(xlsWorkSheet* pWS,BOF* bof,BYTE* buf)
Line 607 {
Line 608 int count=*((WORD*)buf);
Line 609 int i,c,r;
Line 610 struct MERGEDCELLS* span;
Line 611 verbose("Merged Cells");
Line 612 for (i=0;i<count;i++)
Line 613 {
Line 614 span=(struct MERGEDCELLS*)(buf+(2+i*sizeof(struct MERGEDCELLS)));
Line 615 // printf("Merged Cells: [%i,%i] [%i,%i] \n",span->colf,span->rowf,span->coll,span->rowl);
Line 616 for (r=span->rowf;r<=span->rowl;r++)
Line 617 for (c=span->colf;c<=span->coll;c++)
Line 618 pWS->rows.row[r].cells.cell[c].ishiden=1;
Line 619 pWS->rows.row[span->rowf].cells.cell[span->colf].colspan=(span->coll-span->colf+1);
Line 620 pWS->rows.row[span->rowf].cells.cell[span->colf].rowspan=(span->rowl-span->rowf+1);
Line 621 pWS->rows.row[span->rowf].cells.cell[span->colf].ishiden=0;
Line 622 }
Line 623 }
The important variables are buf and bof, whose contents are read in raw form from the file. The count value that controls the loop at line 612 is taken from the first WORD of buf (line 608), so it is attacker-controlled. At line 614, span is pointed at successive MERGEDCELLS entries inside buf. Because each span structure comes directly from the file and none of its fields is checked against the allocated rows and cells, an attacker fully controls not only the number of iterations of the for loops at lines 616 and 617 but also the offsets used for writes into the pWS->rows structure. Using our PoC we can observe the following values during a crash:
Starting program: /home/icewall/bugs/libxls-1.4.0/build/bin/xls2csv ./crashes/49a5608059427ce2f2c479e33c5e3ae4
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bd1e57 in xls_mergedCells (pWS=0x605830, bof=0x7fffffffdc10, buf=0x607230 "\b") at xls.c:619
619 pWS->rows.row[span->rowf].cells.cell[span->colf].colspan=(span->coll-span->colf+1);
(gdb) p/x *span
$1 = {rowf = 0xabcd, rowl = 0x1122, colf = 0x3344, coll = 0x6655}
(gdb) p/x *pWS->rows.row
$3 = {index = 0x0, fcell = 0x0, lcell = 0x0, height = 0x0, flags = 0x0, xf = 0x0, xfflags = 0x0, cells = {count = 0x0, cell = 0x607280}}
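The missing validation can be illustrated with a minimal sketch. It is not the vendor's patch: the struct name merged_span, the helper span_in_bounds, and the row_count/col_count parameters are hypothetical stand-ins for whatever limits the worksheet actually tracks. The four 16-bit fields mirror those printed in the gdb dump of *span. Note that in the dump rowf (0xabcd) is greater than rowl (0x1122), so the loops at lines 616 and 617 would not execute at all, which is consistent with the crash being reported at line 619, where rowf and colf are used as indexes directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the four 16-bit fields seen in the gdb dump. */
struct merged_span {
    uint16_t rowf; /* first row of the merged range */
    uint16_t rowl; /* last row of the merged range */
    uint16_t colf; /* first column of the merged range */
    uint16_t coll; /* last column of the merged range */
};

/*
 * Hypothetical helper: returns true only if the span is well ordered and
 * every cell it addresses lies inside a sheet that has row_count rows and
 * col_count cells per row (both limits are assumptions of this sketch).
 */
static bool span_in_bounds(const struct merged_span *s,
                           uint32_t row_count, uint32_t col_count)
{
    /* Reject inverted ranges such as rowf=0xabcd, rowl=0x1122. */
    if (s->rowf > s->rowl || s->colf > s->coll)
        return false;
    /* The last row and column must be valid indexes. */
    if (s->rowl >= row_count || s->coll >= col_count)
        return false;
    return true;
}
```

A parser following this approach would call such a check on each span before touching the row or cell arrays and skip or reject the record when the check fails.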
The MergedCells record starts at offset 7ED4Ch and has the form: [BOF][SPAN][SPAN]…BOF.size*sizeof(MERGEDCELLS)…[SPAN]
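A second check that the listing lacks is whether the declared number of spans actually fits in the record payload. The sketch below assumes the payload layout implied by lines 608 and 614 of the listing: a 2-byte little-endian count followed by 8-byte span entries. The names read_le16, merged_record_fits, and SPAN_SIZE are hypothetical, and record_size stands for the payload length the parser already knows from the record header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SPAN_SIZE 8u /* four 16-bit fields: rowf, rowl, colf, coll */

/* Read a 16-bit little-endian value without relying on alignment. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: returns true and stores the span count only if the
 * 2-byte count plus count * SPAN_SIZE bytes fit inside record_size bytes.
 */
static bool merged_record_fits(const uint8_t *buf, size_t record_size,
                               uint16_t *count_out)
{
    if (record_size < 2)
        return false; /* not even room for the count field */
    uint16_t count = read_le16(buf);
    if ((size_t)count * SPAN_SIZE > record_size - 2)
        return false; /* declared spans would run past the payload */
    *count_out = count;
    return true;
}
```

Reading the count byte by byte, rather than through a cast as at line 608, also avoids unaligned access and host-endianness assumptions.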
Crash Information
Crash in the Microsoft R platform:
> library(readxl)
> path <- readxl_example("49a5608059427ce2f2c479e33c5e3ae4.xls")
> lapply(excel_sheets(path), read_excel, path = path)
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
*** caught segfault ***
address 0x54, cause 'memory not mapped'
Traceback:
1: .Call("readxl_read_xls_", PACKAGE = "readxl", path, sheet_i, limits, shim, col_names, col_types, na, trim_ws, guess_max)
2: read_fun(path = path, sheet = sheet, limits = limits, shim = shim, col_names = col_names, col_types = col_types, na = na, trim_ws = trim_ws, guess_max = guess_max)
3: tibble::as_tibble(read_fun(path = path, sheet = sheet, limits = limits, shim = shim, col_names = col_names, col_types = col_types, na = na, trim_ws = trim_ws,
guess_max = guess_max), validate = FALSE)
4: tibble::repair_names(tibble::as_tibble(read_fun(path = path, sheet = sheet, limits = limits, shim = shim, col_names = col_names, col_types = col_types, na = na,
trim_ws = trim_ws, guess_max = guess_max), validate = FALSE), prefix = "X", sep = "__")
5: read_excel_(path = path, sheet = sheet, range = range, col_names = col_names, col_types = col_types, na = na, trim_ws = trim_ws, skip = skip, n_max = n_max, guess_max
= guess_max, excel_format(path))
6: FUN(X[[i]], ...)
7: lapply(excel_sheets(path), read_excel, path = path)
Crash directly in the libxls library:
==70269== Memcheck, a memory error detector
==70269== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==70269== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==70269== Command: ./xls2csv ./crashes/49a5608059427ce2f2c479e33c5e3ae4
==70269==
==70269== Invalid write of size 1
==70269== at 0x4C3106F: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==70269== by 0x4E42F05: xls_open (xls.c:927)
==70269== by 0x400956: main (xls2csv.c:45)
==70269== Address 0x5425415 is 0 bytes after a block of size 21 alloc'd
==70269== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==70269== by 0x4E42EE3: xls_open (xls.c:926)
==70269== by 0x400956: main (xls2csv.c:45)
==70269==
==70269== Invalid read of size 8
==70269== at 0x4E41E57: xls_mergedCells (xls.c:619)
==70269== by 0x4E42CBC: xls_parseWorkSheet (xls.c:861)
==70269== by 0x400AEC: main (xls2csv.c:90)
==70269== Address 0x55e8b66 is 1,095,926 bytes inside an unallocated block of size 3,358,064 in arena "client"
==70269==
==70269== Invalid write of size 2
==70269== at 0x4E41E92: xls_mergedCells (xls.c:619)
==70269== by 0x4E42CBC: xls_parseWorkSheet (xls.c:861)
==70269== by 0x400AEC: main (xls2csv.c:90)
==70269== Address 0x7cf7f is not stack'd, malloc'd or (recently) free'd
Timeline
2017-08-29 - Vendor Disclosure
2017-11-14 - Public Release
Discovered by Marcin ‘Icewall’ Noga of Cisco Talos.