Headline
CVE-2023-47100: Fix read/write past buffer end: perl-security#140 · Perl/perl5@ff1f9f5
In Perl before 5.38.2, S_parse_uniprop_string in regcomp.c can write to unallocated space because a property name associated with a \p{…} regular expression construct is mishandled. The earliest affected version is 5.30.0.
Commit
Fix read/write past buffer end: perl-security#140
A package name may be specified in a \p{…} regular expression construct. If unspecified, “utf8::” is assumed, which is the package all official Unicode properties are in. By specifying a different package, one can create a user-defined property with the same unqualified name as a Unicode one. Such a property is defined by a sub whose name begins with “Is” or "In", and if the sub wishes to refer to an official Unicode property, it must explicitly specify the "utf8::". S_parse_uniprop_string() is used to parse the interior of both \p{} and the user-defined sub lines.
S_parse_uniprop_string() parses the input "name" parameter, creating a modified copy, "lookup_name", malloc'ed with the same size as "name". The modifications essentially create a canonicalized version of the input, with such things as extraneous white-space stripped off. I found it convenient to strip off the package specifier "utf8::". To do so, the code simply pretends "lookup_name" begins just after the "utf8::", and adjusts various other values to compensate. However, it missed one required adjustment.
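The pointer trick described above can be sketched roughly as follows. This is a simplified illustration, not the code in regcomp.c: the function name canonicalize_sketch and the base_out parameter are hypothetical, only white-space is stripped (the real canonicalization does more), and the extra byte for the terminating NUL is a convenience of this sketch.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `name` into a freshly allocated buffer sized
 * from `name`, dropping white-space, then "pretend" the result begins just
 * after a leading "utf8::" by returning an advanced pointer.
 * *base_out receives the pointer that must later be passed to free(). */
char *canonicalize_sketch(const char *name, char **base_out)
{
    size_t len = strlen(name);
    size_t j = 0;
    char *base;

    *base_out = NULL;
    base = malloc(len + 1);          /* +1 for the NUL used by this sketch */
    if (base == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++) {
        if (!isspace((unsigned char) name[i]))
            base[j++] = name[i];
    }
    base[j] = '\0';
    *base_out = base;

    /* The returned pointer has 6 fewer bytes of room behind it than the
     * allocation, so every later length or offset must compensate. */
    if (strncmp(base, "utf8::", 6) == 0)
        return base + 6;
    return base;
}
```

The point of the sketch is the last step: once the pointer is advanced, the usable space behind it is smaller than the original allocation, and any code that still reasons in terms of the full length of "name" is off by the length of the prefix.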
This is only a problem when the property name begins with "perl" and is neither "perlspace" nor "perlword". All such properties are undocumented internal ones.
What happens in this case is that the input is reparsed with slightly different rules in effect as to what is legal versus illegal. The problem is that “lookup_name” no longer is pointing to its initial value, but “name” is. Thus the space allocated for filling “lookup_name” is now shorter than "name", and as this shortened “lookup_name” is filled by copying suitable portions of "name", the write can be to unallocated space.
The solution is to skip the “utf8::” when reparsing "name". Then both “lookup_name” and “name” are effectively shortened by the same amount, and there is no going off the end.
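The size mismatch and the fix can be checked with simple arithmetic. The following is a hedged model, not the actual patch: the type reparse_budget and the functions reparse_budget_for and reparse_overflows are hypothetical names, "needed" is an upper bound (the real reparse copies only suitable portions of "name"), and the property name used in the usage note is made up.

```c
#include <stddef.h>
#include <string.h>

#define UTF8_PREFIX     "utf8::"
#define UTF8_PREFIX_LEN (sizeof(UTF8_PREFIX) - 1)

/* Hypothetical bookkeeping for one reparse of `name`. */
typedef struct {
    size_t capacity;  /* bytes left behind the (possibly advanced) lookup_name */
    size_t needed;    /* upper bound on bytes the reparse may copy from name */
} reparse_budget;

/* skip_prefix_on_reparse == 0 models the buggy behavior (reparse starts at
 * the beginning of name); nonzero models the fix (reparse starts after
 * "utf8::"). */
reparse_budget reparse_budget_for(const char *name, int skip_prefix_on_reparse)
{
    size_t name_len = strlen(name);
    size_t offset = 0;
    reparse_budget b;

    if (name_len >= UTF8_PREFIX_LEN
        && strncmp(name, UTF8_PREFIX, UTF8_PREFIX_LEN) == 0)
    {
        offset = UTF8_PREFIX_LEN;
    }

    /* lookup_name was allocated with the size of name, then advanced. */
    b.capacity = name_len - offset;
    b.needed   = name_len - (skip_prefix_on_reparse ? offset : 0);
    return b;
}

/* Returns 1 if the reparse could write past the end of the allocation. */
int reparse_overflows(const char *name, int skip_prefix_on_reparse)
{
    reparse_budget b = reparse_budget_for(name, skip_prefix_on_reparse);
    return b.needed > b.capacity;
}
```

For a made-up name such as "utf8::perlfoo", reparse_overflows(name, 0) reports a possible overflow of up to 6 bytes, while reparse_overflows(name, 1) does not, because both sides are shortened by the same amount.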
This commit also does white-space adjustment so that things align vertically for readability.
This can be easily backported to earlier Perl releases.