Headline
CVE-2023-42183: A Post-Unicode Normalization Vulnerability
lockss-daemon (aka Classic LOCKSS Daemon) before 1.77.3 performs post-Unicode normalization, which may allow bypass of intended access restrictions, such as when U+1FEF is converted to a backtick.
Summary
The code snippet below is vulnerable to post-Unicode normalization, an instance of CWE-176 (Improper Handling of Unicode Encoding).
This class of vulnerability arises when security checks are performed before Unicode normalization, so the normalized output is never checked.
    /**
     * Sanitises a string so that it can be used as a div id
     *
     * @param name
     * @return Returns sanitized string
     */
    public static String cleanName(String name) {
        return Normalizer.normalize(HtmlUtil.encode(name.replace(" ", "_").replace("&", "").replace("(", "")
                .replace(")", "").replace(",", "").replace("+", "_"), HtmlUtil.ENCODE_TEXT), Normalizer.Form.NFC);
    }
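As can be seen, cleanName() replaces spaces and plus signs with underscores, strips ampersands, parentheses, and commas, and then HTML-encodes the result.
However, Unicode normalization (NFC form) is applied last, so it can introduce characters after the replacements and encoding have already run. The returned string may therefore contain characters that none of the earlier steps ever inspected.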
Impact
This is a low-severity vulnerability. A mitigation is to apply Unicode normalization first and only then remove or replace the unwanted characters.
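The reordering described above can be sketched as follows. This is a minimal illustration, not the project's actual patch: the class name NormalizeFirst and the escapeHtml() helper are hypothetical, escapeHtml() is only a stand-in for HtmlUtil.encode (whose exact behaviour is not shown here), and treating the backtick as an unwanted character is an assumption made so the effect of the ordering is visible.

```java
import java.text.Normalizer;

// Sketch of the "normalize first, then sanitize" ordering.
// NormalizeFirst and escapeHtml are hypothetical names used for illustration.
class NormalizeFirst {

    static String cleanName(String name) {
        // 1. Normalize first, so every later step sees the final characters.
        String normalized = Normalizer.normalize(name, Normalizer.Form.NFC);

        // 2. Apply the same replacements as the original cleanName().
        String cleaned = normalized.replace(" ", "_").replace("&", "")
                .replace("(", "").replace(")", "")
                .replace(",", "").replace("+", "_");

        // 3. Encode last. Stand-in for HtmlUtil.encode (assumption).
        return escapeHtml(cleaned);
    }

    // Hypothetical minimal escaper; escaping the backtick is an assumption.
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                case '`': sb.append("&#96;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(cleanName("a b+c(d)"));
        System.out.println(cleanName("x\u1FEFy"));
    }
}
```

With this ordering, a backtick produced by normalizing U+1FEF reaches the replacement and encoding steps instead of appearing after them.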
As an example of a character introduced after the checks: when normalization is applied to U+1FEF (GREEK VARIA), the resulting character under the NFC form is U+0060 (GRAVE ACCENT, the backtick). The same can happen with other characters that normalize to a different code point.
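The U+1FEF case can be reproduced with the standard java.text.Normalizer class. The sketch below uses a hypothetical class NfcBacktickDemo and a hypothetical filter that only strips backticks (it is not the LOCKSS code); it follows the vulnerable order of filtering first and normalizing last.

```java
import java.text.Normalizer;

// Shows a backtick appearing after a filter has already run.
// NfcBacktickDemo and its filter are hypothetical, for illustration only.
class NfcBacktickDemo {

    // Vulnerable order: strip backticks first, normalize afterwards.
    static String filterThenNormalize(String input) {
        String filtered = input.replace("`", "");
        return Normalizer.normalize(filtered, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // U+1FEF is not a backtick, so the filter leaves it alone.
        String output = filterThenNormalize("id\u1FEFname");
        // NFC then maps U+1FEF to U+0060, so a backtick is present anyway.
        System.out.println(output.contains("`")); // prints "true"
    }
}
```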
References
- https://sim4n6.beehiiv.com/p/unicode-characters-bypass-security-checks