Headline
GHSA-hc2f-7r5r-r2hg: Heap buffer overflow due to incorrect hash function
Impact
The TensorKey hash function used the total estimated AllocatedBytes(), which (a) is only an estimate per tensor, and (b) is a very poor hash function for constants (e.g. int32_t). It also tried to access individual tensor bytes through tensor.data(), treating the buffer as if it were AllocatedBytes() long. This led to ASAN failures because AllocatedBytes() is an estimate of the total bytes allocated by a tensor, including any pointed-to constructs (e.g. strings), and does not refer to contiguous bytes in the .data() buffer. We couldn’t use this byte vector anyway, since types like tstring include pointers, whereas we need to hash the string values themselves.
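The difference between hashing a size estimate and hashing the actual contents can be illustrated with a small standalone sketch. This is not the code from the patch and does not use TensorFlow: std::vector stands in for a tensor, FNV-1a stands in for the real hash, and every name below (MixBytes, BufferBytes, EstimatedAllocatedBytes, SizeOnlyHash, ContentHash, CheckSketch) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// 64-bit FNV-1a constants. FNV-1a is only a stand-in chosen for brevity.
constexpr std::uint64_t kFnvOffset = 14695981039346656037ULL;
constexpr std::uint64_t kFnvPrime = 1099511628211ULL;

// Mixes n bytes starting at data into the running hash h.
std::uint64_t MixBytes(std::uint64_t h, const void* data, std::size_t n) {
  const unsigned char* p = static_cast<const unsigned char*>(data);
  for (std::size_t i = 0; i < n; ++i) {
    h ^= p[i];
    h *= kFnvPrime;
  }
  return h;
}

// Bytes in the contiguous element buffer, i.e. what data() really covers.
std::size_t BufferBytes(const std::vector<std::string>& elems) {
  return elems.size() * sizeof(std::string);
}

// Toy analogue of an allocation estimate: element buffer plus string storage.
std::size_t EstimatedAllocatedBytes(const std::vector<std::string>& elems) {
  std::size_t total = BufferBytes(elems);
  for (const std::string& s : elems) total += s.capacity();
  return total;
}

// Flawed pattern: only a size feeds the hash, so equal-sized constants collide.
std::uint64_t SizeOnlyHash(const std::vector<std::int32_t>& values) {
  const std::size_t bytes = values.size() * sizeof(std::int32_t);
  return MixBytes(kFnvOffset, &bytes, sizeof(bytes));
}

// Plain-old-data elements: hashing exactly the buffer bytes is safe.
std::uint64_t ContentHash(const std::vector<std::int32_t>& values) {
  return MixBytes(kFnvOffset, values.data(),
                  values.size() * sizeof(std::int32_t));
}

// String elements: hash each length and its characters, never the raw
// element buffer, which may hold pointers to separately allocated storage.
std::uint64_t ContentHash(const std::vector<std::string>& elems) {
  std::uint64_t h = kFnvOffset;
  for (const std::string& s : elems) {
    const std::size_t len = s.size();
    h = MixBytes(h, &len, sizeof(len));
    h = MixBytes(h, s.data(), len);
  }
  return h;
}

// Exercises the properties discussed in the advisory text.
void CheckSketch() {
  const std::vector<std::int32_t> one{1}, two{2};
  assert(SizeOnlyHash(one) == SizeOnlyHash(two));  // distinct constants collide
  assert(ContentHash(one) != ContentHash(two));    // contents tell them apart

  const std::vector<std::string> a{"tensor"}, b{"tensoR"};
  // Reading EstimatedAllocatedBytes(a) bytes from the element buffer would
  // run past its end, which is the kind of access ASAN reports.
  assert(EstimatedAllocatedBytes(a) > BufferBytes(a));
  assert(ContentHash(a) != ContentHash(b));        // string values are hashed
}
```

Calling CheckSketch() from a main function runs the checks. The point of the sketch is only that the number of bytes read must come from the buffer actually being read, and that pointer-holding element types need their values hashed one by one.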
Patches
We have patched the issue in GitHub commit 1b85a28d395dc91f4d22b5f9e1e9a22e92ccecd6.
The fix will be included in TensorFlow 2.9.0. We will also cherry-pick this commit onto TensorFlow 2.8.1, which is the only other affected version.
For more information
Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.
References
- GHSA-hc2f-7r5r-r2hg
- https://nvd.nist.gov/vuln/detail/CVE-2022-29210
- tensorflow/tensorflow@1b85a28
- https://github.com/tensorflow/tensorflow/blob/f3b9bf4c3c0597563b289c0512e98d4ce81f886e/tensorflow/core/framework/tensor_key.h#L53-L64
- https://github.com/tensorflow/tensorflow/releases/tag/v2.8.1
- https://github.com/tensorflow/tensorflow/releases/tag/v2.9.0
Related news
TensorFlow is an open source platform for machine learning. In version 2.8.0, the `TensorKey` hash function used the total estimated `AllocatedBytes()`, which (a) is an estimate per tensor, and (b) is a very poor hash function for constants (e.g. `int32_t`). It also tried to access individual tensor bytes through `tensor.data()` of size `AllocatedBytes()`. This led to ASAN failures because `AllocatedBytes()` is an estimate of the total bytes allocated by a tensor, including any pointed-to constructs (e.g. strings), and does not refer to contiguous bytes in the `.data()` buffer. The discoverers could not use this byte vector anyway because types such as `tstring` include pointers, whereas they needed to hash the string values themselves. This issue is patched in TensorFlow versions 2.9.0 and 2.8.1.