Headline
CVE-2023-45287: proposal: math/big: support for constant-time arithmetic · Issue #20654 · golang/go
Before Go 1.20, the RSA based TLS key exchanges used the math/big library, which is not constant time. RSA blinding was applied to prevent timing attacks, but analysis shows this may not have been fully effective. In particular it appears as if the removal of PKCS#1 padding may leak timing information, which in turn could be used to recover session key bits. In Go 1.20, the crypto/tls library switched to a fully constant time RSA implementation, which we do not believe exhibits any timing side channels.
Problem: Constant-Time Arithmetic for Cryptographic Uses
The math/big package naturally and inevitably gets used for cryptographic purposes, including in the standard Go crypto libraries. However, this usage is currently unsafe because math/big does not support constant-time operation and thus may well be leaking secret keys and other sensitive information via timing channels. This is a well-known problem already documented in math/big’s godoc documentation.
A much more specific issue related to this was raised in 2011 (#2445) but eventually closed for lack of attention for too long.
See the preliminary companion patch 45490 presenting a first-cut at an implementation of this proposal: https://go-review.googlesource.com/c/45490/ But the most important details and considerations are discussed here.
Alternative Approaches to Hardening Go’s Cryptographic Code
There are several potential ways to address this weakness in Go’s cryptographic packages, such as:
Change all crypto code to use different big-number-arithmetic libraries that do support constant-time operation. But I’m not aware of any such general-purpose constant-time math library currently in existence for Go, and creating one will likely entail significant work, which as far as I’m aware no one has done yet (or is working on?).
Change all crypto code to use hand-coded constant-time implementations of arithmetic operations specialized to particular groups, elliptic curves, etc., like the crypto/elliptic package currently does specifically for the P256 curve (but not P224, P384, or P521). This approach may yield maximum performance, but again takes considerable effort, and lacks development scalability because this effort is specific to each particular curve or cryptographic scheme of interest. Perhaps most importantly, this effort has not actually been invested - i.e., it’s one of those things “someone really should do” but no one actually does - thus leaving only the “most popular” curve (P256) with an arguably secure implementation and most of the rest of Go’s cryptographic code in a decidedly risky state, a potential security vulnerability minefield.
Change math/big to support constant-time operation within its general-purpose arithmetic framework. This of course is not completely trivial either, but upon some examination does not appear to be either inordinately difficult or particularly invasive. So exploring that option is the purpose of this issue.
Proposed API Modifications for Constant-Time Support
The essence of this proposal, from an API perspective, is simply to add one new property, represented by a getter-setter pair, to the big.Int type, to enable and configure constant-time operation. The key requirement this API must address is that constant-time arithmetic inherently requires the math library to know the maximum possible/allowed size of the result to be produced in a given context, instead of just using the “smallest” internal buffer that can hold the particular numeric value of the result.
A completely tentative signature for the API addition is follows:
func (x *Int) BitCap() int
func (x *Int) SetBitCap(cap int) *Int
A caller invokes SetBitCap on an Int instance to configure that instance to request constant-time operation for subsequent operations invoked on this instance (x), i.e., with x as the destination. The caller must specify via the ‘cap’ argument the maximum number of bits (i.e., “bit-capacity”) of results that will need to be computed into this Int instance. Once configured in this way, the big.Int implementation’s promise to the caller is that for big.Int operations invoked on this instance that support constant-time operation (and not all do or will necessarily be expected to), those operations will be performed such that (a) it will always produce an output result in a buffer sized to hold ‘cap’ bits regardless of the numeric value of the result, and (b) the arithmetic operation’s timing will depend only on the lengths and not the numeric contents of all input operands.
For cryptographic use, it is the responsibility of the caller to ensure that all big.Int instances holding secret keys or other sensitive information, or values computed using such secrets, are configured properly to a fixed size via SetBitCap() before loading sensitive information into them (e.g., via SetBytes or arithmetic operations). When performing an arithmetic operation such as Add, Mul, or Exp on a destination big.Int configured for constant-time operation, the API does not require that all inputs be configured to be the same size, or necessarily to be configured via SetBitCap at all. For example, if crypto code calls z.Add(x,y) in a situation in which operand x is secret/sensitive but operand y is a well-known or otherwise insensitive public value, then operand x should have been likewise configured for constant-time via SetBitCap, but the insensitive operand y need not be. Thus, this API does not create a “separate and incompatible” kind of big.Int for constant-time operation, and allows constant-time and variable-time big.Int instances to be used together as appropriate depending on the sensitivity of particular values in a cryptographic computation. This of course places a burden on the calling crypto code to keep track of which values are sensitive and which aren’t, but the developers of crypto code generally face that burden anyway.
If the caller uses x.SetBitCap to configure a particular bit-capacity, but then invokes an operation (e.g., x.Add) that ends up computing a result too large for the configured bit-capacity, then the big.Int arithmetic operation panics if it detects the too-large result. The modified big.Int API does not promise to detect all such too-large results, however, because it internally rounds the requested bit-capacity up to the next machine word. (Note: if it is deemed desirable to enforce bit-capacity exactly, that would probably not be particularly difficult or expensive to do, so it’s worth considering as an option. A point for discussion.)
The preliminary, basis-for-discussion patch (https://go-review.googlesource.com/c/45490/) includes a change to the crypto/dsa package illustrating the use of the above API. Note that this is just a temporary “for example” for API discussion purposes. The patch does not even pretend to try to revise all the crypto packages, its example revision of the dsa package itself might still be imperfect/incomplete, and revising each of the crypto packages to support constant-time operations should probably be handled as separate changes once the big.Int API is decided on. The main point is that the code incorporates calls to SetBitCap on values such as ‘k’ that hold secret keys and sensitive intermediate values, while leaving big.Ints that hold only public values (such as DSA parameters) in the default variable-time configuration. This is intended to give the impression of the extent of changes I expect would be typically required for generic crypto code such as this to support constant-time operation with the proposed API.
Constant-Time Arithmetic Implementation Challenges
Although the API above is hopefully fairly simple and backward-compatible, we of course face several challenges and decisions in delving into the actual implementation details within the math/big package:
We don’t want the constant-time arithmetic support to slow down non-constant-time arithmetic significantly, e.g., for general-purpose arithmetic that has nothing to do with crypto.
We would prefer that the internal code modifications not be too overly invasive, or to result in extensive code duplication (e.g., a constant-time and non-constant-time version of everything).
The internal ‘nat’ type, which implements most of the arithmetic primitives that big.Int is actually built on, currently assumes fairly extensively that all ‘nat’ values are normalized, meaning the most-significant word in the slice is nonzero. Supporting constant-time operation requires relaxing this assumption at least during constant-time operation, since we need to ensure that ‘nat’ instances containing sensitive data are sized with respect to public information (i.e., the configured bit-capacity) irrespective of the specific numeric value that ‘nat’ instance holds (and in particular how many words it minimally requires to represent that numeric value).
A second-order challenge is that the internal ‘nat’ type is not defined as a struct but merely as a type-alias for a Word slice. This is a problem because the most natural way to “propagate” the bit-capacity configuration from big.Int down into big.nat for constant-time arithmetic purposes would be simply to add a configuration field to the nat type - but that’s not directly possible given that nat is defined as a word-slice and not a struct. We could change nat into a struct, which would fortunately be an invisible to the external API, but would be a rather invasive change throughout the entire math/big package since so much internal code assumes that nat is a word-slice.
Implementation Approach
There are many likely-reasonable approaches to address these challenges, but here are the decisions and major changes reflected in the tentative, for-discussion patch (https://go-review.googlesource.com/c/45490/).
Normalized versus configured-length nat slices
To avoid duplicating much of the current ‘nat’ code into a separate-but-incompatible internal type for constant-time arithmetic, I chose simply to revise the ‘nat’ code and the code that uses it to avoid assuming that a ‘nat’ is always normalized: i.e., to make nat operations be tolerant of non-normalized inputs. In most cases this proves to be not particularly difficult. For example, nat.sub() must be changed no longer to assume that it’s always an underflow in the case (m < n) where the first nat is a shorter word-slice than the second, because the second might now be an un-normalized word-slice with a value smaller than the value in the first. The modified ‘nat’ library still produces normalized values in the case of default, non-constant-time operation, but merely no longer requires or expects this always to be the case.
Since constant-time operations need to produce nat slices of a fixed size potentially larger than their normalized minimum size, we need to parameterize those operations somehow, but as discussed above we cannot easily just add a configuration field to ‘nat’ because nat isn’t a struct and it would be an extremely invasive change to make it one. Thus, the solution I adopted is to add a ‘zcap’ parameter to ‘nat’ operations that (now) support constant-time operation. The caller can pass 0 for the zcap parameter to indicate traditional, variable-time operation.
However, since even changing all calls to ‘nat’ operations (nat.add, nat.mul, etc) throughout the library to add a ‘0’ parameter to all variable-time invocations of these operations would likewise be a fairly invasive change, I (tentatively) chose to minimize this invasiveness, at the expense of a little more code in nat.go, by creating new zcap-parameterized methods named with a ‘c’ prefix (nat.cadd, nat.csub, etc), and including corresponding backward-compatibility methods with the original unmodified signatures that simply call the new ‘c’-prefixed methods with 0 for the zcap parameter. I make no suggestion that this is the ideal solution for the long term; it was intended only to keep the invasiveness of the changes relatively minor for now, and I am completely open in terms of the “right” way to parameterize nat with a configurable bit-capacity.
Constructing and normalizing nat instances
The main, global effect of the new ‘zcap’ parameter to the constant-time calls is to parameterize nat.cmake and nat.cnorm, the new versions of nat.make and nat.norm to support constant-time operation. In particular, nat.cmake now ensures that the nat byte-slice it allocates is big enough for either the caller’s indicated expected maximum result size (n) or for the configured zcap, whichever is larger.
Most importantly, nat.cnorm now normalizes a slice down to the minimum representation size if zcap == 0, but when zcap > 0 it “normalizes” it down to exactly the size corresponding to zcap. Thus, support for constant-time operation effectively just changes the definition of “normalization” slightly: it means “normalize to minimum size” when zcap == 0 and “normalize to exactly zcap size” when zcap > 0. In many parts of the nat code, simply passing along the zcap parameter down to cmake before an operation and cnorm at the end is “mostly” enough to address the buffer-management challenges that constant-time operation presents.
Conditionally preserving variable-time optimizations
Throughout the implementations of various arithmetic operations mostly in nat.go, there are currently optimizations here that are inherently non-constant-time. For example, basicMul avoids calling addMulVVW at all for multiplicand words that happen to be zero. Similarly, mul normalizes intermediate results in several places during its calculations. To avoid hurting the performance of big-integer arithmetic for general non-cryptographic purposes, I didn’t want to remove these optimizations entirely, so instead I made those optimizations conditional on a test for 'zcap > 0’. Thus, the modified big.Int code should not perform noticeably worse under default variable-time operation at least due to lost optimization opportunities.
Note that for strict enforcement of constant-time operation, these tests sometimes assume that the Go compiler preserves natural left-to-right evaluation order for the && and || operators, and I’m not familiar enough with the Go compiler to know yet whether that may be safely relied on. In the example of basicMul, we call addMulVVW ‘if zcap > 0 || d != 0’, which will be properly constant-time if the Go compiler either (a) evaluates this expression to a single ‘bool’ value and then performs only a single conditional branch on the result of the full expression, or (b) uses two conditional branch, first branching on the result of ‘zcap > 0’ and then branching on the result of ‘d != 0’ only if zcap == 0. If the Go compiler were to get overly smart and decide it wants to evaluate and branch on ‘d != 0’ first, however, then it would technically break constant-time operation (though in this case this would probably leave an extremely small and hard-to-exploit timing leak). However, this general risk of an overly-smart compiler interfering with developers’ attempts to express constant-time code is common and not remotely unique to Go.
Conditional calculations in constant time
Besides disabling variable-time optimizations when zcap > 0, a standard challenge in implementing constant-time arithmetic is dealing with conditionals that depend on computed data. For example, in each iteration nat.montgomery must test a data-dependent carry flag and perform a subtraction or not depending on its value. To address these situations, in the case of constant-time operation (zcap > 0) the code adopts the standard solution of computing both possible result values and performing a constant-time conditional copy or “select” of one or the other. The new nat.sel method provides such a constant-time select for internal use. Since computing two results and selecting just one of them can negatively impact performance, the code does this only in the zcap > 0 case, and retains the existing conditional behavior when zcap == 0.
Effects on Architecture-specific Arithmetic Code
In most cases, it appears that supporting constant-time operation in the math/big package should require no modifications to the optimized arithmetic code in architecture-specific assembly, because that code tends to be already designed to use architecture-specific mechanisms for handling carries and such without requiring conditionals.
Some of the generic versions of the arithmetic primitives - e.g., addWW_g - had variable-time implementations due to conditional carry handling, but I fixed those to use the same constant-time carry/overflow handling logic that the corresponding generic vector primitives (e.g., addVV) were already using.
I encountered only one “missing” arithmetic primitive needed for constant-time arithmetic, namely constant-time comparison. For this purpose I added a 'cmpVV_g’, which essentially performs the same operation as a ‘subVV_g’ except (a) it discards the subtraction’s normal numeric output, and hence does not need a target vector; and (b) in addition to computing the carry from the subtraction, it also produces an indicator of whether the result is equal to zero or not.
The current patch only uses this generic subVV_g implementation and does not add a corresponding architecture-specific subVV implementation for any architecture. Doing so should be quite easy and should be a relatively straightforward variation to the subVV code for any given architecture, but this did not seem like a high priority at the moment.
Limitations and Caveats
Once again, the current patch is intended to be a starting point for discussion. It’s definitely not yet a perfect or complete solution to the problem of supporting constant-time operation for cryptographic use. In particular, the current patch has at least the following known limitations/caveats (and probably others that I’ve missed or forgotten to mention):
It supports constant-time arithmetic only for certain big.Int operations, currently listed in the godoc for Int.SetBitCap. (I considered adding godoc notes to each of the relevant operations individually about whether or not they currently support constant-time operation, but that seemed unnecessarily invasive at least for now.)
The code is currently documented as guaranteeing constant-time operation only when the target’s zcap > 0 and all operands and the result are non-negative integers. This minimizes the invasiveness of changes required to all the sign-handling code throughout int.go, and is fairly compatible with the needs of most cryptographic code, which tends to use modular arithmetic on non-negative integers anyway. Most of the sign-handling logic in int.go could in principle be made constant-time if deemed necessary, but sometimes at the performance cost of computing two results and selecting one of them (e.g., the equal-sign and opposite-sign cases in Add and Sub), and given the typical needs of cryptographic computations it’s not clear that constant-time sign-handling would be worth either the development effort or performance costs.
Although the patch adds constant-time arithmetic support for most of the Int and nat arithmetic operations important for crypto (Add, Sub, Sul, Exp), the remaining “crypto-critical” operation that has not yet been converted this way is Div, needed most importantly for its Mod (modular reduction) functionality. This should not be that hard to achieve for the case when only the dividend may contain sensitive data and not the divisor/modulus, and I believe this is the most common situation in most crypto code (where the dividend typically contains secrets but the divisor/modulus is just a public parameter such as the field size or group order). Making Div constant-time in both its operands (dividend and divisor) may be significantly more tricky, since Knuth’s division algorithm strongly depends on the divisor being normalized such that the most-significant-bit is 1. But for most cryptographic purposes it seems likely acceptable for Div to be constant-time only in the dividend and not the divisor (provided of course it’s clearly documented as such).
Another operation currently missing constant-time support is Int.Bytes(). This case presents no significant technical challenge, but rather the API question as to how exactly it should behave when the Int is configured for zcap > 0. I’m inclined to think it should always return a byte-slice exactly large enough to hold the number of bits set by SetBitCap(). But implementing this behavior would mean that SetBitCap can no longer round up to the next even number of words, but must store the bit-capacity in terms of bytes or bits, with all calls down into nat functions doing the conversion.
While all of the standard tests and benchmarks work in the case of variable-time operation, and I have added some tests specifically for constant-time operation (e.g., tests for arithmetic on non-normalized nat inputs), there are not yet extensive tests or benchmarks specifically for the constant-time case.
In the longer term, a related wishlist item would be to have a “modular integer” type, e.g., ModInt, which gets configured for modular arithmetic using a fixed modulus and automatically provides constant-time arithmetic support based on the size of that modulus. Such a facility would both make common modular-arithmetic crypto code simpler (by the caller not needing to call Mod after every operation), and would facilitate more optimized but still general-purpose arithmetic relying on caching information about a fixed modulus, such as Barrett-style modular reduction. The DEDIS advanced crypto library already implements a “modular integer” type of this form (see nist.Int at https://godoc.org/github.com/dedis/kyber/nist), but it is currently non-constant-time because it depends on big.Int, which is non-constant-time. But figuring out whether and what kinds of extensions would be appropriate specifically to support modular arithmetic is probably best left for a separate issue and discussion.
Apologies for the lengthy writeup. At any rate, I hope that this discussion and the corresponding preliminary patch illustrate that although not by any means trivial, it appears to be feasible to make math/big support constant-time operation for cryptographic use with (a) minimal public API change, (b) no significant performance penalty for non-cryptographic variable-time operation, and © not too invasive changes even internally within math/big outside of the nat.go and int.go files. Further, even with the patch’s current known limitations, I would suggest that it already makes math/big a lot safer for cryptographic uses than it currently is, and hence represents a step forward in security.
Comments/discussion welcome. Thanks.
Related news
Red Hat Security Advisory 2024-4429-03 - An update for containernetworking-plugins is now available for Red Hat Enterprise Linux 9.2 Extended Update Support.
Red Hat Security Advisory 2024-1859-03 - OpenShift API for Data Protection 1.3.1 is now available. Issues addressed include a denial of service vulnerability.
Red Hat Security Advisory 2024-0281-03 - Secondary Scheduler Operator for Red Hat OpenShift 1.2.1 for RHEL 9. Issues addressed include a denial of service vulnerability.
Red Hat Security Advisory 2024-1078-03 - An update is now available for Service Telemetry Framework 1.5.4. Issues addressed include a denial of service vulnerability.
Red Hat Security Advisory 2024-0269-03 - An update for run-once-duration-override-container, run-once-duration-override-operator-bundle-container, and run-once-duration-override-operator-container is now available for RODOO-1.1-RHEL-9. Issues addressed include a denial of service vulnerability.
Red Hat Security Advisory 2024-0748-03 - An update for the container-tools:4.0 module is now available for Red Hat Enterprise Linux 8. Issues addressed include a denial of service vulnerability.