Confidential Containers on Azure with OpenShift: A technical deep dive
Red Hat OpenShift sandboxed containers has taken a significant step forward in workload and data security by adopting the components and principles of the CNCF Confidential Containers (CoCo) open source project and the underlying Trusted Execution Environment (TEE) technology. The first blog in the series introduced the OpenShift sandboxed containers with support for confidential containers solution on Microsoft Azure and targeted use cases.
In this blog, we’re focusing on the specifics of the CoCo components. We’ll break down the major elements, delve into the remote attestation process and secure key release for the application, and highlight the role of the Azure Confidential VM’s (CVM) virtual Trusted Platform Module (vTPM) with AMD SEV-SNP protection.
Read on for a detailed exploration of how confidential containers leverage these technologies to protect your sensitive code and data while in use.
CoCo building blocks for OpenShift in Azure
The following diagram shows the components that enable an OpenShift cluster to run confidential containers:
Figure 1: OpenShift confidential containers components in Azure
The worker nodes are equipped with the Kata runtime with remote hypervisor support, which works in conjunction with the cloud-api-adaptor (a.k.a. peer pods) to create pods backed by confidential virtual machines (DCasv5 and ECasv5 Azure VMs).
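Scheduling a workload onto such a VM only requires selecting the corresponding runtime class in the pod spec. The Go sketch below renders a minimal pod manifest for this; the runtime class name kata-remote and the container image are assumptions for illustration, so check the RuntimeClasses created by the OpenShift sandboxed containers Operator on your cluster.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	// The runtime class name is an assumption; check the RuntimeClasses
	// created by the OpenShift sandboxed containers Operator on your cluster.
	runtimeClass := "kata-remote"

	pod := corev1.Pod{
		TypeMeta:   metav1.TypeMeta{APIVersion: "v1", Kind: "Pod"},
		ObjectMeta: metav1.ObjectMeta{Name: "confidential-app"},
		Spec: corev1.PodSpec{
			// Selecting this runtime class routes the pod to the Kata remote
			// runtime, so it is backed by a peer-pod confidential VM.
			RuntimeClassName: &runtimeClass,
			Containers: []corev1.Container{{
				Name:  "app",
				Image: "registry.example.com/confidential-app:latest", // illustrative image
			}},
		},
	}

	out, _ := yaml.Marshal(pod)
	fmt.Print(string(out))
}
```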
Let’s look at the components shown in Figure 1 before we dive into how attestation is performed.
Kata runtime: The container runtime that can bring up lightweight virtual machines to run the pod’s containers, instead of using runc to invoke Linux containers on the same worker node.
Cloud API adaptor: An administrative application, deployed onto an OpenShift cluster as a daemonset. It manages the lifecycle of cloud instances and facilitates communication between the cluster and those instances.
Kata-Agent: The guest VM component of Kata, responsible for bringing up containers inside the virtual machine (VM).
Agent protocol forwarder: A service running inside the VM to expose kata-agent over a TCP socket.
Attestation Agent (AA): A process on the VM that provides a local facility for secure key release to the pod. It exposes its functionality as a gRPC service. Acting as the Attester in a Request-Challenge-Attestation-Response exchange, it gathers evidence to prove the confidentiality of its environment. The evidence is then passed to the Key Broker Service (described below), along with the request for a specific key.
Key Broker Service (KBS): A discrete, remotely deployed service acting as a Relying Party. It manages access to a set of secret keys and releases those keys depending on the authenticity of the Evidence provided by the Attester and conformance with pre-defined policies. The KBS is the remote attestation entry point; it integrates the Attestation Service (described below) to verify Trusted Execution Environment (TEE) evidence. The KBS supports two attestation modes: a “Native Attestation Service” mode, which integrates the AS library crate, and a “Remote Attestation Service” mode, which connects to a standalone gRPC Attestation Service server.
Attestation Service (AS): A component responsible for verifying that the Evidence provided to the KBS by an Attester is genuine. The AS verifies the evidence signature and the Trusted Computing Base (TCB) described by that evidence. Upon successful verification, it extracts facts about the TEE from the given Evidence and provides them back to the KBS as uniform claims. It can be deployed as a discrete service or integrated as a module into a KBS deployment.
Note that the Confidential Containers project adheres to the procedures and terminology described in the IETF Remote Attestation Procedures (RATS) architecture. In RATS terminology, the Attestation Agent is the Attester, the Key Broker Service is the Relying Party and the Attestation Service is the Verifier. For more details, you can refer to the following blog. The following table describes where the individual components reside and whether they are part of the TCB, the foundation for confidential workloads that we consider trustworthy.
| Component | TCB | Runs In |
| --- | --- | --- |
| Kata-Runtime | ❌ | OpenShift worker node |
| Cloud API Adaptor | ❌ | OpenShift worker node |
| Kata-Agent | ✅ | Peer-pod VM (Confidential VM) |
| Agent Protocol Forwarder | ✅ | Peer-pod VM (Confidential VM) |
| Attestation Agent (AA) | ✅ | Peer-pod VM (Confidential VM) |
| Key Broker Service (KBS) | ✅ | Customer’s premises or hosted services |
| Attestation Service (AS) | ✅ | Customer’s premises or hosted services |
Attestation flow
The following diagram describes the attestation workflow:
Figure 2: Attestation workflow
We have an application running inside a confidential pod (backed by a confidential VM) that requires a secret key. The Confidential VM’s (CVM’s) Attestation Agent (AA) gRPC API is invoked [1] with a GetResourceRequest, specifying a resource identifier and a URL pointing to a Key Broker Service (KBS) installation. The AA requests [2] the given resource (e.g. an application secret key) from the KBS. The KBS answers with a cryptographic nonce [3], which must be embedded in the Evidence to ensure this particular exchange cannot be replayed. The KBS protocol is vendor- and hardware-agnostic and supports a simple, HTTP-based, OpenAPI 3.1 compliant API. The AA sends the Evidence [4] to the KBS, and on successful verification [5] the KBS releases the key to the AA [6]. The AA then provides the key to the application [7].
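To make this exchange more concrete, here is a minimal Go sketch of the KBS protocol from the client’s perspective. The endpoint paths, version string, TEE identifier and message shapes are simplified assumptions based on the KBS protocol at the time of writing, and the returned resource is shown unwrapped (in practice it is encrypted for the attested client); consult the KBS protocol specification for the authoritative details.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/cookiejar"
)

// challenge is a simplified view of the KBS Challenge message.
type challenge struct {
	Nonce string `json:"nonce"`
}

func main() {
	kbs := "https://kbs.example.com" // illustrative KBS URL

	// The exchange is session-based: keep cookies between the calls.
	jar, _ := cookiejar.New(nil)
	client := &http.Client{Jar: jar}

	// [2]+[3] Request/Challenge: start an attestation session; the KBS
	// answers with a nonce that must be bound into the evidence.
	// Version and TEE identifier below are illustrative.
	reqBody, _ := json.Marshal(map[string]any{
		"version": "0.1.0", "tee": "az-snp-vtpm", "extra-params": "",
	})
	resp, err := client.Post(kbs+"/kbs/v0/auth", "application/json", bytes.NewReader(reqBody))
	if err != nil {
		panic(err)
	}
	var ch challenge
	json.NewDecoder(resp.Body).Decode(&ch)
	resp.Body.Close()

	// [4] Attestation: send the evidence (vTPM quote, SNP report, AK, VCEK)
	// gathered with the nonce embedded. gatherEvidence is a placeholder.
	resp, err = client.Post(kbs+"/kbs/v0/attest", "application/json", bytes.NewReader(gatherEvidence(ch.Nonce)))
	if err != nil {
		panic(err)
	}
	resp.Body.Close()

	// [6] Resource retrieval: once the session is attested, the requested
	// key can be fetched by its resource identifier (repository/type/tag).
	resp, err = client.Get(kbs + "/kbs/v0/resource/my-repo/app-key/1")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("resource status:", resp.Status)
}

// gatherEvidence stands in for the Attestation Agent's evidence collection.
func gatherEvidence(nonce string) []byte {
	b, _ := json.Marshal(map[string]any{"tee-evidence": "...", "nonce": nonce})
	return b
}
```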
The following sections will go deeper into the Evidence generation and related aspects of the attestation workflow.
A behind-the-scenes look at attestation in the Azure CVM
The Azure Confidential Virtual Machine (CVM) components involved in the attestation process are depicted in the diagram below:
Figure 3: Azure Confidential VM on AMD SEV-SNP host
Let’s go over the components shown in Figure 3:
Host Compatibility Layer (HCL)
The Host Compatibility Layer (HCL) is a firmware component that is used to manage a set of virtual devices in an Azure CVM. One of the devices that the HCL manages is a virtual TPM device (vTPM).
Virtual TPM device (vTPM)
A virtual TPM device (vTPM) is a software implementation of a Trusted Platform Module (TPM). A TPM is a hardware security module that provides cryptographic operations and other security features. The vTPM in an Azure CVM is part of the remote attestation procedure, which is a process of verifying the trustworthiness of the VM.
Virtual Machine Privilege Levels (VMPL)
Memory access on AMD SEV-SNP VMs can be segmented according to VMPL. Because the HCL executes at a higher privilege level (VMPL 0), the guest (VM) OS (VMPL 2) is restricted to accessing the vTPM through its public interface, while the vTPM internals remain opaque to it.
Versioned Chip Endorsement Key (VCEK)
The VCEK is a key that is embedded in the AMD Platform Security Processor (PSP). The PSP is a security processor that is used to protect the confidentiality and integrity of the CPU. The VCEK is used to sign attestation reports that are generated by the PSP.
SNP Attestation Report (AR)
The SNP AR is a document that contains information about the state of the VM. It is retrieved from the AMD PSP and signed with the VCEK. For more details, refer to the following whitepaper.
Attestation Key (AK)
The AK is a TPM signing key derived from the Endorsement Key (EK), an asymmetric key pair that is unique to each vTPM and stored in the vTPM’s non-volatile memory. The AK is used to sign vTPM quotes.
vTPM quote generation is described in the diagram below:
Figure 4: vTPM quote generation in an Azure Confidential VM
Platform Configuration Registers (PCR)
A TPM maintains a set of numbered PCRs that store hashes in a particular fashion. A PCR cannot be written directly; it can only be updated with an “extend” operation that folds a new measurement into the register’s current hash value. This mechanism is used to build a chain of trust over the components of the boot process.
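The following Go sketch illustrates the extend semantics for a SHA-256 PCR bank. The staged measurements are invented for illustration and do not reflect the actual measurement layout used by the CVM firmware.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// extend models a TPM PCR extend: the register can never be set directly,
// only folded forward as new = Hash(old || measurement).
func extend(pcr [32]byte, measurement []byte) [32]byte {
	h := sha256.New()
	h.Write(pcr[:])
	h.Write(measurement)
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	var pcr [32]byte // PCRs start zeroed at boot

	// Each boot stage measures the next one before handing over control,
	// producing a hash chain that depends on every prior measurement.
	for _, stage := range []string{"firmware", "bootloader", "kernel", "initrd"} {
		digest := sha256.Sum256([]byte(stage))
		pcr = extend(pcr, digest[:])
		fmt.Printf("after %-10s PCR = %x\n", stage, pcr)
	}
}
```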
vTPM Quote
A vTPM can certify a selection of PCRs by signing them with its AK. This signed statement of the PCRs is referred to as the quote. When requesting a TPM quote, a nonce can be provided: a random number with specific traits that is mixed into cryptographic exchanges to prevent replay attacks. The operation produces a signed “message” containing a digest of the PCR values and the provided nonce. A vTPM quote consists of the message and its signature.
Attestation Evidence
Once the vTPM quote is available, it’s used by the Attestation Agent to produce the Evidence. Attestation Evidence is a collection of data used to verify the system’s trustworthiness. The Evidence that is used to verify the trustworthiness of an Azure CVM is composed of the following:
- An SNP AR
- The public part of the vTPM’s AK
- A TPM quote of the vTPM’s PCRs and the KBS-provided nonce
- A PEM-encoded X.509 VCEK Certificate
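As a rough mental model, the Evidence can be pictured as a bundle like the Go struct below. The field names are illustrative and do not correspond to the actual wire format used by the CoCo components.

```go
package main

import "fmt"

// Evidence is an illustrative view of the attestation bundle; the real
// serialization is defined by the Attestation Agent and Attestation Service.
type Evidence struct {
	SNPReport   []byte // SNP Attestation Report, signed by the VCEK
	AKPub       []byte // public part of the vTPM Attestation Key
	TPMQuote    []byte // quote message over the selected PCRs and the KBS nonce
	TPMQuoteSig []byte // signature over the quote message, made with the AK
	VCEKCertPEM []byte // PEM-encoded X.509 VCEK certificate
}

func main() {
	fmt.Printf("evidence bundle: %+v\n", Evidence{})
}
```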
The components involved in generating the Evidence in an Azure CVM are shown in the diagram below:
Figure 5: Evidence generation in an Azure Confidential VM
As you can see in Figure 5, the Evidence is generated by the following steps:
- HCL retrieves an AR from the PSP, supplying the vTPM’s AK as report data. Report data is a block of arbitrary data supplied by the guest VM as part of the report request.
- The vTPM generates a quote from the TPM PCRs and the KBS-provided nonce.
- The Azure Instance Metadata Service (IMDS) provides a VCEK certificate of the host CPU.
- The Attestation Agent sends the Evidence to the KBS for verification.
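For the IMDS step above, the VCEK certificate can be fetched from inside the CVM. The Go sketch below is a minimal example; the THIM endpoint path and the response field names reflect commonly documented values and should be treated as assumptions to verify against current Azure documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// IMDS is only reachable from inside the Azure VM. The THIM path below
	// is an assumption based on commonly documented examples.
	req, _ := http.NewRequest("GET", "http://169.254.169.254/metadata/THIM/amd/certification", nil)
	req.Header.Set("Metadata", "true")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)

	// The response carries the VCEK certificate and its issuing chain,
	// which the Attestation Agent bundles into the Evidence.
	var certs struct {
		VcekCert         string `json:"vcekCert"`
		CertificateChain string `json:"certificateChain"`
	}
	if err := json.Unmarshal(body, &certs); err != nil {
		panic(err)
	}
	fmt.Println(certs.VcekCert)
}
```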
Verification of Evidence by KBS
The KBS passes the received Evidence to the Attestation Service (AS) component to verify if the evidence is genuine ([5] in Figure 2). The individual verifications performed on the evidence are:
- The TPM quote has been signed by the vTPM’s AK (whose public part is included in the Evidence)
- The VCEK has been signed by AMD
- The AR has been signed by the VCEK
- The AR has been retrieved at privilege level VMPL 0
- The hash of the TPM AK matches the AR’s Report Data field
The AS then returns a positive or negative Attestation Response to the KBS.
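To make the order of these checks concrete, here is a Go sketch of the verification logic. The evidence layout and every parsing and signature helper are placeholders standing in for real TPM, X.509 and SNP handling; this is not the actual Attestation Service implementation.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
)

// evidence mirrors the bundle described above; field names are illustrative.
type evidence struct {
	snpReport   []byte
	akPub       []byte
	tpmQuote    []byte
	tpmQuoteSig []byte
	vcekCertPEM []byte
}

// The helpers below are placeholders for real TPM, X.509 and SNP parsing;
// they exist only to make the order of checks explicit.
func verifyQuoteSignature(akPub, quote, sig []byte) bool     { return true }
func verifyCertChain(vcekPEM []byte, amdRoots [][]byte) bool { return true }
func verifyReportSignature(vcekPEM, report []byte) bool      { return true }
func reportVMPL(report []byte) int                           { return 0 }
func reportData(report []byte) []byte                        { return report }

// verifyEvidence sketches the checks performed for step [5] in Figure 2.
func verifyEvidence(e evidence, amdRoots [][]byte, nonce []byte) error {
	if !verifyQuoteSignature(e.akPub, e.tpmQuote, e.tpmQuoteSig) {
		return errors.New("quote not signed by the supplied AK")
	}
	if !verifyCertChain(e.vcekCertPEM, amdRoots) {
		return errors.New("VCEK does not chain to AMD's root of trust")
	}
	if !verifyReportSignature(e.vcekCertPEM, e.snpReport) {
		return errors.New("attestation report not signed by the VCEK")
	}
	if reportVMPL(e.snpReport) != 0 {
		return errors.New("attestation report not generated at VMPL 0")
	}
	akDigest := sha256.Sum256(e.akPub)
	if !bytes.Contains(reportData(e.snpReport), akDigest[:]) {
		return errors.New("AK hash does not match the report data field")
	}
	if !bytes.Contains(e.tpmQuote, nonce) {
		return errors.New("KBS nonce missing from the quote")
	}
	return nil
}

func main() {
	fmt.Println(verifyEvidence(evidence{}, nil, []byte("example-nonce")))
}
```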
KBS releases the key (secret)
On a positive Attestation Response, the KBS retrieves the requested key resource from its storage repository and returns it to the Attestation Agent ([6] in Figure 2). The AA passes it back ([7] in Figure 2) as the response to the container’s initial gRPC request, and the application can then consume the secret.
Usage patterns
Here we discuss some of the usage patterns enabled by the attestation and key release workflow described in the previous sections. Both patterns involve talking to the Attestation Agent, and the process of attestation and secure key release after calling the Attestation Agent is the same in both cases, as discussed earlier.
Application-driven secure key release
The following diagram describes the application-driven secure key release scenario:
Figure 6: Secure key release to application after attestation
In this scenario, a generic application runs in the TEE, but it needs access to data that is encrypted or stored in a location that can only be accessed using a secret. In such cases, the pod’s init container or a sidecar can perform the “secure key release” on behalf of the application, so the application can start processing the data.
Using a sidecar or init container enables “lift and shift” scenarios where the application remains agnostic of how the data it processes is protected. If the workload was previously running in a non-TEE environment, it can run here without changes to the codebase.
This does not prevent an application from implementing the key release logic itself. Kubernetes simply enables scenarios where common functionality such as key release can be abstracted away, keeping applications nimble and interoperable by moving the infrastructure logic into special-purpose containers running as sidecars or init containers.
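As a rough illustration of the pattern, the Go sketch below could run as an init container: it fetches the released key on behalf of the application and writes it to a shared volume. The call to the local Attestation Agent is stubbed out, and the environment variable, resource identifier and mount path are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// fetchKeyFromAttestationAgent is a placeholder for the gRPC GetResourceRequest
// call to the local Attestation Agent described earlier ([1] in Figure 2).
func fetchKeyFromAttestationAgent(kbsURL, resourceID string) ([]byte, error) {
	return []byte("released-secret"), nil
}

func main() {
	// Illustrative values; in a real deployment these would come from the
	// pod spec (environment variables) and the KBS resource naming scheme.
	kbsURL := os.Getenv("KBS_URL")
	resourceID := "my-repo/app-key/1"

	key, err := fetchKeyFromAttestationAgent(kbsURL, resourceID)
	if err != nil {
		panic(err)
	}

	// The init container drops the released key onto a shared emptyDir
	// volume; the unmodified application container reads it from there.
	if err := os.WriteFile("/secrets/app-key", key, 0o600); err != nil {
		panic(err)
	}
	fmt.Println("key released and written for the application container")
}
```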
Encrypted container images
Another way to ensure your application only runs in a valid TEE is to use encrypted container images and decrypt them inside the CVM, as shown in the following diagram:
Figure 7: Secure key release to kata-agent after attestation for decrypting container images
In this scenario, the kata-agent can run the workload only if it can decrypt the container images, and it can do that only after a successful secure key release of the image decryption key.
Conclusion and outlook
This blog provided a technical overview of confidential containers with OpenShift on Azure. We described the attestation and key release flow in the peer-pod VM backed by an Azure Confidential VM. While we’ve covered the essential steps in this process, there are further aspects of the broader confidential containers workflow that weren’t covered here. As the technology evolves and stabilizes, we’ll explore other critical components in greater detail in future blogs.
Stay tuned for our upcoming blogs on measurements, reference values and secure key storage, which are essential to ensuring a trusted environment and secure key release flow.