  1. OpenShift Container Platform (OCP) Strategy
  2. OCPSTRAT-2711

Confidential Clusters with remote attestation - Phase II


    • Product / Portfolio Work
    • OCPSTRAT-2023 OpenShift Confidential Clusters

      Feature Overview (aka. Goal Summary)  

      Deliver the Confidential Cluster Operator as a Developer Preview for OpenShift, enabling early adopters to deploy confidential clusters on Microsoft Azure using AMD SEV-SNP technology.

      This feature provides the first working implementation of automated node attestation and confidential computing capabilities across an entire OpenShift cluster, validated through Azure-specific deployment patterns and integrated with Red Hat build of Trustee for remote attestation.

      Developer Preview enables technical validation and early customer feedback, and it de-risks the path to General Availability while keeping the scope focused on a single cloud provider and TEE technology.

      Goals (aka. expected user outcomes)

      Primary User Types/Personas:

      • Early Adopter Customers (Financial Services, Healthcare, Government): Can deploy and test OpenShift clusters where every node is hardware-attested using AMD SEV-SNP on Azure, validating the solution for their confidential computing requirements
      • OpenShift Field Engineers & Solution Architects: Have a working Developer Preview to demonstrate confidential cluster capabilities and gather customer feedback
      • Partner Teams (Microsoft Azure): Can collaborate on joint customer engagements and validate Azure confidential VM integration with OpenShift
      • QE/Testing Teams: Have automated tests validating confidential cluster installation, attestation flows, and node lifecycle on Azure

      Observable Functionality:

      • Users can install a new OpenShift cluster on Azure where all worker nodes are deployed as Azure Confidential VMs with AMD SEV-SNP enabled
      • The Confidential Cluster Operator automatically attests each node using Red Hat build of Trustee before admitting it to the cluster
      • Nodes that fail attestation are prevented from joining the cluster
      • Users can add additional nodes to the cluster with automatic attestation enforcement
      • Cluster administrators can view attestation status through CLI/API (basic observability)
      • Developer Preview is installable through documented procedures with known limitations clearly communicated
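
The admission behavior listed above (attest each node, admit only nodes that pass, keep failing nodes out) can be sketched as a small decision function. This is a minimal illustration, not the operator's actual logic: the `Evidence` and `Policy` types, their field names, and the three state names are assumptions made for the example, and report verification by Trustee is reduced to a boolean.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, List, Optional


class AttestationState(Enum):
    """Hypothetical per-node states; the real CRD status values may differ."""
    PENDING = "Pending"
    ATTESTED = "Attested"
    FAILED = "Failed"


@dataclass(frozen=True)
class Evidence:
    """Hypothetical summary of what the attestation service returned for a node."""
    tee_type: str            # e.g. "sev-snp"
    launch_measurement: str  # digest reported for the node
    report_verified: bool    # True if the attestation report was validated


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy describing acceptable platform states."""
    required_tee: str
    allowed_measurements: FrozenSet[str]


def evaluate(evidence: Optional[Evidence], policy: Policy) -> AttestationState:
    """Classify a node; a node without evidence stays pending."""
    if evidence is None:
        return AttestationState.PENDING
    if not evidence.report_verified:
        return AttestationState.FAILED
    if evidence.tee_type != policy.required_tee:
        return AttestationState.FAILED
    if evidence.launch_measurement not in policy.allowed_measurements:
        return AttestationState.FAILED
    return AttestationState.ATTESTED


def admissible_nodes(evidence_by_node: Dict[str, Optional[Evidence]],
                     policy: Policy) -> List[str]:
    """Return, in sorted order, only the nodes that passed attestation."""
    return sorted(
        name for name, evidence in evidence_by_node.items()
        if evaluate(evidence, policy) is AttestationState.ATTESTED
    )
```

The gate is deliberately fail-closed: anything other than an explicit pass (missing evidence, an unverified report, an unexpected measurement) keeps the node out of the admissible set.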

         

      Requirements (aka. Acceptance Criteria):

      Functional Requirements:

      1. Confidential Cluster Operator Implementation
        • Operator packaged and deliverable through OpenShift ecosystem catalog (OperatorHub or alternative distribution for Dev Preview)
        • Operator manages CRD(s) for confidential cluster configuration specific to Azure/AMD SEV-SNP
        • Operator integrates with OpenShift Machine Management to intercept node provisioning
        • Operator enforces attestation policy before node admission to cluster
        • Operator handles node lifecycle: initial attestation, node addition, attestation failure scenarios
        • Operator includes basic health checks and status reporting
      2. Azure AMD SEV-SNP Integration
      3. Red Hat build of Trustee Attestation Integration
        • Operator deploys Red Hat build of Trustee attestation service to verify node measurements
        • Support for both connected (internet-accessible Trustee) and restricted network (customer-deployed Trustee) scenarios
        • Attestation evidence collection includes: platform measurements, firmware versions, SEV-SNP attestation report
        • Configurable attestation policy defining acceptable platform states
        • Retry logic and timeout handling for attestation requests
      4. OpenShift Installation Integration
        • Support for IPI (Installer-Provisioned Infrastructure) installation on Azure
        • Install-time configuration to enable confidential cluster mode
        • Integration with openshift-install to configure confidential VM instance types
        • Bootstrap node attestation during initial cluster creation
        • Documentation of installation prerequisites and Azure subscription requirements
      5. Node Lifecycle Management
        • Automated attestation of nodes during initial cluster installation
        • Automated attestation when scaling worker pools (adding nodes)
        • Handling of attestation failures: node marked as not-ready, administrator notification
        • Re-attestation capabilities if node configuration changes
        • Integration with Machine Management for node replacement scenarios
      6. Observability & Troubleshooting
        • Operator logs attestation attempts, successes, and failures
        • CRD status fields expose attestation state per node
        • CLI commands to query attestation status: oc get confidentialcluster or similar
        • Basic metrics exposed for attestation success/failure rates
        • Troubleshooting guide for common attestation failures
      7. Testing & Validation
        • Automated CI/CD pipeline validating Azure confidential cluster installation
        • E2E tests covering: fresh install, node scaling, attestation failure scenarios
        • Negative testing: tampered nodes, invalid configurations, network failures
        • Test environment with Azure confidential VM quotas allocated
      8. Documentation for Developer Preview
        • Installation guide published outside the official OpenShift documentation, since Dev Preview features are not officially documented
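
Requirement 3 calls for retry logic and timeout handling for attestation requests. One possible shape for that is sketched below, under stated assumptions: exponential backoff, a fixed attempt limit, and an overall time budget are choices made for this example and are not specified by the feature. The `request` callable stands in for whatever call reaches the attestation service, and treating `ConnectionError` and `TimeoutError` as the transient errors is likewise an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class AttestationTimeout(Exception):
    """Raised when no attempt succeeds within the attempt or time budget."""


def attest_with_retry(request: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      deadline: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> T:
    """Call `request` until it returns, retrying transient errors with backoff.

    `sleep` and `clock` are injectable so the function can be tested
    without real waiting.
    """
    start = clock()
    last_error = None
    for attempt in range(max_attempts):
        # Stop before starting a new attempt if the overall budget is spent.
        if clock() - start >= deadline:
            break
        try:
            return request()
        except (ConnectionError, TimeoutError) as err:
            # Assumed transient; any other exception propagates immediately.
            last_error = err
        if attempt < max_attempts - 1:
            remaining = deadline - (clock() - start)
            if remaining <= 0:
                break
            # Delay doubles on each retry, capped by the remaining budget.
            sleep(min(base_delay * (2 ** attempt), remaining))
    raise AttestationTimeout(
        f"attestation did not succeed within {max_attempts} attempts "
        f"or {deadline} seconds"
    ) from last_error
```

A definitive rejection by the attestation policy would not be retried in this sketch; only connectivity-style errors are, so a tampered node fails fast while a briefly unreachable service does not block node admission permanently.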

       

       

      Deployment considerations | List applicable specific needs (N/A = not applicable)
      Self-managed, managed, or both | Self-managed only; ARO support deferred to a future phase
      Classic (standalone cluster) | Yes - primary and only supported deployment model for Dev Preview
      Hosted control planes | Not supported; explicitly out of scope for Dev Preview
      Multi node, Compact (three node), or Single node (SNO), or all | Multi-node (4+ nodes) and Compact (3-node) supported; SNO explicitly not supported in Dev Preview
      Connected / Restricted Network | Both supported; a restricted network requires a customer-deployed Trustee instance; network requirements for attestation service connectivity are documented
      Architectures, e.g. x86_64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x) | x86_64 only (Azure AMD SEV-SNP confidential VMs); other architectures explicitly not supported
      Operator compatibility | Dependencies: Machine API, Machine Config Operator, Cluster Version Operator; OLM integration or alternative distribution mechanism for Dev Preview
      Backport needed (list applicable versions) | N/A - new capability targeting next OpenShift minor release (e.g., 4.X)
      UI need (e.g. OpenShift Console, dynamic plugin, OCM) | Basic CLI/API only for Dev Preview; OpenShift Console integration deferred to future phase
      Other (please specify) |
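
The deployment considerations above amount to a small support matrix, which could be checked before installation. The sketch below is illustrative only: `ClusterPlan` and its fields are hypothetical and do not correspond to any install-config schema; only the constraints themselves (Azure, x86_64, AMD SEV-SNP, three or more nodes, no hosted control planes, self-managed) come from this feature description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClusterPlan:
    """Hypothetical description of an intended deployment."""
    platform: str               # e.g. "azure"
    architecture: str           # e.g. "x86_64"
    tee: str                    # e.g. "sev-snp"
    node_count: int
    hosted_control_plane: bool
    managed_service: bool       # True for ARO, ROSA, or OSD


def unsupported_reasons(plan: ClusterPlan) -> List[str]:
    """Return one message per Dev Preview constraint the plan violates."""
    reasons = []
    if plan.platform != "azure":
        reasons.append("only Azure is supported")
    if plan.architecture != "x86_64":
        reasons.append("only x86_64 is supported")
    if plan.tee != "sev-snp":
        reasons.append("only AMD SEV-SNP is supported")
    if plan.node_count < 3:
        reasons.append("topology must be compact (3 nodes) or multi-node (4+ nodes)")
    if plan.hosted_control_plane:
        reasons.append("hosted control planes are out of scope")
    if plan.managed_service:
        reasons.append("managed services are out of scope")
    return reasons
```

An empty list means the plan fits the Dev Preview scope; returning all violations at once, rather than stopping at the first, gives a reviewer the full picture in a single pass.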

      Out of Scope

      Explicitly Not Supported in Developer Preview:

      • Other Cloud Providers: AWS and GCP support will be scoped during Phase IV
      • Other TEE Technologies: Intel TDX, ARM CCA, and other AMD technologies are not supported
      • Managed Services: ARO (Azure Red Hat OpenShift), ROSA, OSD integration
      • Hosted Control Planes: HyperShift integration
      • Single Node OpenShift: SNO deployment model
      • Production Support: No production SLA, best-effort support only
      • Upgrade Paths: No defined upgrade from Dev Preview to future releases
      • Advanced Observability: Console UI, Prometheus dashboards, alerts

      Background

      Phase I established the architecture and the upstream repository, and socialized the confidential cluster approach. Phase II delivers the first working implementation, focusing on a single cloud provider (Azure) and TEE technology (AMD SEV-SNP) to validate the architecture, gather early customer feedback, and de-risk the multi-cloud GA release.

      Customer Considerations

      Developer Preview Limitations Customers Must Accept:

      • No Production Use: Developer Preview is for testing and validation only; not supported for production workloads
      • Limited Cloud Support: Only Azure; customers with multi-cloud strategies must wait for future phases
      • Breaking Changes Possible: APIs, CRDs, and configurations may change between Dev Preview and GA
      • Best-Effort Support: No SLA; support is provided on a best-effort basis through community channels
      • No Upgrade Path: Clusters must be redeployed when the GA release ships; there is no in-place upgrade

              mak.redhat.com Marcos Entenza Garcia
              mak.redhat.com Marcos Entenza Garcia
              None
              Clement Verna, Nitesh Narayan Lal
              Timothée Ravier
              Yalan Zhang
              Avani Bhatt
              Kyle Walker