RFE-5711 (OpenShift Request For Enhancement)

Allow Hypershift Operator To Pull From ECR Credential Provider


      1. Proposed title of this feature request
      Allow Hypershift Operator To Pull From ECR Credential Provider

      2. What is the nature and description of the request?

      We propose that the Hypershift operator be given the ability to obtain ECR credentials in one of two ways: either by creating CredentialProviderRequest objects (see KEP-2133) and thereby using the existing ecr-credential-provider binary (possibly via the node Kubelet API), or by calling the EC2 IMDS directly via the AWS SDK (see ECR credential provider library).
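
      As a rough illustration of the first option, the sketch below models the JSON exchange defined by KEP-2133: the caller writes a CredentialProviderRequest to the plugin's stdin, and the plugin answers with a CredentialProviderResponse. It is not the ecr-credential-provider implementation. The registry host, image name, token value, and the cacheDuration of 6 hours are hypothetical placeholders, and the helper names are invented for this example.

```python
import base64
import json
from typing import Tuple

# API group/version used by the kubelet credential provider protocol (KEP-2133).
API_VERSION = "credentialprovider.kubelet.k8s.io/v1"


def build_request(image: str) -> str:
    """Serialize the request that is written to the plugin's stdin."""
    return json.dumps({
        "apiVersion": API_VERSION,
        "kind": "CredentialProviderRequest",
        "image": image,
    })


def decode_ecr_token(token_b64: str) -> Tuple[str, str]:
    """ECR authorization tokens are base64("AWS:<password>"); split them apart."""
    decoded = base64.b64decode(token_b64).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


def build_response(registry: str, token_b64: str,
                   cache_duration: str = "6h0m0s") -> str:
    """Serialize the response a plugin would print on stdout.

    cache_duration is an assumed value; it only needs to stay below the
    12-hour lifetime of an ECR authorization token.
    """
    username, password = decode_ecr_token(token_b64)
    return json.dumps({
        "apiVersion": API_VERSION,
        "kind": "CredentialProviderResponse",
        "cacheKeyType": "Registry",
        "cacheDuration": cache_duration,
        "auth": {registry: {"username": username, "password": password}},
    })


if __name__ == "__main__":
    # Hypothetical registry and token, for illustration only.
    registry = "123456789012.dkr.ecr.us-east-1.amazonaws.com"
    fake_token = base64.b64encode(b"AWS:example-password").decode("ascii")
    print(build_request(registry + "/example-release:latest"))
    print(build_response(registry, fake_token))
```

      Because the credentials are fetched on demand and cached by the caller, nothing in this exchange requires rewriting a stored pull secret when the token expires.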

      This will allow us to give Control and Data Plane nodes the ability to pull release images and metadata from our ECR container registry without updating the global pull secret every 12 hours (the lifetime of an ECR authorization token), while keeping all traffic off the public internet.

      3. Why does the customer need this? (List the business requirements here)

      A key deliverable for ROSA HCP this year is the ability to launch Hosted Control Plane clusters without requiring a public NAT or Internet Gateway, and without opening egress to public internet URLs in customer firewalls.

      The user story, from XCMSTRAT-422, is as follows:

      As a ROSA HCP cluster administrator (network admin, security admin, etc), I want to ensure that I can tightly control my cluster's network egress traffic and ensure that no cluster network egress traffic requires the public internet, so that I can continue to meet my corporate/compliance/security requirements.

      In our design specification, we have settled on a flow that uses Amazon Elastic Container Registry (ECR) as a mirror for quay.io release images, following the example set out for disconnected installations in the Hypershift Documentation.

      The basic flow proposed:

      • Customer creates service endpoints for AWS ECR, S3, and EC2 so that traffic to AWS APIs stays within the Amazon network
      • ROSA CLI and/or customer creates necessary account roles (see Classic ROSA Example)
      • Customer runs rosa create cluster --hosted-cp --no-egress
      • OCM API creates HostedCluster object with regional ICSP/IDMS pointing to release mirror, hosted on ECR (1:1 copy of quay.io/…./ocp-art)
      • Management Cluster uses existing IDMS/ICSP configuration to generate ignition configuration for initial worker nodepool to start up
      • Workers pull release images via the ECR mirror, bypassing any routes that touch the public internet
      • Authentication uses the worker nodes' EC2 IAM identity, which is granted the AmazonEC2ContainerRegistryReadOnly policy
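
      The mirror configuration in this flow could look roughly like the ImageDigestMirrorSet built below. This is a sketch under stated assumptions, not the configuration OCM generates: the AWS account ID, region, object name, and both repository paths are hypothetical placeholders, since the real quay.io source path is elided in this ticket.

```python
import json


def ecr_registry_host(account_id: str, region: str) -> str:
    """Private ECR registries are addressed as <account>.dkr.ecr.<region>.amazonaws.com."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def image_digest_mirror_set(name: str, source: str, mirror: str) -> dict:
    """Build an ImageDigestMirrorSet that redirects digest pulls of `source` to `mirror`."""
    return {
        "apiVersion": "config.openshift.io/v1",
        "kind": "ImageDigestMirrorSet",
        "metadata": {"name": name},
        "spec": {
            "imageDigestMirrors": [
                {"source": source, "mirrors": [mirror]},
            ],
        },
    }


# All values below are hypothetical placeholders.
host = ecr_registry_host("123456789012", "us-east-1")
idms = image_digest_mirror_set(
    name="example-release-mirror",
    source="quay.io/example-org/example-release",
    mirror=host + "/example-release",
)

if __name__ == "__main__":
    print(json.dumps(idms, indent=2))
```

      With a regional mirror per AWS region, only the region argument changes, so pulls resolve to the ECR endpoint reachable through the customer's service endpoints.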

      Without this RFE, the higher-level XCMSTRAT is blocked. Our NodePool update strategy is not InPlace, so refreshing the ECR pull secret every 12 hours would restart Node Pools roughly every 12 hours, which is not a desirable UX for customers or for SREP.
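
      To put a number on that churn, the short calculation below counts the rollouts a pull-secret-based approach would force. It is a simplified model that assumes every token refresh rewrites the pull secret and that each rewrite triggers exactly one Node Pool rollout; the function name is invented for this example.

```python
from datetime import timedelta

# ECR authorization tokens are valid for 12 hours.
ECR_TOKEN_LIFETIME = timedelta(hours=12)


def forced_rollouts(window: timedelta,
                    token_lifetime: timedelta = ECR_TOKEN_LIFETIME) -> int:
    """Count how many full token lifetimes (and thus rollouts) fit in the window."""
    return window // token_lifetime


if __name__ == "__main__":
    for days in (1, 7, 30):
        print(days, "day(s):", forced_rollouts(timedelta(days=days)), "rollouts")
```

      Under these assumptions a cluster would see 2 rollouts per day, or 60 over a 30-day month, purely to keep registry credentials fresh.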

      4. List any affected packages or components.

      • Hypershift Operator

              azaalouk Adel Zaalouk
              drow.openshift.srep Dustin Row