OpenShift Installer / CORS-3061

Provision Azure with CAPI (no mgmt cluster)

    • Type: Epic
    • Resolution: Unresolved
    • Azure CAPI Install
    • To Do
    • OCPSTRAT-914 - Remove Terraform from the Azure IPI installer
    • 37% To Do, 30% In Progress, 33% Done

      April 30: We have established proof-of-concept installs and are nearing 4.17 branching, with the possibility of Azure landing as Tech Preview. This is a good time to capture known outstanding work so we can plan toward a 4.17 GA.

      • Outbound access: the PoC uses an outbound load balancer to provide egress for control plane nodes. Keeping a third load balancer around solely for egress is unnecessary, so we should be able to handle this more efficiently.
      • Bootstrap SSH access: we need to add a public IP to the bootstrap node for SSH access. We will also need to delete the SSH security rule on bootstrap destroy. Captured in CORS-3302.
      • Machine provisioning is too slow. A 15-minute timeout was recently added to each CAPI provisioning period, and the Azure machines appear to be far exceeding it. We need to debug why machines take so long to provision.
      • Image creation needs to be made concurrent (or the need for it removed altogether). Uploading the image and creating resources currently takes ~11 minutes, which isn't terrible but is noticeably slow because it blocks other progress. We should try to reduce that delay.
      • Compute nodes use NAT gateways for outbound access. This may not be desirable from a cost perspective, and we should allow it to be disabled if possible. This may require upstream work and may be captured by CORS-3074.
      • We need to support multiple IPs per load balancer, à la ARO: RFE-4561.
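
      One way to picture the concurrency and timeout items above is the rough Go sketch below. It is not taken from the installer: the stage type, the runStages, simulate, and demo helpers, and the stage names are all hypothetical, and the sleeps stand in for real Azure calls. It only illustrates running independent steps side by side under a single deadline so that a slow image upload does not block unrelated work.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// stage is a hypothetical unit of provisioning work, such as an image upload.
type stage struct {
	name string
	run  func(ctx context.Context) error
}

// runStages starts every stage in its own goroutine, bounds the whole group
// with one timeout, and returns the joined errors of any stages that failed.
func runStages(parent context.Context, timeout time.Duration, stages []stage) error {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	errs := make([]error, len(stages)) // one slot per stage, so no locking is needed
	var wg sync.WaitGroup
	for i, s := range stages {
		wg.Add(1)
		go func(i int, s stage) {
			defer wg.Done()
			if err := s.run(ctx); err != nil {
				errs[i] = fmt.Errorf("%s: %w", s.name, err)
			}
		}(i, s)
	}
	wg.Wait()
	return errors.Join(errs...) // nil when every stage succeeded
}

// simulate returns a stage body that sleeps for d, or stops early if the
// context is cancelled or its deadline passes. It stands in for real work.
func simulate(d time.Duration) func(context.Context) error {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// demo runs two placeholder stages concurrently under a timeout in milliseconds.
func demo(timeoutMillis int) error {
	stages := []stage{
		{name: "upload-image", run: simulate(30 * time.Millisecond)},
		{name: "create-network", run: simulate(20 * time.Millisecond)},
	}
	return runStages(context.Background(), time.Duration(timeoutMillis)*time.Millisecond, stages)
}

func main() {
	fmt.Println("generous timeout:", demo(500))
	fmt.Println("tight timeout:", demo(1))
}
```

      With a generous timeout both placeholder stages finish and the total wait is roughly that of the slowest stage rather than the sum. With a tight timeout each stage reports a deadline error, which is the kind of signal that would help narrow down which provisioning period is overrunning.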

      OCP/Telco Definition of Done
      Epic Template descriptions and documentation.

      Epic Goal

      • Provision Azure infrastructure without the use of Terraform

      Why is this important?

      • Removing Terraform from Installer

      Scenarios

      1. The new provider should aim to provide the same results as the existing Azure Terraform provider.

      Acceptance Criteria

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • ...

      Dependencies (internal and external)

      1. ...

      Previous Work (Optional):

      Open questions:

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>

            jhixson_redhat John Hixson
            padillon Patrick Dillon
            Jinyun Ma Jinyun Ma