OpenShift Bugs / OCPBUGS-48436

4.16 installer fails Cluster API network validation when NSX port groups are in use


    • Important

      Description of problem:

      The 4.16 installer fails Cluster API network validation when NSX port groups are in use, with the following error:
      
      FATAL failed to fetch Cluster API Machine Manifests: failed to generate asset "Cluster API Machine Manifests": unable to retrieve network names: network '/<datacenter>/host/<compute_cluster>' not found
      
      The above is seen in a 4.16.27 deployment where the `.platform.vsphere.failureDomains[].topology.networks` key is empty.
      - NOTE: a similar error is seen when `.platform.vsphere.failureDomains[].topology.networks` points to an NSX port group *name*, for example:
      
      $ openshift-install create manifests
      WARNING imageContentSources is deprecated, please use ImageDigestSources
      INFO Consuming Install Config from target directory
      I0114 13:11:32.100352  558355 session.go:236] "Created and cached vSphere client session" server="pn014vvcsw12f.iaas.bd.rijksweb.nl" datacenter="" username="chpuser@vsphere.local"
      FATAL failed to fetch Cluster API Machine Manifests: failed to generate asset "Cluster API Machine Manifests": unable to retrieve network names: network '/IAAS/host/cl028s/<insert networkname>' not found
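
The two error messages differ only in the trailing path element, which suggests the lookup path is the compute cluster path joined with the configured network name. The sketch below illustrates that reading. `networkLookupPath` is a hypothetical helper written for this report, not the installer's actual code, and the network name `dvpg-1` is borrowed from the govmomi comment quoted under "Additional info".

```go
package main

import (
	"fmt"
	"path"
)

// networkLookupPath is a hypothetical helper (an assumption, not installer code).
// It joins the compute cluster inventory path with a network name, which would
// explain the paths reported in the two FATAL messages.
func networkLookupPath(computeCluster, network string) string {
	// path.Join drops empty elements, so an empty network name
	// leaves only the cluster path.
	return path.Join(computeCluster, network)
}

func main() {
	// Empty networks key: the path degenerates to the cluster itself.
	fmt.Println(networkLookupPath("/IAAS/host/cl028s", ""))
	// Network given by name: the name is appended to the cluster path.
	fmt.Println(networkLookupPath("/IAAS/host/cl028s", "dvpg-1"))
}
```

If this assumption holds, an empty `networks` value would make the installer look up the cluster path as if it were a network, and a plain name would yield a cluster path lookup, which is the form that can be ambiguous with NSX.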

      Version-Release number of selected component (if applicable):

      4.16.27    

      How reproducible:

      Always

      Actual results:

      Port group validation fails when the installer creates the manifests (`openshift-install create manifests`).

      Expected results:

      Port group validation succeeds.

      Additional info:

      The same configuration works in a 4.14 environment. With the 4.16.27 installer, when the cluster deployment relies on NSX port groups, we seem to be hitting the issue described in the following govmomi source comment [0]:
      
      
      // With NSX, Portgroups can have the same name, even within the same Switch. In this case, using an inventory path
      // results in a MultipleFoundError. A MOID, switch UUID or segment ID can be used instead, as both are unique.
      // See also: https://kb.vmware.com/s/article/79872#Duplicate_names
      // Examples:
      // - Name:                "dvpg-1"
      // - Inventory Path:      "vds-1/dvpg-1"
      // - Cluster Path:        "/dc-1/host/cluster-1/dvpg-1"
      // - ManagedObject ID:    "DistributedVirtualPortgroup:dvportgroup-53"
      // - Logical Switch UUID: "da2a59b8-2450-4cb2-b5cc-79c4c1d2144c"
      // - Segment ID:          "/infra/segments/vnet_ce50e69b-1784-4a14-9206-ffd7f1f146f7"
      
      [0] https://github.com/vmware/govmomi/blob/6ac4eabbbd6deb6bbad6b2846944c5343c31ca56/find/finder.go#L812-L836
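
To make the distinction in the comment above concrete, here is a rough Go sketch that sorts a network reference string into the forms the comment lists. `classifyNetworkRef` and the returned labels are hypothetical names made up for this illustration. The matching rules are simplified guesses based only on the six examples in the comment, and are not govmomi's actual lookup logic.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// uuidRe matches the 8-4-4-4-12 hex layout of the logical switch UUID example.
var uuidRe = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// moidPrefixes lists vSphere managed object types that can back a network.
var moidPrefixes = []string{"DistributedVirtualPortgroup:", "OpaqueNetwork:", "Network:"}

// classifyNetworkRef is a hypothetical helper: it labels a reference by form.
// The rules are assumptions derived from the examples in the quoted comment.
func classifyNetworkRef(ref string) string {
	for _, p := range moidPrefixes {
		if strings.HasPrefix(ref, p) {
			return "managed-object-id"
		}
	}
	switch {
	case strings.HasPrefix(ref, "/infra/segments/"):
		return "segment-id"
	case uuidRe.MatchString(ref):
		return "logical-switch-uuid"
	case strings.HasPrefix(ref, "/"):
		return "absolute-path" // for example a cluster path
	case strings.Contains(ref, "/"):
		return "inventory-path"
	default:
		return "name"
	}
}

// uniqueUnderNSX reports whether the comment describes this form as unique.
func uniqueUnderNSX(kind string) bool {
	switch kind {
	case "managed-object-id", "logical-switch-uuid", "segment-id":
		return true
	}
	return false
}

func main() {
	refs := []string{
		"dvpg-1",
		"vds-1/dvpg-1",
		"/dc-1/host/cluster-1/dvpg-1",
		"DistributedVirtualPortgroup:dvportgroup-53",
		"da2a59b8-2450-4cb2-b5cc-79c4c1d2144c",
		"/infra/segments/vnet_ce50e69b-1784-4a14-9206-ffd7f1f146f7",
	}
	for _, r := range refs {
		kind := classifyNetworkRef(r)
		fmt.Printf("%-62s %-20s unique=%v\n", r, kind, uniqueUnderNSX(kind))
	}
}
```

Only the last three forms are described as unique in the comment, so a name or path based reference is the kind that can resolve to more than one port group under NSX.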

       

       

       

              jcallen@redhat.com Joseph Callen
              rhn-support-rsandu Robert Sandu
              Shang Gao Shang Gao