Migration Toolkit for Virtualization / MTV-1861

DNS resolution is not working if a custom transfer network is used for migration


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Affects Version/s: 2.7.7
    • Component/s: Controller

      Description of problem:

      Created a NetworkAttachmentDefinition (NAD) for the transfer network using the bridge CNI plugin:

      apiVersion: "k8s.cni.cncf.io/v1"
      kind: NetworkAttachmentDefinition
      metadata:
        name: mtv-transfer-bridge-network-with-dns
        annotations:
          k8s.v1.cni.cncf.io/resourceName: bridge.network.kubevirt.io/br5
      spec:
        config: |
          {
            "cniVersion": "0.3.1",
            "name": "bridge-network",
            "type": "bridge",
            "bridge": "br5",
            "ipam": {
              "type": "whereabouts",
              "range": "192.168.222.100-192.168.222.150/24",
              "gateway": "192.168.222.1",
              "routes": [
                 {
                   "dst":"0.0.0.0/0"
                 }
              ],
              "dns":{
               "nameservers":[
                  "192.168.222.1"
               ]
               }
            }
          }
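As a quick sanity check that the embedded spec.config is well-formed JSON and actually carries the custom nameserver, it can be parsed directly (a throwaway snippet, not part of MTV; the config string is copied verbatim from the NAD above):

```python
import json

# CNI config embedded in the NAD's spec.config field above.
cni_config = """
{
  "cniVersion": "0.3.1",
  "name": "bridge-network",
  "type": "bridge",
  "bridge": "br5",
  "ipam": {
    "type": "whereabouts",
    "range": "192.168.222.100-192.168.222.150/24",
    "gateway": "192.168.222.1",
    "routes": [{"dst": "0.0.0.0/0"}],
    "dns": {"nameservers": ["192.168.222.1"]}
  }
}
"""

config = json.loads(cni_config)
nameservers = config["ipam"]["dns"]["nameservers"]
print(nameservers)  # the custom DNS server the CNI plugin should return
```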

      After creating the plan, I selected this NAD as the transfer network. The migration failed with the following error while trying to resolve the vCenter FQDN:

      # oc logs nijin-vm-vm-43263-w4v7q
      Defaulted container "virt-v2v" out of: virt-v2v, vddk-side-car (init)
      exec: /usr/bin/virt-v2v -v -x -o kubevirt -os /var/tmp/v2v -i libvirt -ic vpx://admin%40gsslab.pnq@vcenter.vmware.test.redhat.com/OpenShift-DC/host/OCP/esxi-2.vmware.test.redhat.com?no_verify=1 -ip /etc/secret/secretKey --root first -it vddk -io vddk-libdir=/opt/vmware-vix-disklib-distrib -io vddk-thumbprint=8C:90:CF:E7:89:16:1D:91:AE:3B:73:DA:7F:A7:61:56:39:AC:6D:15 -- nijin-cloud-init
      virt-v2v monitoring: Setting up prometheus endpoint :2112/metrics
      virt-v2v monitoring: Prometheus progress counter registered.
      info: virt-v2v: virt-v2v 2.5.6rhel=9,release=7.el9_5 (x86_64)
      info: libvirt version: 10.5.0
      check_host_free_space: large_tmpdir=/var/tmp free_space=19280736256
      [   0.0] Setting up the source: -i libvirt -ic vpx://admin%40gsslab.pnq@vcenter.vmware.test.redhat.com/OpenShift-DC/host/OCP/esxi-2.vmware.test.redhat.com?no_verify=1 -it vddk nijin-cloud-init
      virt-v2v: error: exception: libvirt: VIR_ERR_INTERNAL_ERROR: VIR_FROM_ESX: internal error: IP address lookup for host 'vcenter.vmware.test.redhat.com' failed: Name or service not known
      rm -rf -- '/tmp/v2v.Sb4JMY'
      Error executing v2v command: exit status 1
      Failed to execute virt-v2v command: exit status 1

      The issue is that the pod has its DNS server configured as the OpenShift DNS service IP "172.30.0.10". Since the pod has the annotation "v1.multus-cni.io/default-network: nijin-cnv/mtv-transfer-bridge-network-with-dns", the default pod network is not attached to the pod, so it cannot reach that DNS server.
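For illustration, the conversion pod carries an annotation of the following shape (a hypothetical fragment reconstructed from the annotation value above, not the actual pod manifest), which makes Multus use the NAD as the pod's primary network instead of the default cluster network:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nijin-vm-vm-43263-w4v7q
  annotations:
    # Replaces the default cluster network entirely: the pod gets only
    # the bridge interface, while the kubelet still writes the cluster
    # DNS service IP 172.30.0.10 into /etc/resolv.conf - an address
    # that is now unreachable from the pod.
    v1.multus-cni.io/default-network: nijin-cnv/mtv-transfer-bridge-network-with-dns
```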

      # oc rsh nijin-vm-vm-43263-w4v7q
      Defaulted container "virt-v2v" out of: virt-v2v, vddk-side-car (init)
      
      sh-5.1$ ip r show
      default via 192.168.222.1 dev eth0
      192.168.222.0/24 dev eth0 proto kernel scope link src 192.168.222.100
      
      sh-5.1$ cat /etc/resolv.conf
      search openshift-mtv.svc.cluster.local svc.cluster.local cluster.local deneb.tt.testing
      nameserver 172.30.0.10
      options ndots:5

      Although I have a custom DNS server on the same network configured in the NAD, it looks like CRI-O does not honour the DNS settings passed back by the CNI plugin. I tried this with multiple CNI plugins, and the pod's DNS was always the internal 172.30.0.10. I can also confirm that the CNI plugin is returning the nameserver values. https://github.com/cri-o/cri-o/issues/1204 is a very old discussion around this, but I cannot see anything after that.
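One possible mitigation, sketched below under the assumption that MTV could set it on the conversion pod (the plan does not currently expose this), would be to override the pod's DNS settings through the standard Kubernetes `dnsPolicy`/`dnsConfig` fields instead of relying on the DNS values in the CNI result:

```yaml
spec:
  # "None" ignores cluster DNS entirely; dnsConfig.nameservers is then required.
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - 192.168.222.1   # custom DNS server reachable on the transfer network
```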

      Version-Release number of selected component (if applicable):

      Migration Toolkit for Virtualization Operator   2.7.7    

      How reproducible:

      100%    

      Steps to Reproduce:

      1. Create a NAD for the transfer network. Ensure that the provider is added with an FQDN.
      2. Create a plan and select this NAD as the transfer network.
      3. Start the migration. The migration fails with the above error.

              Assignee: Unassigned
              Reporter: Nijin Ashok (rhn-support-nashok)