OpenShift Virtualization / CNV-15550

[2036027] CNV 4.9.1 | VM deployments are failing due to webhook context deadline timeout


    • CNV Virtualization Sprint 231, CNV Virtualization Sprint 232, CNV Virtualization Sprint 239, CNV Virtualization Sprint 240, CNV Virtualization Sprint 241
    • High
    • No

      Some background:
      -------------------------
      I'm running a scale setup with 84 nodes, using OCS ceph-rbd as the backend storage.
      I'm attempting to deploy a large number of VMs, but I noticed that some VMs are missing.
      This issue is a real problem for me because it breaks my measurements and prevents VM deployment.
      Looking at the virt-controller logs, we can see the following messages:

      {"component":"virt-controller","level":"info","msg":"re-enqueuing VirtualMachine default/master-0-win10-vm0075","pos":"vm.go:175","reason":"Internal error occurred: failed calling webhook \"virtualmachine-validator.kubevirt.io\": Post \"https://virt-api.openshift-cnv.svc:443/virtualmachines-validate?timeout=10s\": context deadline exceeded","timestamp":"2021-12-29T10:26:12.657482Z"}
      {"component":"virt-controller","kind":"","level":"error","msg":"Updating api version annotations failed","name":"master-0-win10-vm0043","namespace":"default","pos":"vm.go:209","reason":"Internal error occurred: failed calling webhook \"virtualmachine-validator.kubevirt.io\": Post \"https://virt-api.openshift-cnv.svc:443/virtualmachines-validate?timeout=10s\": context deadline exceeded","timestamp":"2021-12-29T10:26:17.599926Z","uid":"ace78d0a-482c-43f8-bb87-c2d16450467b"}
      {"component":"virt-controller","level":"info","msg":"re-enqueuing VirtualMachine default/master-0-win10-vm0043","pos":"vm.go:175","reason":"Internal error occurred: failed calling webhook \"virtualmachine-validator.kubevirt.io\": Post \"https://virt-api.openshift-cnv.svc:443/virtualmachines-validate?timeout=10s\": context deadline exceeded","timestamp":"2021-12-29T10:26:17.599989Z"}
      {"component":"virt-controller","kind":"","level":"error","msg":"Updating api version annotations failed","name":"master-0-win10-vm0002","namespace":"default","pos":"vm.go:209","reason":"Internal error occurred: failed calling webhook \"virtualmachine-validator.kubevirt.io\": Post \"https://virt-api.openshift-cnv.svc:443/virtualmachines-validate?timeout=10s\": context deadline exceeded","timestamp":"2021-12-29T10:26:18.298490Z","uid":"467ab09a-e3d0-4a09-8814-52d0ea91dd13"}
      {"component":"virt-controller","level":"info","msg":"re-enqueuing VirtualMachine default/master-0-win10-vm0002","pos":"vm.go:175","reason":"Internal error occurred: failed calling webhook \"virtualmachine-validator.kubevirt.io\": Post \"https://virt-api.openshift-cnv.svc:443/virtualmachines-validate?timeout=10s\": context deadline exceeded","timestamp":"2021-12-29T10:26:18.298552Z"}
      {"component":"virt-controller","kind":"","level":"error","msg":"Updating api version annotations failed","name":"master-0-win10-vm0007","namespace":"default","pos":"vm.go:209","reason":"Internal error occurred: failed calling webhook \"virtualmachine-validator.kubevirt.io\": Post \"https://virt-api.openshift-cnv.svc:443/virtualmachines-validate?timeout=10s\": context deadline exceeded","timestamp":"2021-12-29T10:26:22.670068Z","uid":"6640416a-bdad-450e-b11f-38508a3af158"}

      [root@e26-h01-000-r640 ~]# oc logs virt-controller-655db5c9cf-rdqfg|grep "Internal error"|wc -l
      148
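
The raw count above does not show which VMs are affected or how often each one is retried. A minimal sketch of a per-VM breakdown follows, assuming the virt-controller output consists of JSON objects shaped like the entries quoted above. The helper names `iter_entries` and `count_webhook_failures` are hypothetical and not part of any CNV tooling, and the sample is two entries copied from this report.

```python
import json
from collections import Counter

# Substring that marks the webhook failures quoted in this report.
WEBHOOK_ERROR = "failed calling webhook"

# Two entries copied verbatim from the virt-controller log above.
SAMPLE = r'''{"component":"virt-controller","kind":"","level":"error","msg":"Updating api version annotations failed","name":"master-0-win10-vm0043","namespace":"default","pos":"vm.go:209","reason":"Internal error occurred: failed calling webhook \"virtualmachine-validator.kubevirt.io\": Post \"https://virt-api.openshift-cnv.svc:443/virtualmachines-validate?timeout=10s\": context deadline exceeded","timestamp":"2021-12-29T10:26:17.599926Z","uid":"ace78d0a-482c-43f8-bb87-c2d16450467b"} {"component":"virt-controller","level":"info","msg":"re-enqueuing VirtualMachine default/master-0-win10-vm0043","pos":"vm.go:175","reason":"Internal error occurred: failed calling webhook \"virtualmachine-validator.kubevirt.io\": Post \"https://virt-api.openshift-cnv.svc:443/virtualmachines-validate?timeout=10s\": context deadline exceeded","timestamp":"2021-12-29T10:26:17.599989Z"}'''


def iter_entries(text):
    """Yield every JSON object in text, even when several share one line."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        try:
            obj, pos = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Not JSON (e.g. a shell prompt): skip to the next line.
            newline = text.find("\n", pos)
            if newline == -1:
                break
            pos = newline + 1
            continue
        if isinstance(obj, dict):
            yield obj


def count_webhook_failures(text):
    """Count webhook-failure log entries per namespace/name."""
    counts = Counter()
    for entry in iter_entries(text):
        if WEBHOOK_ERROR not in entry.get("reason", ""):
            continue
        if "name" in entry:
            # Error entries carry separate namespace and name fields.
            vm = f'{entry.get("namespace", "")}/{entry["name"]}'
        else:
            # Info entries name the VM as the last word of msg.
            vm = entry.get("msg", "").rsplit(" ", 1)[-1]
        counts[vm] += 1
    return counts


if __name__ == "__main__":
    for vm, n in count_webhook_failures(SAMPLE).most_common():
        print(n, vm)
```

To run it against a live cluster, the output of `oc logs` for the virt-controller pod could be saved to a file and its contents passed to `count_webhook_failures` in place of `SAMPLE`.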

      Another thing I have to mention is that we never reached the 10s timeout: in most cases we get the "deadline exceeded" error almost immediately after submitting the deployment request (via YAML).

      Versions of all relevant components:
      ===================================
      CNV 4.9.1
      OCS 4.9.0
      LSO 4.9.0-202111151318
      OCP 4.9.12

      must-gather:
      ============
      http://perf148h.perf.lab.eng.bos.redhat.com/share/BZ_logs/cnv_must_gather_failed_calling_webhook.tar.gz

              ibezukh Igor Bezukh
              bbenshab Boaz Ben Shabat
              Guy Chen Guy Chen