  1. OpenShift Virtualization
  2. CNV-22161

[2139234] virt pods not in Ready state after setting tlsSecurityProfile `Modern` in HCO


    • NEXT UpstreamCI Platfrm Sprint
    • Moderate
    • No

      Description of problem:
      After setting the Modern TLS security profile in HCO, all virt system pods go into a non-ready state:

      > virt-api-847b79c85b-gmd6p 0/1 Running 0 37m
      > virt-api-847b79c85b-v6fhd 0/1 Running 0 37m
      > virt-controller-64f9448b66-pwvcz 0/1 Running 6 (24s ago) 9h
      > virt-controller-64f9448b66-sw48x 0/1 Running 6 (25s ago) 9h
      > virt-exportproxy-744f8cc64b-7czbl 0/1 Running 0 9h
      > virt-exportproxy-744f8cc64b-tvf65 0/1 Running 0 9h
      > virt-handler-2flcz 0/1 Running 6 (35s ago) 4d15h
      > virt-handler-7ks9d 0/1 Running 7 (11s ago) 4d16h
      > virt-handler-jcfr2 0/1 Running 6 (39s ago) 4d16h
      > virt-operator-747b5f58f8-7gv78 0/1 Running 0 9h
      > virt-operator-747b5f58f8-r2929 0/1 Running 0 9

      There are many error messages in pod logs:

      > {"component":"virt-api","level":"info","msg":"http: TLS handshake error from 10.129.2.2:57388: tls: client offered only unsupported versions: [303]\n","pos":"server.go:3197","timestamp":"2022-11-01T22:02:14.590402Z"}

      I see similar errors in other CNV components (SSP, CDI, and HCO itself).
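
      For triage, the `[303]` in the virt-api message can be decoded. The sketch below assumes the list is printed as hexadecimal TLS protocol version codes, which is how Go's crypto/tls formats this error. The helper name `offered_versions` is made up for illustration. Under that assumption, the client offered only TLS 1.2, while the Modern profile requires TLS 1.3 as its minimum version.

```python
import re

# Two-byte TLS protocol version codes as they appear on the wire.
TLS_VERSIONS = {
    0x0300: "SSL 3.0",
    0x0301: "TLS 1.0",
    0x0302: "TLS 1.1",
    0x0303: "TLS 1.2",
    0x0304: "TLS 1.3",
}


def offered_versions(log_msg: str) -> list:
    """Return the protocol names listed in an 'unsupported versions' error.

    Assumes the bracketed list holds space-separated hex codes.
    """
    match = re.search(r"unsupported versions: \[([0-9a-fA-F ]*)\]", log_msg)
    if match is None:
        return []
    return [
        TLS_VERSIONS.get(int(code, 16), "unknown (0x" + code + ")")
        for code in match.group(1).split()
    ]


msg = (
    "http: TLS handshake error from 10.129.2.2:57388: "
    "tls: client offered only unsupported versions: [303]"
)
print(offered_versions(msg))  # ['TLS 1.2']
```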

      Reverting the TLS configuration can also be tricky, because the webhooks are unavailable as well:

      > $ oc edit hco
      > error: hyperconvergeds.hco.kubevirt.io "kubevirt-hyperconverged" could not be patched: Internal error occurred: failed calling webhook "validate-hco.kubevirt.io": failed to call webhook: Post "https://hco-webhook-service.openshift-cnv.svc:4343/validate-hco-kubevirt-io-v1beta1-hyperconverged?timeout=10s": remote error: tls: protocol version not supported
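
      To check which protocol versions an endpoint such as the webhook service accepts, a client can be pinned to a single TLS version. This is a minimal sketch using Python's standard `ssl` module. The function names `context_for` and `accepts` are hypothetical, certificate verification is disabled because only the handshake version is of interest, and the probe would have to run from somewhere that can reach the service address.

```python
import socket
import ssl


def context_for(version: ssl.TLSVersion) -> ssl.SSLContext:
    """Build a client context that offers exactly one TLS version."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Diagnostic use only: skip certificate checks, we only test the handshake.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def accepts(host: str, port: int, version: ssl.TLSVersion, timeout: float = 5.0) -> bool:
    """Return True if the server completes a handshake at the given version."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context_for(version).wrap_socket(sock, server_hostname=host):
                return True
    except OSError:  # ssl.SSLError is a subclass of OSError
        return False


# Example call (host and port taken from the webhook URL in the error above):
# accepts("hco-webhook-service.openshift-cnv.svc", 4343, ssl.TLSVersion.TLSv1_2)
# accepts("hco-webhook-service.openshift-cnv.svc", 4343, ssl.TLSVersion.TLSv1_3)
```

      If the TLS 1.3 probe succeeds and the TLS 1.2 probe fails, the server side is enforcing the Modern profile and the failing party is a client that cannot offer TLS 1.3.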

      Version-Release number of selected component (if applicable):
      4.12

      How reproducible:

      Steps to Reproduce:
      1. Edit HCO and set tlsSecurityProfile to 'Modern'.
      2. Check the pod status and logs.
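
      Step 1 can also be done non-interactively with a merge patch. The sketch below builds the patch body and prints an `oc patch` command. It assumes the profile lives at `spec.tlsSecurityProfile` and follows the OpenShift `type` plus per-profile key layout. The resource name and namespace are taken from the error output in this report, and the helper names are hypothetical. Passing `None` produces a patch that clears the field, although on an affected cluster that request may be rejected by the same failing webhook.

```python
import json
import shlex

PREDEFINED_PROFILES = ("Old", "Intermediate", "Modern")


def tls_profile_patch(profile_type):
    """Build a JSON merge patch that sets (or, with None, clears) the TLS profile."""
    if profile_type is None:
        profile = None  # JSON null removes the field in a merge patch
    elif profile_type in PREDEFINED_PROFILES:
        # Assumed layout: {"type": "Modern", "modern": {}}
        profile = {"type": profile_type, profile_type.lower(): {}}
    else:
        raise ValueError("unsupported profile type: " + repr(profile_type))
    return json.dumps({"spec": {"tlsSecurityProfile": profile}})


def oc_patch_command(patch: str) -> str:
    """Return a shell-quoted oc command that applies the patch to the HCO CR."""
    return (
        "oc patch hco kubevirt-hyperconverged -n openshift-cnv "
        "--type=merge -p " + shlex.quote(patch)
    )


print(oc_patch_command(tls_profile_patch("Modern")))
```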

      Actual results:
      All CNV components log 'TLS handshake error' messages and their pods are not ready.

      Expected results:
      CNV components remain active and accessible with the Modern profile.

      Additional info:
      After some investigation, it looks like the issue happens only on clusters with FIPS mode enabled.

              sgott@redhat.com Stuart Gott
              dshchedr@redhat.com Denys Shchedrivyi
              Kedar Bidarkar Kedar Bidarkar