OpenShift Bugs / OCPBUGS-28735

Multus doesn't refresh certificates after node was suspended for 30 days


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Undefined
    • None
    • 4.14.z, 4.15.0, 4.16.0
    • Networking / multus
    • None
    • No
    • False

      Description of problem:

      Multus does not issue a CSR to obtain new certificates after a node has been suspended for 30 days.

      Version-Release number of selected component (if applicable):

          

      How reproducible:

          

      Steps to Reproduce:

          1. Set up a libvirt cluster on a host machine.
          2. Disable chronyd on all nodes and on the host machine.
          3. Suspend the nodes.
          4. Move the host clock 30 days forward.
          5. Resume the nodes.
          6. Wait for the API server to come up.
          7. Wait for all operators to become ready.
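
Steps 2 to 5 above could be scripted roughly as follows. This is only a sketch and not the procedure used by the reporter or the CI job. It assumes that "suspend" and "resume" map to `virsh suspend` and `virsh resume`, that the libvirt domain names also resolve as SSH hostnames, that the nodes accept SSH as the `core` user, and that the host has GNU `date`, which accepts a relative offset. The function name `build_repro_commands` and the domain name `master-0` are hypothetical. The function only builds the command lines and executes nothing.

```python
import shlex
from typing import List


def build_repro_commands(
    domains: List[str], days: int = 30, ssh_user: str = "core"
) -> List[List[str]]:
    """Return commands for steps 2-5 as argv lists. Nothing is executed."""
    if days <= 0:
        raise ValueError("days must be positive")

    cmds: List[List[str]] = []

    # Step 2: disable time synchronisation on every node, then on the host.
    # Assumption: each libvirt domain name is also a reachable SSH hostname.
    for domain in domains:
        cmds.append(
            ["ssh", f"{ssh_user}@{domain}",
             "sudo", "systemctl", "disable", "--now", "chronyd"]
        )
    cmds.append(["sudo", "systemctl", "disable", "--now", "chronyd"])

    # Step 3: suspend the nodes (assumed to mean `virsh suspend`).
    for domain in domains:
        cmds.append(["virsh", "suspend", domain])

    # Step 4: move the host clock forward (GNU date relative syntax).
    cmds.append(["sudo", "date", "--set", f"+{days} days"])

    # Step 5: resume the nodes.
    for domain in domains:
        cmds.append(["virsh", "resume", domain])

    return cmds


if __name__ == "__main__":
    # Dry run: print the commands for review instead of running them.
    for argv in build_repro_commands(["master-0"]):
        print(shlex.join(argv))
```

Steps 6 and 7 (waiting for the API server and the operators) are left out because the report does not say how readiness was checked.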
          

      Actual results:

      Multus attempts to use the expired certificates:
      
      2024-01-21T01:24:15.456299440+00:00 stderr F 2024-01-21T01:24:15Z [verbose] DEL finished CNI request ContainerID:"f01434ff66b5571923e23aa1696bca1bc4b63b5e89d9b84bb4965c8d599a9dc9" Netns:"/var/run/netns/313a63fa-7765-4f9e-b330-643c8c3e08d2" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=openshift-machine-config-operator;K8S_POD_NAME=kubelet-bootstrap-cred-manager-msgls;K8S_POD_INFRA_CONTAINER_ID=f01434ff66b5571923e23aa1696bca1bc4b63b5e89d9b84bb4965c8d599a9dc9;K8S_POD_UID=3133b172-dd21-4d05-9662-22c0841c9821" Path:"", result: "", err: <nil>
      2024-04-20T01:25:33.997542623+00:00 stderr F E0420 01:25:33.995883    7683 reflector.go:148] k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Pod: the server has asked for the client to provide credentials (get pods)
      

      Expected results:

      Multus detects that its certificate has expired, requests a new one through the CSR flow, and reloads it.
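
The expected behaviour can be illustrated with a small model. This is a sketch of the logic only and not Multus code. The `Cert` type, the `ensure_valid_cert` function, the two callbacks, and the 24-hour certificate lifetime are all hypothetical. The point it shows is that validity is checked against the current wall-clock time at the moment of use, so a 30-day clock jump is noticed on the first check after resume.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable


@dataclass(frozen=True)
class Cert:
    """Minimal stand-in for a client certificate's validity window."""
    not_before: datetime
    not_after: datetime


def is_valid(cert: Cert, now: datetime) -> bool:
    """True if `now` falls inside the certificate's validity window."""
    return cert.not_before <= now <= cert.not_after


def ensure_valid_cert(
    current: Cert,
    now: datetime,
    request_via_csr: Callable[[datetime], Cert],  # hypothetical CSR flow
    reload: Callable[[Cert], None],               # hypothetical reload hook
) -> Cert:
    """Return a certificate valid at `now`, rotating it if it has expired."""
    if is_valid(current, now):
        return current
    fresh = request_via_csr(now)
    if not is_valid(fresh, now):
        raise RuntimeError("CSR flow returned a certificate that is not valid now")
    reload(fresh)
    return fresh


if __name__ == "__main__":
    issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
    cert = Cert(issued, issued + timedelta(days=1))  # hypothetical lifetime
    after_resume = issued + timedelta(days=30)       # clock jumped 30 days

    reloaded = []
    result = ensure_valid_cert(
        cert,
        after_resume,
        lambda now: Cert(now, now + timedelta(days=1)),
        reloaded.append,
    )
    print("rotated:", result is not cert, "reloads:", len(reloaded))
```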
      

      Additional info:

      Periodic CI job that checks this flow: https://prow.ci.openshift.org/job-history/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.16-e2e-metal-ovn-sno-cert-rotation-suspend-30d
      The job artifacts contain a sosreport.
      
      Applies to both SNO and HA clusters. Certificate rotation works as expected when the nodes are properly shut down instead of suspended.

            tohayash@redhat.com Tomofumi Hayashi
            vrutkovs@redhat.com Vadim Rutkovsky
            Ke Wang
            Votes: 0
            Watchers: 6
