OCPBUGS-1360

openshift-controller-manager is not coming up and there are no logs to indicate why


      Description of problem:

      This problem was found during the triage of the following tickets:
      
      https://issues.redhat.com/browse/AITRIAGE-4003
      https://issues.redhat.com/browse/AITRIAGE-3996
      
      The two tickets have two things in common: they come from the same customer, and both clusters were installed with the following features:
      
      Configured features: OVN network type, Proxy, Requested hostname, SNO
      
      In both cases, the cluster operator "openshift-controller-manager" fails to come up, and querying the pod logs returns no content.
      
      I have given some more details in the "reproduction" section below.
      
      

      Software version: OpenShift 4.11

      Please see the description above. I am unsure of the exact triggers of this issue, as we don't have enough data to say with certainty.
      
      


      Actual results:

      The cluster operator "openshift-controller-manager" fails to come up, and querying the pod logs returns no content.

      Looking at the pod:

      omg get pods -n openshift-controller-manager
      NAME                      READY  STATUS   RESTARTS  AGE
      controller-manager-fwkz7  0/1    Pending  0         1h28m
      

      The YAML of the pod contains the following error:

      containerStatuses:
        - image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:fead5da1351abee9e77f5c46247c7947d6f9cf76b5bf7a8b4905222fe2564665
          imageID: ''
          lastState: {}
          name: controller-manager
          ready: false
          restartCount: 0
          started: false
          state:
            terminated:
              exitCode: 137
              finishedAt: null
              message: The container could not be located when the pod was terminated
              reason: ContainerStatusUnknown
              startedAt: null
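
To help read the status above, here is a minimal Python sketch that decodes the exit code. It relies on the common convention that an exit code above 128 means 128 plus a signal number, so 137 corresponds to signal 9 (SIGKILL). The helper names `decode_exit_code` and `summarize_terminated` are hypothetical, and the dictionary is hand-copied from the YAML above rather than read from the must-gather. Note that in this report the reason is ContainerStatusUnknown and startedAt is null, so the exit code may be a value recorded when the container could not be located, not proof that a running process was killed.

```python
import signal


def decode_exit_code(exit_code: int) -> str:
    """Describe a container exit code (hypothetical helper).

    Codes above 128 conventionally mean 128 + signal number.
    """
    if exit_code > 128:
        signum = exit_code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The signal number has no name on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    if exit_code == 0:
        return "exited normally"
    return f"exited with error code {exit_code}"


def summarize_terminated(status: dict) -> str:
    """Summarize one containerStatuses entry (hypothetical helper)."""
    terminated = status.get("state", {}).get("terminated")
    if terminated is None:
        return f"{status['name']}: not terminated"
    reason = terminated.get("reason")
    decoded = decode_exit_code(terminated["exitCode"])
    return f"{status['name']}: {reason} ({decoded})"


# Values copied by hand from the pod YAML in this report.
status = {
    "name": "controller-manager",
    "ready": False,
    "restartCount": 0,
    "started": False,
    "state": {
        "terminated": {
            "exitCode": 137,
            "finishedAt": None,
            "message": "The container could not be located when the pod was terminated",
            "reason": "ContainerStatusUnknown",
            "startedAt": None,
        }
    },
}

print(summarize_terminated(status))
```

On Linux this prints a one-line summary naming the container, the reason, and SIGKILL as the decoded signal. The null startedAt and restartCount of 0 are consistent with there being no container logs to retrieve.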
      

      An attempt to retrieve the pod logs shows that no logs are available for the pod, so it is hard to determine why the pod does not stay up.

      Expected results:

      The cluster operator "openshift-controller-manager" should come up and stay up.
      
      

      Additional info: I have attached the must-gather archives from the AITRIAGE tickets so that further investigation can take place.

        1. AITRIAGE-4003-must-gather.tar.gz
          10.17 MB
          Paul Maidment
        2. AITRIAGE-3996-must-gather.tar.gz
          10.06 MB
          Paul Maidment

            joelsmith.redhat Joel Smith
            pmaidmen Paul Maidment