Red Hat Advanced Cluster Management / ACM-6550

MCE in Error Phase after upgrade to ACM 2.6.6 MCE 2.1.7


    • Bug
    • Resolution: Done
    • Major
    • ACM 2.9.0
    • ACM 2.9.0
    • Installer
    • False
    • None
    • False
    • Installer Sprint 23-14
    • Critical
    • Customer Escalated
    • No

      Description of problem:

      MCE is in an Error phase after what appears to have been an upgrade to ACM 2.6.6 and MCE 2.1.7. The MCE resource has been reporting:

      rpc error: code = Unknown desc = malformed header: missing HTTP content-type

      since July 14th. On July 24th the MCE operator was re-installed following the steps at https://access.redhat.com/solutions/6459071, but MCE has not recovered. The MCE operator pod logs show repeating stream errors such as the following (a few status-check commands are sketched after the excerpt):

      2023-07-24T17:36:58.288825855Z 1.6902202182887614e+09 DEBUG controller-runtime.webhook.webhooks wrote response {"webhook": "/validate-multicluster-openshift-io-v1-multiclusterengine", "code": 200, "reason": "", "UID": "beee0c21-e368-465a-99d9-b7b8da16b1be", "allowed": true}
      2023-07-24T17:37:01.897230172Z W0724 17:37:01.897170 1 reflector.go:324] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:262: failed to list *v1.ConfigMap: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 1463; INTERNAL_ERROR; received from peer
      2023-07-24T17:37:01.897300566Z I0724 17:37:01.897233 1 trace.go:205] Trace[590526907]: "Reflector ListAndWatch" name:sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:262 (24-Jul-2023 17:36:00.718) (total time: 61178ms):
      2023-07-24T17:37:01.897300566Z Trace[590526907]: ---"Objects listed" error:stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 1463; INTERNAL_ERROR; received from peer 61178ms (17:37:01.897)
      2023-07-24T17:37:01.897300566Z Trace[590526907]: [1m1.178279099s] [1m1.178279099s] END
      2023-07-24T17:37:01.897300566Z E0724 17:37:01.897251 1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:262: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 1463; INTERNAL_ERROR; received from peer
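
      To aid triage, the MCE phase and conditions can be inspected directly. A minimal sketch, assuming the default MultiClusterEngine resource name (multiclusterengine) and the usual MCE 2.1 operator namespace/deployment names (multicluster-engine / multicluster-engine-operator); names may differ on this cluster:

      # Overall phase reported by the MultiClusterEngine resource
      oc get multiclusterengine multiclusterengine -o jsonpath='{.status.phase}{"\n"}'
      # Status conditions, to see which component surfaces the "malformed header" error
      oc get multiclusterengine multiclusterengine -o jsonpath='{range .status.conditions[*]}{.type}{"="}{.status}{" "}{.message}{"\n"}{end}'
      # Tail the operator logs for the repeating reflector/stream errors
      oc logs -n multicluster-engine deploy/multicluster-engine-operator --since=1h | grep -i "stream error"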

      The shift support team investigated due to concerns over cluster health and reported:

      etcd response rate for the cluster is pretty bad, and they seem to have an issue with volumes for ODF, but I don't see any signs of an issue with cluster health. All nodes are ready, minimum specs met, MCP up to date, all pods Ready or Completed.
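
      To cross-check that assessment, the same signals can be gathered with stock oc commands; a minimal sketch, with no MCE-specific assumptions:

      # Nodes, MachineConfigPools, and cluster operators should all be Ready/Updated/Available
      oc get nodes
      oc get mcp
      oc get clusteroperators
      # Aggregated API server health, including etcd checks
      oc get --raw='/readyz?verbose'
      # Pods that are neither Running nor Completed
      oc get pods -A --field-selector=status.phase!=Running,status.phase!=Succeeded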

      Version-Release number of selected component (if applicable):

      ACM 2.6.6 / MCE 2.1.7

      How reproducible:

      Have not seen in lab

      Steps to Reproduce:

      1.  
      2.  
      3. ...

      Actual results:

      Expected results:

      Additional info:

              jagray@redhat.com Jakob Gray
              rhn-support-jayoung James Young
              Ting Xue
