OCPBUGS-12707

Master MCP is degraded because of MC not found


    • Critical
    • No
    • 0
    • Sprint 239, Sprint 240, Sprint 241, Sprint 242
    • 4
    • Approved
    • False
    • this issue is blocking service endpoint related installation
    • * Previously, installing a cluster on AWS could fail in some cases due to a validation error. With this update, the installation program produces the necessary cloud configuration object to satisfy the machine config operator. This results in the installation succeeding. (link:https://issues.redhat.com/browse/OCPBUGS-12707[*OCPBUGS-12707*])
    • Bug Fix
    • Done

      Description of problem:

      
      When we deploy a cluster in AWS using the following template, the master MCP is degraded:
      https://gitlab.cee.redhat.com/aosqe/flexy-templates/-/blob/master/functionality-testing/aos-4_14/ipi-on-aws/versioned-installer-customer_vpc-disconnected_private_cluster-sts-private-s3-custom_endpoints-ci
      The pool reports this error:
      
        - lastTransitionTime: "2023-04-25T07:48:45Z"
          message: 'Node ip-10-0-55-111.us-east-2.compute.internal is reporting: "machineconfig.machineconfiguration.openshift.io
            \"rendered-master-8ef3f9cb45adb7bbe5f819eb831ffd7d\" not found", Node ip-10-0-60-138.us-east-2.compute.internal
            is reporting: "machineconfig.machineconfiguration.openshift.io \"rendered-master-8ef3f9cb45adb7bbe5f819eb831ffd7d\"
            not found", Node ip-10-0-69-137.us-east-2.compute.internal is reporting: "machineconfig.machineconfiguration.openshift.io
            \"rendered-master-8ef3f9cb45adb7bbe5f819eb831ffd7d\" not found"'
          reason: 3 nodes are reporting degraded status on sync
          status: "True"
          type: NodeDegraded
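The condition above packs one entry per node into a single string, which makes it hard to see at a glance whether every node is missing the same rendered config. The following is a minimal sketch, not part of the original report, that pulls the node names and the missing MachineConfig name out of such a message. The function name `missing_configs` and the regular expression are assumptions made for illustration; the pattern only covers the "not found" wording shown in this report.

```python
import re

# One per-node entry of the NodeDegraded message. The optional backslashes
# cover the escaped quotes that remain in the string after YAML single-quoting.
ENTRY = re.compile(
    r'Node (?P<node>\S+) is reporting: '
    r'"machineconfig\.machineconfiguration\.openshift\.io '
    r'\\?"(?P<mc>rendered-[a-z0-9-]+)\\?" not found"'
)


def missing_configs(message: str) -> dict[str, str]:
    """Map each reporting node to the rendered MachineConfig it cannot find."""
    return {m.group("node"): m.group("mc") for m in ENTRY.finditer(message)}


# The message from the condition above, with the folded YAML lines joined.
MESSAGE = (
    r'Node ip-10-0-55-111.us-east-2.compute.internal is reporting: '
    r'"machineconfig.machineconfiguration.openshift.io '
    r'\"rendered-master-8ef3f9cb45adb7bbe5f819eb831ffd7d\" not found", '
    r'Node ip-10-0-60-138.us-east-2.compute.internal is reporting: '
    r'"machineconfig.machineconfiguration.openshift.io '
    r'\"rendered-master-8ef3f9cb45adb7bbe5f819eb831ffd7d\" not found", '
    r'Node ip-10-0-69-137.us-east-2.compute.internal is reporting: '
    r'"machineconfig.machineconfiguration.openshift.io '
    r'\"rendered-master-8ef3f9cb45adb7bbe5f819eb831ffd7d\" not found"'
)

if __name__ == "__main__":
    for node, mc in missing_configs(MESSAGE).items():
        print(f"{node}: {mc}")
```

If all nodes map to the same name, as they do here, the problem is a single rendered config that was never created or was removed, rather than a per-node issue.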
      
      
      

      Version-Release number of selected component (if applicable):

      $ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version             False       False         3h12m   Error while reconciling 4.14.0-0.nightly-2023-04-19-125337: the cluster operator machine-config is degraded
      
      

      How reproducible:

      2 out of 2.
      
      

      Steps to Reproduce:

      1. Install OCP using this template https://gitlab.cee.redhat.com/aosqe/flexy-templates/-/blob/master/functionality-testing/aos-4_14/ipi-on-aws/versioned-installer-customer_vpc-disconnected_private_cluster-sts-private-s3-custom_endpoints-ci
      
      We can see examples of this installation here:
      https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/198964/
      
      and here:
      https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/199028/
      
      
      Builds have been marked as keep forever, but just in case, the parameters are:
      
      INSTANCE_NAME_PREFIX: Your ID, any short string; just make sure it is unique.
      VARIABLES_LOCATION: private-templates/functionality-testing/aos-4_14/ipi-on-aws/versioned-installer-customer_vpc-disconnected_private_cluster-sts-private-s3-custom_endpoints-ci
      LAUNCHER_VARS: <leave empty>
      BUSHSLICER_CONFIG: <leave empty>
      
      

      Actual results:

      
      The installation failed, reporting a degraded master MCP.
      
      $ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version             False       False         3h12m   Error while reconciling 4.14.0-0.nightly-2023-04-19-125337: the cluster operator machine-config is degraded
      
      $ oc get mcp
      NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      master                                                      False     True       True       3              0                   0                     3                      4h21m
      worker   rendered-worker-166729d2617b1b63cf5d9bb818dd9cf8   True      False      False      3              3                   3                     0                      4h21m
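Note that the master row has an empty CONFIG column, so splitting the table on whitespace shifts every later field. A check that needs to be scripted is more robust against the JSON form (`oc get mcp -o json`). The sketch below is an illustration only: `degraded_pools` is a hypothetical helper, and `SAMPLE` is a hand-built fragment that mirrors the table above rather than real must-gather output. It assumes the pool status exposes a `Degraded` condition plus `machineCount` and `degradedMachineCount`.

```python
import json


def degraded_pools(mcp_list: dict) -> list[dict]:
    """Summarise every pool whose Degraded condition is "True"."""
    summaries = []
    for pool in mcp_list.get("items", []):
        status = pool.get("status", {})
        # Index the conditions by type for direct lookup.
        conditions = {c["type"]: c["status"] for c in status.get("conditions", [])}
        if conditions.get("Degraded") == "True":
            summaries.append({
                "name": pool["metadata"]["name"],
                "degraded": status.get("degradedMachineCount", 0),
                "total": status.get("machineCount", 0),
            })
    return summaries


# Hand-built fragment mirroring the table above (assumption: a real
# `oc get mcp -o json` document carries many more fields than shown here).
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "master"},
     "status": {"machineCount": 3, "degradedMachineCount": 3,
                "conditions": [{"type": "Degraded", "status": "True"}]}},
    {"metadata": {"name": "worker"},
     "status": {"machineCount": 3, "degradedMachineCount": 0,
                "conditions": [{"type": "Degraded", "status": "False"}]}}
  ]
}
""")

if __name__ == "__main__":
    for pool in degraded_pools(SAMPLE):
        print(f'{pool["name"]}: {pool["degraded"]}/{pool["total"]} machines degraded')
```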
      
      
      

      Expected results:

      Installation should finish without problems and no MCP should be degraded
      

      Additional info:

      The must-gather archive is linked in the first comment.
      

            padillon Patrick Dillon
            sregidor@redhat.com Sergio Regidor de la Rosa
            Sergio Regidor de la Rosa Sergio Regidor de la Rosa