OpenShift Container Platform 4.13.4 installation is failing because of rendered-master-${hash} not found

    • Moderate
    • No
    • MCO Sprint 247, MCO Sprint 248
    • 2
    • Rejected
    • False
      * Previously, when the `machine config not found` error was reported, there was not enough information to troubleshoot and correct the problem. With this release, an alert and metric have been added to the Machine Config Operator. As a result, you have more information to troubleshoot and remediate the `machine config not found` error. (link:https://issues.redhat.com/browse/OCPBUGS-17788[*OCPBUGS-17788*])
    • Bug Fix
    • Done

      Description of problem:

      The installation of OpenShift Container Platform 4.13.4 fails noticeably more often than with previous versions when installing with a proxy configured.
      
      The error reported by the MachineConfigPool is as shown below.
      
        - lastTransitionTime: "2023-07-04T10:36:44Z"
          message: 'Node master0.example.com is reporting: "machineconfig.machineconfiguration.openshift.io
            \"rendered-master-1e13d7d4ca10669d3d5a6a2bd532873a\" not found", Node master1.example.com
            is reporting: "machineconfig.machineconfiguration.openshift.io \"rendered-master-1e13d7d4ca10669d3d5a6a2bd532873a\"
            not found", Node master2.example.com is reporting:
            "machineconfig.machineconfiguration.openshift.io \"rendered-master-1e13d7d4ca10669d3d5a6a2bd532873a\"
            not found"'
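To make the condition easier to triage, the message can be split into one entry per node. The following is a minimal sketch, not part of the Machine Config Operator: the function name `missing_rendered_configs` and the regular expression are assumptions based only on the message format shown above, and other message variants may not match.

```python
import re

# One per-node report inside the MachineConfigPool condition message.
# The quotes around the rendered config name may or may not carry a
# literal backslash, depending on how the message was serialized.
NODE_REPORT = re.compile(
    r'Node (?P<node>\S+) is reporting: '
    r'"machineconfig\.machineconfiguration\.openshift\.io '
    r'\\?"(?P<config>rendered-[a-z0-9-]+)\\?" not found"'
)


def missing_rendered_configs(message):
    """Map each reporting node to the rendered MachineConfig it cannot find."""
    # Collapse the line wrapping introduced by the YAML output.
    normalized = " ".join(message.split())
    return {m["node"]: m["config"] for m in NODE_REPORT.finditer(normalized)}


sample_message = r'''Node master0.example.com is reporting: "machineconfig.machineconfiguration.openshift.io
    \"rendered-master-1e13d7d4ca10669d3d5a6a2bd532873a\" not found", Node master1.example.com
    is reporting: "machineconfig.machineconfiguration.openshift.io \"rendered-master-1e13d7d4ca10669d3d5a6a2bd532873a\"
    not found", Node master2.example.com is reporting:
    "machineconfig.machineconfiguration.openshift.io \"rendered-master-1e13d7d4ca10669d3d5a6a2bd532873a\"
    not found"'''

reports = missing_rendered_configs(sample_message)
for node, config in sorted(reports.items()):
    print(f"{node}: {config}")
```

If every node reports the same rendered config name, as in this case, the nodes were most likely all booted from one configuration that the cluster never rendered, rather than each failing independently.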
      
      According to https://docs.google.com/document/d/1fgP6Kv1D-75e1Ot0Kg-W2qPyxWDp2_CALltlBLuseec/edit#heading=h.ny6l9ud82fxx this seems to be a known condition, but it is not clear how to prevent it from happening and thereby ensure that installations work as expected.
      
      The major differences found between /etc/mcs-machine-config-content.json on the OpenShift Container Platform 4 control plane node and the rendered-master-${hash} MachineConfig are in the following files.
      
       - /etc/mco/proxy.env
       - /etc/kubernetes/kubelet-ca.crt
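A comparison like this can be scripted. The sketch below assumes that both inputs are MachineConfig objects in JSON form with their Ignition file list under `spec.config.storage.files`, and that each entry has a `path` and a `contents.source` field. The helper names and the sample data are hypothetical and only illustrate the approach.

```python
def files_by_path(machine_config):
    """Index the Ignition file entries of a MachineConfig by their path."""
    files = (
        machine_config.get("spec", {})
        .get("config", {})
        .get("storage", {})
        .get("files", [])
    )
    return {f["path"]: f.get("contents", {}).get("source", "") for f in files}


def differing_paths(on_disk, rendered):
    """Return paths that are missing on one side or whose contents differ."""
    a, b = files_by_path(on_disk), files_by_path(rendered)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))


def sample_config(proxy, kubelet_ca):
    # Hypothetical stand-in for a real MachineConfig; contents are placeholders.
    return {"spec": {"config": {"storage": {"files": [
        {"path": "/etc/mco/proxy.env", "contents": {"source": proxy}},
        {"path": "/etc/kubernetes/kubelet-ca.crt", "contents": {"source": kubelet_ca}},
        {"path": "/etc/hypothetical/unchanged.conf", "contents": {"source": "data:,same"}},
    ]}}}}


on_disk = sample_config("data:,proxy-a", "data:,ca-a")
rendered = sample_config("data:,proxy-b", "data:,ca-b")
changed = differing_paths(on_disk, rendered)
print(changed)
```

For real data, both objects would first be loaded with `json.load`, for example from a copy of /etc/mcs-machine-config-content.json and from the JSON output of the rendered MachineConfig.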
      

      Version-Release number of selected component (if applicable):

      OpenShift Container Platform 4.13.4
      

      How reproducible:

      Random
      

      Steps to Reproduce:

      1. Install OpenShift Container Platform 4.13.4 on AWS with platform:none, proxy defined and both machineCIDR and machineNetwork.cidr set.
      

      Actual results:

      The installation is stuck and will eventually fail because the master MachineConfigPool fails to roll out the required MachineConfig.
      
        - lastTransitionTime: "2023-07-04T10:36:44Z"
          message: 'Node master0.example.com is reporting: "machineconfig.machineconfiguration.openshift.io
            \"rendered-master-1e13d7d4ca10669d3d5a6a2bd532873a\" not found", Node master1.example.com
            is reporting: "machineconfig.machineconfiguration.openshift.io \"rendered-master-1e13d7d4ca10669d3d5a6a2bd532873a\"
            not found", Node master2.example.com is reporting:
            "machineconfig.machineconfiguration.openshift.io \"rendered-master-1e13d7d4ca10669d3d5a6a2bd532873a\"
            not found"'
      

      Expected results:

      The installation succeeds, or otherwise provides a meaningful error message.
      

      Additional info:

      The document at https://docs.google.com/document/d/1fgP6Kv1D-75e1Ot0Kg-W2qPyxWDp2_CALltlBLuseec/edit#heading=h.ny6l9ud82fxx was checked, and Red Hat Engineering was then consulted because it was not clear how to proceed.
      
      

              cdoern@redhat.com Charles Doern
              rhn-support-sreber Simon Reber
              Sergio Regidor de la Rosa
              Shauna Diaz