  1. OpenShift Bugs
  2. OCPBUGS-18344

machine-os-builder reports an error and is stuck when a MC is rendered before the previous MC has been completely built


    • Important
    • No
    • MCO Sprint 250, MCO Sprint 251, MCO Sprint 252
    • 3
    • Rejected
    • False

      Description of problem:

      
      In MCPs that use on-cluster builds, if we create a MC and then create another MC before the resulting rendered configuration has finished building (the build pod is still running), the machine-os-builder pod gets stuck and reports this error:
      
      
      {noformat}
      I0830 15:17:38.094296       1 image_build_controller.go:129] Started syncing pod openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066
      I0830 15:17:38.113944       1 build_controller.go:370] Build (build-rendered-worker-c8522984c224de584cbca2a95d584066) is Complete
      I0830 15:17:38.113974       1 build_controller.go:652] Build succeeded for MachineConfigPool worker, config rendered-worker-516e5f95aeb41fce8231b00878350eaa
      I0830 15:17:38.216381       1 request.go:628] Waited for 102.315143ms due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/build.openshift.io/v1/namespaces/openshift-machine-config-operator/builds/build-rendered-worker-516e5f95aeb41fce8231b00878350eaa
      I0830 15:17:38.223007       1 image_build_controller.go:127] Finished syncing pod openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066: 128.712599ms
      E0830 15:17:38.223038       1 image_build_controller.go:305] unable to update with build status: could not get final image pullspec for pool worker: could not get build build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found
      I0830 15:17:38.223047       1 image_build_controller.go:306] Dropping build "openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066" out of the queue: unable to update with build status: could not get final image pullspec for pool worker: could not get build build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found
      {noformat}
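Reading the log above, the pod being synced belongs to one rendered config (c8522984...) while the build that is looked up belongs to another (516e5f95...). The sketch below only pulls those two names out of a log line so the mismatch is easy to spot. The helper names `synced_build` and `missing_build` are hypothetical, and the sample line is copied from the log above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of the build that was being synced,
# taken from the first 'Dropping build "<namespace>/<name>"' message on stdin.
synced_build() {
  grep -o 'Dropping build "[^"]*"' | head -n 1 \
    | sed -e 's/^Dropping build "//' -e 's/"$//' -e 's|.*/||'
}

# Hypothetical helper: print the name of the build reported as "not found",
# taken from the first 'builds.build.openshift.io "<name>" not found' message.
missing_build() {
  grep -o 'builds.build.openshift.io "[^"]*" not found' | head -n 1 \
    | sed -e 's/^[^"]*"//' -e 's/".*$//'
}

# Sample line copied from the machine-os-builder log in this report.
sample_log='I0830 15:17:38.223047       1 image_build_controller.go:306] Dropping build "openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066" out of the queue: unable to update with build status: could not get final image pullspec for pool worker: could not get build build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found'

synced=$(printf '%s\n' "$sample_log" | synced_build)
missing=$(printf '%s\n' "$sample_log" | missing_build)

echo "synced build:  $synced"
echo "missing build: $missing"
if [ "$synced" != "$missing" ]; then
  echo "The synced build and the looked-up build refer to different rendered configs."
fi
```

To run the same helpers against a live cluster, the log could be piped in from `oc logs` for the machine-os-builder pod instead of the sample line.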
      
      
      
      
      

      Version-Release number of selected component (if applicable):

      4.14
      

      How reproducible:

      Always
      

      Steps to Reproduce:

      We reproduce it by running the automated test case:
      
      "[sig-mco] MCO Author:sregidor-NonHyperShiftHOST-NonPreRelease-Longduration-Medium-63477-Deploy files using all available ignition configs. Default 3.4.0[Disruptive] [Serial]"
      
      
      It should also be reproducible manually, as follows:
      
      1. Enable on-cluster build in worker pool
      
      $ oc label mcp/worker machineconfiguration.openshift.io/layering-enabled=
      
      2. Create a MC to deploy a new test file
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        labels:
          machineconfiguration.openshift.io/role: worker
        name: mc-tc-63477-2-2-0-svgix23t
      spec:
        config:
          ignition:
            version: 2.2.0
          passwd:
            users: []
          storage:
            files:
            - contents:
                source: data:text/plain;charset=utf-8;base64,Mi4yLjAgdGVzdCBmaWxl
              filesystem: root
              mode: 420
              path: /etc/2-2-0.test
      
      
      
      3. After 3 to 5 seconds, create a new MC to deploy another file
      
      
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        labels:
          machineconfiguration.openshift.io/role: worker
        name: mc-tc-63477-3-0-0-17oe4k8a
      spec:
        config:
          ignition:
            version: 3.0.0
          passwd:
            users: []
          storage:
            files:
            - contents:
                source: data:text/plain;charset=utf-8;base64,My4wLjAgdGVzdCBmaWxl
              mode: 420
              path: /etc/3-0-0.test
      
      4. Continue creating new MCs until you see this error in the machine-os-builder pod
      
      I0830 15:17:38.223047       1 image_build_controller.go:306] Dropping build "openshift-machine-config-operator/build-rendered-worker-c8522984c224de584cbca2a95d584066" out of the queue: unable to update with build status: could not get final image pullspec for pool worker: could not get build build-rendered-worker-516e5f95aeb41fce8231b00878350eaa for pool worker: builds.build.openshift.io "build-rendered-worker-516e5f95aeb41fce8231b00878350eaa" not found
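The manual steps above could be scripted roughly as follows. This is an untested sketch under several assumptions: the helper `make_mc`, the MC names, the file paths and the `RUN_REPRO` guard are all hypothetical; both MCs use Ignition 3.0.0 for simplicity (the report uses 2.2.0 for the first one); and the deployment name `machine-os-builder` is assumed from the pod name. Nothing touches the cluster unless `RUN_REPRO=1` is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a MachineConfig that deploys one small test file.
# Arguments: <mc name> <ignition version> <file path> <base64 file contents>
make_mc() {
  cat <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: $1
spec:
  config:
    ignition:
      version: $2
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,$4
        mode: 420
        path: $3
EOF
}

# Guarded so that sourcing or running the script does not modify a cluster by accident.
if [ "${RUN_REPRO:-0}" = "1" ]; then
  # Step 1: enable on-cluster build in the worker pool.
  oc label mcp/worker machineconfiguration.openshift.io/layering-enabled= --overwrite

  # Step 2: create the first MC.
  make_mc mc-repro-first 3.0.0 /etc/repro-first.test \
    "$(printf 'first test file' | base64)" | oc apply -f -

  # Step 3: create the second MC while the first build is presumably still running.
  sleep 5
  make_mc mc-repro-second 3.0.0 /etc/repro-second.test \
    "$(printf 'second test file' | base64)" | oc apply -f -

  # Step 4: look for the error (deployment name is an assumption).
  oc logs -n openshift-machine-config-operator deployment/machine-os-builder \
    | grep 'Dropping build'
fi
```

If the error does not show up after the second MC, step 4 of the report suggests creating further MCs in the same way.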
      
      

      Actual results:

      
      The machine-os-builder pod reports the error above and gets stuck.
      
      

      Expected results:

      The machine-os-builder pod should not report this error or get stuck.
      
      

      Additional info:

      A link to the must gather file is provided in the first comment of this issue.
      

            team-mco Team MCO
            sregidor@redhat.com Sergio Regidor de la Rosa
            Sergio Regidor de la Rosa Sergio Regidor de la Rosa