OpenShift Bugs: OCPBUGS-17769

Agent-based install: the machine-config-controller container is OOM-killed during the install process


    • Critical
    • No
    • MCO Sprint 241
    • 1
    • Proposed
    • False

      This is a clone of issue OCPBUGS-17568. The following is the description of the original issue:

      Description of problem:

       

      The customer used the Agent-based Installer to install 4.13.8 in their CID environment, but during the install process the bootstrap machine hit an OOM condition. The sosreport shows that the init container was OOM-killed.

      NOTE: Per the customer, the issue is not seen when testing with 4.13.6.

      initContainers:
      - name: machine-config-controller
        image: .Images.MachineConfigOperator
        command: ["/usr/bin/machine-config-controller"]
        args:
        - "bootstrap"
        - "--manifest-dir=/etc/mcc/bootstrap"
        - "--dest-dir=/etc/mcs/bootstrap"
        - "--pull-secret=/etc/mcc/bootstrap/machineconfigcontroller-pull-secret"
        - "--payload-version=.ReleaseVersion"
        resources:
          limits:
            memory: 50Mi

      The dmesg output and CRI-O logs in the sosreport show the machine-config-controller container being OOM-killed. The kill was triggered by the container's cgroup memory limit, so the 50Mi limit looks too small.
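One way to tell a cgroup-limit kill from a host-wide OOM in a dmesg dump is to look at the prefix of the kernel's kill message. The following is a minimal Python sketch, not taken from the case data: it assumes the "Killed process" wording printed by recent kernels (older kernels word the message differently), and the sample line is invented for illustration, not copied from the customer's sosreport. The helper name `classify_oom_kills` is hypothetical.

```python
import re

# Matches the kernel's OOM kill line. The prefix distinguishes a kill caused
# by a memory cgroup limit from one caused by host-wide memory exhaustion.
# Assumption: the "Killed process" wording of recent kernels.
KILL_RE = re.compile(
    r"(?P<kind>Memory cgroup out of memory|Out of memory): "
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]*)\)"
)


def classify_oom_kills(dmesg_text):
    """Return a list of (kind, pid, comm) tuples, where kind is 'cgroup' or 'global'."""
    kills = []
    for line in dmesg_text.splitlines():
        match = KILL_RE.search(line)
        if match is None:
            continue
        if match.group("kind").startswith("Memory cgroup"):
            kind = "cgroup"
        else:
            kind = "global"
        kills.append((kind, int(match.group("pid")), match.group("comm")))
    return kills


# Illustrative input only (invented, not from the sosreport). The kernel
# truncates process names to 15 characters, hence "machine-config-".
SAMPLE = (
    "[  123.456789] Memory cgroup out of memory: Killed process 4321 "
    "(machine-config-) total-vm:1234567kB, anon-rss:50000kB\n"
    "[  200.000001] Out of memory: Killed process 99 (stress) total-vm:1kB\n"
)

if __name__ == "__main__":
    for kind, pid, comm in classify_oom_kills(SAMPLE):
        print(kind, pid, comm)
```

A "cgroup" result for machine-config-controller would be consistent with the container limit, rather than host memory, being the trigger.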

      The customer used a physical machine that had 100GB of memory
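For scale, the container limit can be compared with the host memory directly. This is a small Python sketch of the arithmetic only; it handles just a subset of Kubernetes quantity suffixes, the helper name `quantity_to_bytes` is hypothetical, and it assumes the "100GB" above means decimal gigabytes.

```python
# Subset of the suffixes used in Kubernetes memory quantities:
# binary (Ki, Mi, Gi, Ti) and decimal (k, M, G, T).
SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def quantity_to_bytes(quantity):
    """Convert a quantity such as '50Mi' to bytes (integer values only)."""
    # Check two-character suffixes first so "Mi" is not mistaken for "M".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * SUFFIXES[suffix]
    return int(quantity)  # plain byte count


limit_bytes = quantity_to_bytes("50Mi")   # limit from the pod spec
host_bytes = quantity_to_bytes("100G")    # assumption: 100GB is decimal

if __name__ == "__main__":
    print(limit_bytes)                # 52428800
    print(host_bytes // limit_bytes)  # 1907
```

The host has roughly 1900 times the memory the container is allowed to use, which is why the kill points at the cgroup limit and not at memory exhaustion on the machine.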

      The customer had some network configuration in the assisted installer YAML file. Could the issue be related to their NIC configuration?

      log files:
      1. sosreport
      https://attachments.access.redhat.com/hydra/rest/cases/03578865/attachments/b5501734-60be-4de4-adcf-da57e22cbb8e?usePresignedUrl=true

      2. assisted installer YAML file
      https://attachments.access.redhat.com/hydra/rest/cases/03578865/attachments/a32635cf-112d-49ed-828c-4501e95a0e7a?usePresignedUrl=true

      3. bootstrap machine oom screenshot
      https://attachments.access.redhat.com/hydra/rest/cases/03578865/attachments/eefe2e57-cd23-4abd-9e0b-dd45f20a34d2?usePresignedUrl=true

              djoshy David Joshy
              openshift-crt-jira-prow OpenShift Prow Bot
              Sergio Regidor de la Rosa Sergio Regidor de la Rosa
              OCP-MCO
