OCPBUGS-1406 (OpenShift Bugs)

SNO with DU profile becomes unresponsive when launching test pods


Details

    • Critical
    • Proposed
    • False

    Description

      Description of problem:

      SNO with DU profile becomes unresponsive when launching test pods

      Version-Release number of selected component (if applicable):

      4.9.48

      How reproducible:

      Intermittent: the issue does not occur on every attempt, but repeating the steps below reproduces it reliably.

      Steps to Reproduce:

      1. Deploy SNO with the DU configuration applied
      2. Launch the eDU test app
      3. Delete and re-create the test app

      Actual results:

      The API becomes unresponsive and the node's load average climbs to roughly 160, even though top reports the CPUs as about 90% idle.

      Expected results:

      Creating test apps does not impact the platform.

      Additional info:

      Kernel: 4.18.0-305.62.1.rt7.134.el8_4.x86_64
      
      must-gather is attached.
      
      Snippet from top output:
      
      top - 08:35:41 up 59 min,  1 user,  load average: 165.35, 137.42, 112.18
      Tasks: 1962 total,   6 running, 1950 sleeping,   0 stopped,   6 zombie
      %Cpu(s):  5.2 us,  3.9 sy,  0.0 ni, 90.6 id,  0.0 wa,  0.0 hi,  0.2 si,  0.0 st
      MiB Mem :  96057.1 total,  42204.4 free,  45858.6 used,   7994.1 buff/cache
      MiB Swap:      0.0 total,      0.0 free,      0.0 used.  49459.0 avail Mem

          PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
      1167618 root      20   0 1511916 122208  46476 S  58.3   0.1   0:03.36 authentication-
       207236 root      20   0 1210872 252252  46068 S  47.2   0.3  14:03.05 fluentd
      1125873 root      20   0 2360928   1.5g  85892 S  44.4   1.6   2:19.76 kube-apiserver
         4052 root      20   0   14.6g 490660  72628 S  19.4   0.5  18:41.46 kubelet
      1093908 root      20   0   10.2g 691076 176624 S  19.4   0.7   0:28.70 etcd
      1251178 root      20   0   69064   7928   5068 R  19.4   0.0   0:00.14 top
      1251176 root      20   0 1240228  16668   9456 S  16.7   0.0   0:00.06 runc
       236821 root      20   0 1500616  49888  33076 S  13.9   0.1   0:15.69 diskmaker
        38528 nfsnobo+  20   0 4014896   2.1g 198876 S  11.1   2.3   6:36.42 prometheus
      1141984 root      20   0 1441408  82956  42912 S  11.1   0.1   0:02.81 cluster-openshi
      1125268 root      20   0  967752 273156  70128 S   8.3   0.3   0:13.59 kube-controller
           21 root     -12   0       0      0      0 S   5.6   0.0   1:52.61 ksoftirqd/1
          208 root     -12   0       0      0      0 S   5.6   0.0   1:57.47 ksoftirqd/24
       189127 1000670+  20   0 1791156 106452  40824 S   5.6   0.1   0:24.48 registration-op
       243859 1000440+  20   0  753572  71420  35544 S   5.6   0.1   0:17.21 adapter
       292079 root      20   0 3462052 210628  68456 S   5.6   0.2   2:20.75 openshift-apise
      1129065 root      20   0 1443596  93252  44668 S   5.6   0.1   0:02.91 cluster-kube-co
      1129132 root      20   0 1445308  98432  45692 S   5.6   0.1   0:04.18 cluster-etcd-op
      1220859 1000680+  20   0  734960  33608  20920 S   5.6   0.0   0:00.42 governance-poli
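
For triage, the mismatch in the summary lines above (a 1-minute load average over 165 while the CPUs are 90.6% idle) can be checked mechanically. The following is a minimal sketch, not part of the original report: the function names and both thresholds are hypothetical, chosen only for illustration. On Linux the load average counts tasks in uninterruptible sleep as well as runnable ones, so a high load with mostly idle CPUs may point to tasks that are waiting rather than computing.

```python
import re

# Summary lines copied from the top output in this report.
SAMPLE = """\
top - 08:35:41 up 59 min,  1 user,  load average: 165.35, 137.42, 112.18
%Cpu(s):  5.2 us,  3.9 sy,  0.0 ni, 90.6 id,  0.0 wa,  0.0 hi,  0.2 si,  0.0 st
"""

def parse_top_summary(text):
    """Extract the three load averages and the idle CPU percentage."""
    load = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", text)
    idle = re.search(r"%Cpu\(s\):.*?([\d.]+)\s*id\b", text)
    if load is None or idle is None:
        raise ValueError("input does not look like a top summary")
    return {
        "load1": float(load.group(1)),
        "load5": float(load.group(2)),
        "load15": float(load.group(3)),
        "idle_pct": float(idle.group(1)),
    }

def high_load_but_idle(summary, load_threshold=100.0, idle_threshold=80.0):
    """Flag a high load average combined with mostly idle CPUs.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    return (summary["load1"] >= load_threshold
            and summary["idle_pct"] >= idle_threshold)

if __name__ == "__main__":
    summary = parse_top_summary(SAMPLE)
    print(summary, high_load_but_idle(summary))
```

The `%Cpu(s)` figure is averaged across all CPUs, so this check alone cannot tell whether a small set of CPUs is saturated while the rest sit idle; per-CPU data would be needed for that.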
      

          People

            mcornea@redhat.com Marius Cornea