
[containers/conmon] conmon 2.2.0 at 99 % CPU

    • rhel-container-tools

      [3922338428] Upstream Reporter: E Bischoff
      Upstream issue status: Closed
      Upstream description:

      Hello everyone,

      We noticed top showing a conmon process at 99 % CPU. Global CPU usage on the real host:

      %Cpu(s):  7.7 us, 17.9 sy,  0.0 ni, 74.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

      strace shows conmon stuck in a loop, calling read() forever on a resource that is never available:

      read(11, 0x7ffeaa605fa1, 8192)          = -1 EAGAIN (Resource temporarily unavailable)
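To illustrate why this strace pattern implies a busy loop, here is a minimal Python sketch. It is not conmon's actual code (conmon is written in C), and the helper names `spin_reads` and `wait_readable` are hypothetical. It shows that a non-blocking read on an empty pipe whose write end is still open fails immediately with EAGAIN, so retrying without waiting in poll() burns CPU, whereas poll() simply sleeps.

```python
import os
import select

def spin_reads(fd, attempts):
    """Count how many of `attempts` non-blocking reads fail with EAGAIN."""
    eagain = 0
    for _ in range(attempts):
        try:
            os.read(fd, 8192)
        except BlockingIOError:  # errno EAGAIN: no data, but a writer still exists
            eagain += 1
    return eagain

def wait_readable(fd, timeout_ms):
    """Sleep in poll() until fd is readable or the timeout expires."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return bool(poller.poll(timeout_ms))

r, w = os.pipe()
os.set_blocking(r, False)

failed = spin_reads(r, 1000)   # every call returns at once with EAGAIN
idle = wait_readable(r, 50)    # nothing to read: poll() sleeps until the timeout
os.write(w, b"x")
ready = wait_readable(r, 50)   # data is pending: poll() reports the fd as readable

os.close(r)
os.close(w)
```

A reader that only calls read() after poll() reports the descriptor as readable would sit idle in this situation instead of spinning.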

      lsof shows that file descriptor 11 is a pipe:

      conmon  75425 root  11r     FIFO               0,15      0t0   582168 pipe

      namely a pipe to an sshd process:

      uyuni-ci-master-podman-server:~ # lsof -n 2>/dev/null | grep 582168 
      conmon    75425                       root 11r    FIFO              0,15       0t0    582168 pipe 
      sshd      75535                       root  1w    FIFO              0,15       0t0    582168 pipe

      In fact, there are 2 sshd processes:

      uyuni-ci-master-podman-server:~ # ps aux | grep sshd
      root       1986 0.0 0.0 12660 9588 ?       Ss  09:48  0:08 sshd: /usr/sbin/sshd -D [listener] 0 of 100-200 startups
      root      75535 0.0 0.0 18516 9692 ?       S   10:56  0:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups

      75535 is a leftover ("ghost") of an sshd that disappeared at some point; 1986 is the real system sshd listening on port 22.

      According to the internal server logs, the server was being shut down at that time. The question is why this is not reflected outside as a stopped container.

      If we kill the ghost sshd, conmon manages to bail out and restart:

      uyuni-ci-master-podman-server:~ # kill -9 75535 
      uyuni-ci-master-podman-server:~ # strace -f -p 75425 -e trace=all  
      strace: attach: ptrace(PTRACE_SEIZE, 75425): No such process
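One plausible reading of this behavior, sketched below in Python under the assumption that the ghost sshd held the last open write end of the pipe (the lsof output lists no other writer): as long as a writer exists, an empty non-blocking pipe yields EAGAIN, and once the last writer goes away, read() returns end-of-file, which lets the reader exit. The variable names are illustrative only.

```python
import os

r, w = os.pipe()
os.set_blocking(r, False)

# Write end still open and no data pending: the read fails with EAGAIN.
try:
    os.read(r, 8192)
    state_open = "data"
except BlockingIOError:
    state_open = "EAGAIN"

# Closing the last write end stands in for killing the process that held it.
os.close(w)

# With no writers left, read() returns an empty result, which means EOF.
data = os.read(r, 8192)
state_closed = "EOF" if data == b"" else "data"

os.close(r)
```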

      The new conmon then behaves normally:

      uyuni-ci-master-podman-server:~ # strace -f -p 7102 -e trace=all  
      strace: Process 7102 attached 
      ppoll([{fd=19, events=POLLIN}, {fd=22, events=POLLIN}, {fd=25, events=POLLIN}, {fd=26, events=POLLIN}, {fd=30, events=POLLIN}], 5, NULL, NULL, 8^C

      and, as expected, the system is now 100% idle.

      Gemini looked into the specific changes introduced in conmon 2.2.0 and reported that the main event loop was significantly refactored to better handle new terminal types and output stream management.

      We reverted to conmon 2.1.0, and the issue has not reappeared so far.


      Upstream URL: https://github.com/containers/conmon/issues/632
