RHEL-106594

crm_mon returns an "Internal software bug" error and drops in performance when executed on a stopped node


    • pacemaker-3.0.1-2.el10
    • Yes
    • Moderate
    • rhel-ha

      What were you trying to do that didn't work?

      Stop a cluster node, then run a status command on it.

      What is the impact of this issue to you?

      The longer run time of this command causes a pcsd timeout. As a result, the GUI displays an incorrect node status and users cannot view the node's details.

      Please provide the package NVR for which the bug is seen:

      pacemaker-3.0.1-1.el10

      Note: the issue is not present in the previous package, pacemaker-3.0.0-5.el10.

      How reproducible is this bug?:

      Always

      Steps to reproduce

      1. Stop a node
      2. Run status command, e.g.:
        crm_mon --one-shot --inactive --output-as=xml
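
The reproduction steps above can be wrapped in a small check. This is a minimal sketch, not part of the original report: `classify_status` and `run_check` are hypothetical helper names, and the sketch assumes that the process exit status of crm_mon matches the status code shown in its XML output (102 and 70 in this report, as the `crm_exit` log lines suggest).

```shell
#!/bin/sh
# Hypothetical helper: map an exit status seen in this report to a verdict.
classify_status() {
    case "$1" in
        102) echo "expected: Not connected" ;;
        70)  echo "regression: Internal software bug" ;;
        *)   echo "other: exit status $1" ;;
    esac
}

# Hypothetical helper: run the given command, discard its output, and
# print the elapsed wall-clock time (whole seconds) plus the verdict.
run_check() {
    start=$(date +%s)
    rc=0
    "$@" >/dev/null 2>&1 || rc=$?
    end=$(date +%s)
    echo "elapsed=$((end - start))s $(classify_status "$rc")"
}

# On the stopped node, one would run:
#   run_check crm_mon --one-shot --inactive --output-as=xml
```

Whole-second resolution is coarse, but it is enough to tell an almost instant run from one that takes several seconds.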

      Expected results

      A "Not connected" message (status code 102) with an almost instant run time.

      # time crm_mon --one-shot --inactive --output-as=xml
      <pacemaker-result api-version="2.38" request="crm_mon --one-shot --inactive --output-as=xml">
        <status code="102" message="Not connected">
          <errors>
            <error>crm_mon: Connection to cluster failed: Connection refused</error>
          </errors>
        </status>
      </pacemaker-result>
      
      real    0m0.018s
      user    0m0.013s
      sys     0m0.005s
      

      Actual results

      An "Internal software bug" message (status code 70) with a significantly increased run time (about 7.5 seconds instead of about 0.02 seconds).

      # time crm_mon --one-shot --inactive --output-as=xml
      <pacemaker-result api-version="2.38" request="crm_mon --one-shot --inactive --output-as=xml">
        <status code="70" message="Internal software bug">
          <errors>
            <error>crm_mon: Connection to cluster failed: Invalid argument</error>
          </errors>
        </status>
      </pacemaker-result>
      
      real    0m7.516s
      user    0m0.007s
      sys     0m0.008s
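
To compare the expected and actual results in a script, the status code and message can be pulled out of the XML output. The following is a rough sketch: `status_field` is a hypothetical helper, and it relies on line-oriented `sed` matching of the `<status>` element as it appears in the output above, not on a real XML parser.

```shell
#!/bin/sh
# Hypothetical helper: read crm_mon XML output on stdin and print the value
# of the named attribute ("code" or "message") of the <status> element.
status_field() {
    sed -n 's/.*<status [^>]*'"$1"'="\([^"]*\)".*/\1/p'
}

# Possible usage on the stopped node:
#   crm_mon --one-shot --inactive --output-as=xml | status_field code
#   crm_mon --one-shot --inactive --output-as=xml | status_field message
```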
      

       

      Additional info - verbose mode

      pacemaker-3.0.1-1.el10 (where the issue is present)

      [root@hvirt-190 ~]# crm_mon --one-shot --inactive --output-as=xml -VVVVVV
      (set_crm_log_level) 	trace: New log level: 8
      (main) 	info: Starting crm_mon
      (pcmk_new_ipc_api) 	trace: Created launcher API IPC object
      (crm_ipc_connected) 	trace: No connection
      (pcmk__connect_ipc) 	debug: Attempting connection to launcher (up to 5 times)
      (crm_ipc_connected) 	trace: No connection
      (pcmk__connect_ipc) 	debug: Attempting connection to launcher (up to 4 times)
      (crm_ipc_connected) 	trace: No connection
      (pcmk__connect_ipc) 	debug: Attempting connection to launcher (up to 3 times)
      (crm_ipc_connected) 	trace: No connection
      (pcmk__connect_ipc) 	debug: Attempting connection to launcher (up to 2 times)
      (crm_ipc_connected) 	trace: No connection
      (pcmk__connect_ipc) 	debug: Attempting connection to launcher (up to 1 time)
      (pcmk_free_ipc_api) 	debug: Releasing launcher IPC API
      (crm_ipc_destroy) 	trace: Destroying inactive pacemakerd IPC connection
      (ipc_post_disconnect) 	info: Disconnected from launcher
      (pcmk_free_ipc_api) 	trace: Freeing IPC API object
      (cib_native_signoff) 	debug: Disconnecting from the CIB manager
      (cib_client_end_transaction) 	trace: No transaction found for CIB client (unidentified)
      (stonith__api_free) 	trace: Destroying (nil)
      <pacemaker-result api-version="2.38" request="crm_mon --one-shot --inactive --output-as=xml -VVVVVV">
        <status code="70" message="Internal software bug">
          <errors>
            <error>crm_mon: Connection to cluster failed: Invalid argument</error>
          </errors>
        </status>
      </pacemaker-result>
      (crm_exit) 	info: Exiting crm_mon | with status 70 (CRM_EX_SOFTWARE: Internal software bug)
      

      pacemaker-3.0.0-5.el10 (where the issue is not present)

      # crm_mon --one-shot --inactive --output-as=xml -VVVVVV
      (set_crm_log_level)     trace: New log level: 8
      (main)  info: Starting crm_mon
      (pcmk__env_option)      trace: Nothing found for ipc_buffer
      (pcmk__ipc_buffer_size)         debug: Using IPC buffer size 131072 from default (not 0)
      (pcmk_new_ipc_api)      trace: Created launcher API IPC object
      (crm_ipc_connected)     trace: No connection
      (pcmk__connect_ipc)     debug: Attempting connection to launcher (up to 5 times)
      (pcmk_free_ipc_api)     debug: Releasing launcher IPC API
      (crm_ipc_destroy)       trace: Destroying inactive pacemakerd IPC connection
      (ipc_post_disconnect)   info: Disconnected from launcher
      (pcmk_free_ipc_api)     trace: Freeing IPC API object
      (cib_native_signoff)    debug: Disconnecting from the CIB manager
      (cib_client_end_transaction)    trace: No transaction found for CIB client (unidentified)
      (stonith_api_delete)    trace: Destroying (nil)
      <pacemaker-result api-version="2.38" request="crm_mon --one-shot --inactive --output-as=xml -VVVVVV">
        <status code="102" message="Not connected">
          <errors>
            <error>crm_mon: Connection to cluster failed: Connection refused</error>
          </errors>
        </status>
      </pacemaker-result>
      (crm_exit)      info: Exiting crm_mon | with status 102 (CRM_EX_DISCONNECT: Not connected)
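
One visible difference between the two verbose logs above is the number of "Attempting connection to launcher" lines: five in the pacemaker-3.0.1-1.el10 run and one in the pacemaker-3.0.0-5.el10 run. A small sketch for counting them follows. `count_attempts` is a hypothetical helper, and merging stderr into stdout is an assumption about where the trace lines are written.

```shell
#!/bin/sh
# Hypothetical helper: count connection-attempt lines in a verbose
# crm_mon log read from stdin.
count_attempts() {
    grep -c 'Attempting connection to launcher'
}

# Possible usage on the stopped node:
#   crm_mon --one-shot --inactive --output-as=xml -VVVVVV 2>&1 | count_attempts
```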
      

              rhn-support-clumens Christopher Lumens
              mmazoure Michal Mazourek
              Christopher Lumens Christopher Lumens
              Marketa Smazova Marketa Smazova