  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-13633

BZ#2290861 rabbit-mq ssl config caused pacemaker rabbitmq resource to fail monitoring and controller node to get fenced

    • puppet-tripleo-14.2.3-17.1.20241216110839.40278e1.el9ost openstack-tripleo-heat-templates-14.3.1-17.1.20241216110839.e7c7ce3.el9osttrunk
    • None
    • PIDONE 18.0.5, PIDONE 18.0.6, PIDONE 18.0.7
    • 3
    • Important

      Description of problem:
      The customer (CU) experienced several fencing events on Pacemaker controller nodes because monitoring of the rabbitmq cluster resource failed. The following was logged during the fault:

      ~~~
      2024-05-16 00:14:37.945764+02:00 [error] <0.9405.0> ** Node 'rabbit@controllernodeNN.example.domain.com' not responding **
      2024-05-16 00:14:37.945764+02:00 [error] <0.9405.0> ** Removing (timedout) connection **
      2024-05-16 00:14:37.945764+02:00 [error] <0.9405.0>
      2024-05-16 00:14:37.945976+02:00 [notice] <0.9404.0> TLS server: In state connection at tls_connection_1_3.erl:633 generated SERVER ALERT: Fatal - Internal Error
      2024-05-16 00:14:37.945976+02:00 [notice] <0.9404.0> - closed
      ~~~
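
A minimal sketch of how the two signatures in the excerpt above (an unresponsive peer node and a fatal TLS server alert) could be pulled out of a RabbitMQ log for triage. The regular expressions are derived only from the lines quoted in this report; the function name `scan_log` is hypothetical, and other RabbitMQ or Erlang versions may word these messages differently.

```python
import re

# Patterns based solely on the log lines quoted in this report (assumption:
# the wording is stable for the affected RabbitMQ/Erlang version).
NODE_DOWN = re.compile(r"\*\* Node '?([^'\s]+)'? not responding \*\*")
TLS_ALERT = re.compile(r"generated SERVER ALERT: (.+)$")


def scan_log(lines):
    """Return (unresponsive_nodes, tls_alerts) found in an iterable of log lines."""
    nodes, alerts = [], []
    for line in lines:
        match = NODE_DOWN.search(line)
        if match:
            nodes.append(match.group(1))
        match = TLS_ALERT.search(line)
        if match:
            alerts.append(match.group(1).strip())
    return nodes, alerts


sample = [
    "2024-05-16 00:14:37.945764+02:00 [error] <0.9405.0> ** Node 'rabbit@controllernodeNN.example.domain.com' not responding **",
    "2024-05-16 00:14:37.945976+02:00 [notice] <0.9404.0> TLS server: In state connection at tls_connection_1_3.erl:633 generated SERVER ALERT: Fatal - Internal Error",
]
nodes, alerts = scan_log(sample)
print(nodes, alerts)
```

Seeing both signatures close together in time, as in the excerpt, is what pointed at the TLS distribution layer rather than at the broker itself.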

      Disabling SSL in RabbitMQ by removing the TLS distribution arguments from rabbitmq-env.conf, as shown in the following change, resolved the issue:

      from:
      RABBITMQ_CTL_ERL_ARGS="+sbwt none +sbwtdcpu none +sbwtdio none -ssl_dist_optfile /etc/rabbitmq/ssl-dist.conf -crypto fips_mode false -pa /usr/lib64/erlang/lib/ssl-10.7.3.2/ebin -proto_dist inet_tls"
      RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+sbwt none +sbwtdcpu none +sbwtdio none -ssl_dist_optfile /etc/rabbitmq/ssl-dist.conf -crypto fips_mode false -pa /usr/lib64/erlang/lib/ssl-10.7.3.2/ebin -proto_dist inet_tls"

      to:
      RABBITMQ_CTL_ERL_ARGS="+sbwt none +sbwtdcpu none +sbwtdio none"
      RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+sbwt none +sbwtdcpu none +sbwtdio none"
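
The from/to change above amounts to dropping four flags and their values from each argument string. The sketch below reproduces that transformation so the result can be checked before editing rabbitmq-env.conf by hand. The helper name `strip_tls_dist_args` and the flag table are illustrative assumptions built from the diff in this report, not part of any RabbitMQ tooling. Note that it removes every `-pa` and `-crypto` entry, which matches this report but may be too broad where those flags are used for other purposes.

```python
# Flags removed in the workaround, mapped to the number of value tokens that
# follow each one (taken from the "from" lines in this report).
TLS_DIST_FLAGS = {
    "-ssl_dist_optfile": 1,  # path to ssl-dist.conf
    "-crypto": 2,            # "fips_mode false"
    "-pa": 1,                # path to the ssl ebin directory
    "-proto_dist": 1,        # "inet_tls"
}


def strip_tls_dist_args(erl_args):
    """Return erl_args with the TLS distribution flags and their values removed."""
    tokens = erl_args.split()
    kept = []
    i = 0
    while i < len(tokens):
        if tokens[i] in TLS_DIST_FLAGS:
            # Skip the flag itself plus its value tokens.
            i += 1 + TLS_DIST_FLAGS[tokens[i]]
        else:
            kept.append(tokens[i])
            i += 1
    return " ".join(kept)


before = (
    "+sbwt none +sbwtdcpu none +sbwtdio none "
    "-ssl_dist_optfile /etc/rabbitmq/ssl-dist.conf -crypto fips_mode false "
    "-pa /usr/lib64/erlang/lib/ssl-10.7.3.2/ebin -proto_dist inet_tls"
)
after = strip_tls_dist_args(before)
print(after)
```

The same function applies to both `RABBITMQ_CTL_ERL_ARGS` and `RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS`, since the two values are identical in this report.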

      Version-Release number of selected component (if applicable):
      OSP 17.1.2

              dabarzil Daniel Barzilay
              jira-bugzilla-migration RH Bugzilla Integration
              Joe Hakim Rahme Joe Hakim Rahme
              rhos-dfg-pidone