Cloud Enablement / CLOUD-3229

[OCP 4.1] Pod is not restarted when MP Health returns DOWN or UNDETERMINED

Details

    • Bug
    • Resolution: Won't Do
    • Blocker
    • None
    • EAPCD 16.0.GA, EAP72 7.2.1.GA, EAP64 6.4.22.GA
    • CD16, EAP6, EAP7, jboss-eap-modules
    • None

    Description

      Tested Scenario:
      Start a deployment with DESIRED_STATE=DOWN (or UNDETERMINED) and register the probes. According to [1], when the readiness probe sees "DOWN" in the MP Health response, the pod should be restarted. This happens on OCP 3.11 but not on OCP 4.1.

      In the EAP log I see:

      06:39:43,195 INFO [org.jboss.xpaas.microprofile.health.TestHealthCheck] (External Management Request Threads -- 2) Health Check called
      06:39:43,198 INFO [org.jboss.xpaas.microprofile.health.TestHealthCheck] (External Management Request Threads -- 1) Health Check called
      06:39:44,165 INFO [org.jboss.xpaas.microprofile.health.TestHealthCheck] (External Management Request Threads -- 1) Health Check called
      06:39:46,673 INFO [org.jboss.xpaas.microprofile.health.TestHealthCheck] (External Management Request Threads -- 1) Health Check called
      06:39:46,673 INFO [org.jboss.xpaas.microprofile.health.TestHealthCheck] (External Management Request Threads -- 2) Health Check called
      06:39:51,589 INFO [org.jboss.xpaas.microprofile.health.TestHealthCheck] (External Management Request Threads -- 1) Health Check called
      06:40:01,587 INFO [org.jboss.xpaas.microprofile.health.TestHealthCheck] (External Management Request Threads -- 1) Health Check called
      06:40:11,586 INFO [org.jboss.xpaas.microprofile.health.TestHealthCheck] (External Management Request Threads -- 1) Health Check called
      06:40:21,586 INFO [org.jboss.xpaas.microprofile.health.TestHealthCheck] (External Management Request Threads -- 1) Health Check called
      ....
      

      In the OCP web console I see these events:

      - Readiness probe failed: { "probe.eap.dmr.EapProbe": { "probe.eap.dmr.ServerStatusTest": "running", "probe.eap.dmr.DeploymentTest": { "ROOT.war": "OK" }, "probe.eap.dmr.BootErrorsTest": "No boot errors" }, "probe.eap.dmr.HealthCheckProbe": { "probe.eap.dmr.HealthCheckTest": "Status is DOWN" } } 
      
      - Liveness probe errored: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
      

      The pod status is Running, but its readiness is ContainersNotReady.

      When I run the probes from a terminal, I see the expected result:

      sh-4.2$ /opt/eap/bin/livenessProbe.sh
      {
          "probe.eap.dmr.EapProbe": {
              "probe.eap.dmr.ServerStatusTest": "running",
              "probe.eap.dmr.DeploymentTest": {
                  "ROOT.war": "OK"
              },
              "probe.eap.dmr.BootErrorsTest": "No boot errors"
          },
          "probe.eap.dmr.HealthCheckProbe": {
              "probe.eap.dmr.HealthCheckTest": "Status is DOWN"
          }
      }
      sh-4.2$ /opt/eap/bin/readinessProbe.sh
      {
          "probe.eap.dmr.EapProbe": {
              "probe.eap.dmr.ServerStatusTest": "running",
              "probe.eap.dmr.DeploymentTest": {
                  "ROOT.war": "OK"
              },
              "probe.eap.dmr.BootErrorsTest": "No boot errors"
          },
          "probe.eap.dmr.HealthCheckProbe": {
              "probe.eap.dmr.HealthCheckTest": "Status is DOWN"
          }
      }
      sh-4.2$
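
      For comparing runs on OCP 3.11 and OCP 4.1, the probe output above can be checked mechanically. The following is only a minimal sketch: the helper name failing_tests is hypothetical, and treating any leaf string that contains "DOWN" as a failing test is an assumption about how to read the output, not a description of how the probe scripts themselves decide their result.

```python
import json

# Probe output copied from the terminal session in this report.
SAMPLE = """
{
    "probe.eap.dmr.EapProbe": {
        "probe.eap.dmr.ServerStatusTest": "running",
        "probe.eap.dmr.DeploymentTest": {
            "ROOT.war": "OK"
        },
        "probe.eap.dmr.BootErrorsTest": "No boot errors"
    },
    "probe.eap.dmr.HealthCheckProbe": {
        "probe.eap.dmr.HealthCheckTest": "Status is DOWN"
    }
}
"""


def failing_tests(probe_json, marker="DOWN"):
    """Return the paths of leaf strings that mention `marker`.

    Hypothetical helper: the marker-based check is an assumption made
    for illustration, not the logic used by the real probe scripts.
    """
    found = []

    def walk(node, path):
        if isinstance(node, dict):
            # Descend into nested probe/test objects.
            for key, value in node.items():
                walk(value, path + [key])
        elif isinstance(node, str) and marker in node:
            found.append(" -> ".join(path))

    walk(json.loads(probe_json), [])
    return found


if __name__ == "__main__":
    for path in failing_tests(SAMPLE):
        print(path)
```

      Passing marker="UNDETERMINED" would cover the other DESIRED_STATE value, assuming the probe reports that state with the same wording.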
      

      [1] https://issues.jboss.org/browse/CLOUD-2730?focusedCommentId=13616667&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13616667

          People

            kwills@redhat.com Ken Wills
            kwills@redhat.com Ken Wills