Red Hat OpenStack Services on OpenShift / OSPRH-23691

Investigate improving the startup, readiness, and liveness probes for keystone


      Summary:

      Follow-up to https://issues.redhat.com/browse/OSPRH-22726.

      During the update the DB pods restart, and the keystone pods also do a rolling restart. When the sanity instance build script `workload_launch.sh sanity` was run during the operator update (not the subsequent service update), services such as nova and glance reported 503 errors because their queries to keystone failed:

      HttpException: 503: Server Error for url: https://nova-public-openstack.apps.ocp.openstack.lab/v2.1/os-services?binary=nova-compute, The server is currently unavailable. Please try again at a later time.
      The Keystone service is temporarily unavailable.
      
      • While it is expected that keystone sees a DB error for the DB instance it is connected to, there might be an issue with the keystone probe settings or with the probe endpoint: the probes only check the /v3 endpoint URL, which does not involve DB queries. As a result, a newly started instance may already report as up while it is still initializing. Current probe settings:
            livenessProbe:
              failureThreshold: 3
              httpGet:
                path: /v3
                port: 5000
                scheme: HTTPS
              initialDelaySeconds: 5
              periodSeconds: 30
              successThreshold: 1
              timeoutSeconds: 30
            name: keystone-api
            readinessProbe:
              failureThreshold: 3
              httpGet:
                path: /v3
                port: 5000
                scheme: HTTPS
              initialDelaySeconds: 5
              periodSeconds: 30
              successThreshold: 1
              timeoutSeconds: 30
        
      • We should test adding a `startupProbe` with an `initialDelaySeconds`, something like:
            startupProbe:
              failureThreshold: 6
              httpGet:
                path: /v3
                port: 5000
                scheme: HTTPS
              initialDelaySeconds: 20
              periodSeconds: 10
              successThreshold: 1
              timeoutSeconds: 1
        
      • Is there a keystone URL that does not need authentication but involves the DB, and that could be used in the startupProbe?
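
      A possible starting point for the probe test, as an untested sketch. All values below are assumptions for illustration, not verified settings, and it is not confirmed here that the keystone operator exposes these fields for overriding. With the current `periodSeconds: 30` and `failureThreshold: 3`, a pod that starts failing can stay in the Service endpoints for roughly 3 x 30 = 90 seconds before the readiness probe marks it unready. In Kubernetes, a `startupProbe` holds back the liveness and readiness probes until it succeeds, so the readiness probe can then run with a shorter period without affecting slow starts. The path is still /v3, so this only changes timing and does not add a DB check.

      ```yaml
      # Sketch only: every value marked "assumed" is illustrative, not a tested setting.
      startupProbe:
        httpGet:
          path: /v3
          port: 5000
          scheme: HTTPS
        periodSeconds: 5        # assumed
        timeoutSeconds: 5       # assumed
        failureThreshold: 12    # assumed; startup budget is about 12 x 5 = 60 seconds
        successThreshold: 1     # must be 1 for startup probes
      readinessProbe:
        httpGet:
          path: /v3
          port: 5000
          scheme: HTTPS
        periodSeconds: 5        # assumed; shorter than the current 30
        timeoutSeconds: 5       # assumed
        failureThreshold: 2     # assumed; marked unready after about 2 x 5 = 10 seconds of failures
        successThreshold: 1
      ```

      The liveness probe is left out of the sketch on purpose: a failing liveness probe restarts the container, whereas a failing readiness probe only removes the pod from the Service endpoints, which is the behavior relevant to the 503 responses described above.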

      Goal:

      • Determine whether tuning the keystone probes helps prevent a keystone service outage during the operator update.

      TimeBox:

      • 5 days

      Deliverables/Outcomes:

      • A recommendation on whether improving the probes is a valid solution.
      • Is it also something we have to do for other operators?
      • Or do we need a more complex operator update procedure that does not update all operators at the same time?

              Unassigned Unassigned
              rhn-support-mschuppe Martin Schuppert
              rhos-conplat-core-operators