  1. OpenShift Bugs
  2. OCPBUGS-38078

Abnormal values for the 'router.openshift.io/haproxy.health.check.interval' annotation break the router-default pods


    • Moderate
    • None
    • 3
    • NE Sprint 257, NE Sprint 258, NE Sprint 259, NE Sprint 260, NE Sprint 261, NE Sprint 262, NE Sprint 263, NE Sprint 264
    • 8
    • Rejected
    • False
    • *Cause*: The router.openshift.io/haproxy.health.check.interval annotation has no out-of-bounds validation, so it can be set to a value that exceeds the maximum HAProxy can handle.
      *Consequence*: HAProxy reports timer overflow alert messages and the router-default pod never reaches the ready state.
      *Fix*: Validate the router.openshift.io/haproxy.health.check.interval annotation value to ensure it is within the range that HAProxy can parse, effectively capping the value at 2147483647 ms (~24.8 days).
      *Result*: The router.openshift.io/haproxy.health.check.interval annotation is always set to a value that HAProxy can parse.
    • Bug Fix
    • In Progress

      Description of problem:

      There is no clipValue function for the annotation router.openshift.io/haproxy.health.check.interval. Once the annotation is set to an abnormal value, the router-default pod starts to report the following messages:
      
      [ALERT]    (50) : config : [/var/lib/haproxy/conf/haproxy.config:13791] : 'server be_secure:xxx:httpd-gateway-route/pod:xxx:xxx-gateway-service:pass-through-https:10.129.xx.xx:8243' : timer overflow in argument <50000d> to <inter> of server pod:xxx:xxx:pass-through-https:10.129.xx.xx:8243, maximum value is 2147483647 ms (~24.8 days)..
      
      In the above case, the value 50000d was accidentally passed to the route annotation router.openshift.io/haproxy.health.check.interval.
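
As a rough sanity check of the numbers in that alert, the following shell arithmetic converts 50000d to milliseconds and compares it with the limit HAProxy reports. The variable names are illustrative only; the limit is copied from the alert message.

```shell
#!/usr/bin/env bash
# Maximum <inter> value accepted by HAProxy, taken from the alert message.
max_ms=2147483647

# 50000d expressed in milliseconds (1 day = 86,400,000 ms).
interval_ms=$(( 50000 * 24 * 60 * 60 * 1000 ))

echo "interval: ${interval_ms} ms, limit: ${max_ms} ms"
if (( interval_ms > max_ms )); then
  echo "overflow: the annotation value exceeds the HAProxy limit"
fi
```

The converted value is 4320000000000 ms, roughly two thousand times the limit, which is why HAProxy rejects the configuration.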
      
      

      Version-Release number of selected component (if applicable):

          

      How reproducible:

      Easily

      Steps to Reproduce:

      1. Run the following script, which breaks the cluster:
      
      oc get routes -A | awk '{print $1 " " $2}' | tail -n+2 | while read -r namespace routename; do
        echo -n "NS: $namespace | "
        echo "ROUTENAME: $routename"
        # Set an out-of-range health check interval on every route
        CMD="oc annotate route -n $namespace $routename --overwrite router.openshift.io/haproxy.health.check.interval=50000d"
        echo "Annotating route with:"
        echo "$CMD"; eval "$CMD"
        echo "---"
      done

      Actual results:

          The alert messages are reported and the router-default pod never reaches the ready state.

      Expected results:

          The annotation value is clipped to the maximum HAProxy accepts, which prevents the issue.
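
The clipping behavior requested here could look roughly like the sketch below. It is not the router's actual implementation: `clip_haproxy_interval` and `HAPROXY_MAX_MS` are hypothetical names, and the sketch assumes only the ms, s, m, h, and d unit suffixes, with a bare number treated as milliseconds.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the router's code: clip an HAProxy-style time
# value to the maximum reported in the timer overflow alert.
HAPROXY_MAX_MS=2147483647

clip_haproxy_interval() {
  local value=$1 number unit factor
  # Accept digits plus an optional unit suffix; reject anything else.
  [[ $value =~ ^([0-9]+)(ms|s|m|h|d)?$ ]] || return 1
  number=${BASH_REMATCH[1]}
  unit=${BASH_REMATCH[2]:-ms}

  # Drop leading zeros so the number is not read as octal.
  number=${number#"${number%%[!0]*}"}
  number=${number:-0}

  case $unit in
    ms) factor=1 ;;
    s)  factor=1000 ;;
    m)  factor=60000 ;;
    h)  factor=3600000 ;;
    d)  factor=86400000 ;;
  esac

  # More than 10 digits always exceeds the limit; checking the length first
  # also keeps the multiplication within 64-bit arithmetic.
  if (( ${#number} > 10 )) || (( number * factor > HAPROXY_MAX_MS )); then
    echo "${HAPROXY_MAX_MS}ms"
  else
    echo "$value"
  fi
}

clip_haproxy_interval 5s       # prints 5s
clip_haproxy_interval 50000d   # prints 2147483647ms
```

Values within range pass through unchanged, so existing valid annotations keep their meaning; only out-of-range values are replaced by the maximum.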

      Additional info:

          

              rh-ee-gpiotrow Grzegorz Piotrowski
              rhn-support-bgomes Bruno Gomes
              Ishmam Amin Ishmam Amin