OpenShift Bugs / OCPBUGS-1725

Affinity rule created in router deployment for single-replica infrastructure and "NodePortService" endpoint publishing strategy


    • Moderate
    • Sprint 225, Sprint 226, Sprint 227
    • 3
    • Rejected
    • False

      Description of problem:

      The cluster ingress operator creates router deployments with affinity rules, including a required "podAntiAffinity" term, even when the cluster has a non-HA infrastructure plane (InfrastructureTopology=="SingleReplica") and uses the "NodePortService" endpoint publishing strategy. With only one worker node available, the replacement pod cannot be scheduled next to the running one, so a rolling update of router-default stalls.

      Version-Release number of selected component (if applicable):

      All

      How reproducible:

      Create a single-worker-node cluster with the "NodePortService" endpoint publishing strategy and try to restart the default router. The restart does not complete.

      Steps to Reproduce:

      1. Create an OCP cluster with an HA control plane (ControlPlaneTopology=="HighlyAvailable"/"External") and a single worker node (InfrastructureTopology=="SingleReplica"), using the "NodePortService" endpoint publishing strategy. The operator creates the "router-default" deployment with a "podAntiAffinity" block, even though ingress pods can be scheduled on only one node:
      ```
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        ...
        name: router-default
        namespace: openshift-ingress
        ...
      spec:
        ...
        replicas: 1
        ...
        strategy:
          rollingUpdate:
            maxSurge: 25%
            maxUnavailable: 50%
          type: RollingUpdate
        template:
          ...
          spec:
            affinity:
              ...
              podAntiAffinity:
                requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                    - key: ingresscontroller.operator.openshift.io/deployment-ingresscontroller
                      operator: In
                      values:
                      - default
                    - key: ingresscontroller.operator.openshift.io/hash
                      operator: In
                      values:
                      - 559d6c97f4
                  topologyKey: kubernetes.io/hostname
      ...
      ```
      
      2. Restart the default router:
      
      ```
      oc rollout restart deployment router-default -n openshift-ingress
      ```
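
The scheduling conflict behind these steps can be sketched in Python. This is an illustrative model only, not the scheduler's implementation. The `Pod` class, the `matches` and `can_schedule` helpers, and the node names `worker-0` and `worker-1` are hypothetical. It assumes that the running router pod carries both labels named in the `labelSelector`, and that `oc rollout restart` leaves those labels unchanged on the replacement pod.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class Pod:
    name: str
    node: Optional[str]  # None while the pod is Pending
    labels: Dict[str, str] = field(default_factory=dict)

# matchExpressions copied from the podAntiAffinity term in step 1.
# Only the "In" operator is modelled here.
ANTI_AFFINITY: List[Tuple[str, List[str]]] = [
    ("ingresscontroller.operator.openshift.io/deployment-ingresscontroller", ["default"]),
    ("ingresscontroller.operator.openshift.io/hash", ["559d6c97f4"]),
]

def matches(labels: Dict[str, str], expressions) -> bool:
    """Return True when every "In" expression is satisfied by the labels."""
    return all(labels.get(key) in values for key, values in expressions)

def can_schedule(node: str, running: List[Pod], expressions) -> bool:
    """A required anti-affinity term with topologyKey kubernetes.io/hostname
    rejects any node that already hosts a matching pod."""
    return not any(p.node == node and matches(p.labels, expressions) for p in running)

# Assumed labels on the router pods (old and new share them).
router_labels = {
    "ingresscontroller.operator.openshift.io/deployment-ingresscontroller": "default",
    "ingresscontroller.operator.openshift.io/hash": "559d6c97f4",
}
old_pod = Pod("router-default-5bb8c8985b-kdg92", "worker-0", router_labels)

nodes = ["worker-0"]  # hypothetical name for the only worker node
feasible = [n for n in nodes if can_schedule(n, [old_pod], ANTI_AFFINITY)]
print(feasible)  # prints []
```

Under these assumptions no node is feasible for the replacement pod, which is consistent with a pod that stays Pending on a single-worker cluster.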
       

      Actual results:

      The deployment restart never completes; the new pod stays Pending while the old pod keeps running:
      
      ```
      oc get po -n openshift-ingress
      NAME                              READY   STATUS    RESTARTS   AGE
      router-default-58d88f8bf6-cxnjk   0/1     Pending   0          2s
      router-default-5bb8c8985b-kdg92   1/1     Running   0          2d23h
      ```
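
Why the rollout waits instead of replacing the old pod can be illustrated with the rolling-update arithmetic for the values in the deployment above. The sketch below is a simplified model, not the deployment controller's code, and the `resolve` helper is hypothetical. It relies on the documented Kubernetes behavior that a percentage `maxSurge` is rounded up and a percentage `maxUnavailable` is rounded down.

```python
def resolve(value, replicas, round_up):
    """Convert an absolute or percentage value into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = int(value[:-1]) * replicas
        # Integer arithmetic avoids floating-point rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)

replicas = 1
max_surge = resolve("25%", replicas, round_up=True)          # maxSurge rounds up
max_unavailable = resolve("50%", replicas, round_up=False)   # maxUnavailable rounds down

max_total_pods = replicas + max_surge        # upper bound on pods during the rollout
min_available = replicas - max_unavailable   # pods that must stay available

print(max_surge, max_unavailable, max_total_pods, min_available)  # prints 1 0 2 1
```

With one replica, the rollout may create one extra pod but may not take the existing one down until the new pod is ready. Since the new pod cannot be scheduled, the old pod is never removed and the rollout does not progress.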

      Expected results:

      The deployment restart completes.

      Additional info:

       

            mmasters1@redhat.com Miciah Masters
            michael.topchiev@ibm.com Michael Topchiev
            Shudi Li Shudi Li