OCPBUGS-17682 (OpenShift Bugs)

topologySpreadConstraints for UWM prometheus-operator does not work

    • Bug
    • Resolution: Done-Errata
    • Undefined
    • 4.15.0
    • 4.14.0
    • Monitoring
    • Moderate
    • No
    • MON Sprint 240
    • 1
    • False
    • Release Note Not Required
    • Done

      Description of problem:

      The in-cluster prometheus-operator and the UWM prometheus-operator pods are both scheduled to master nodes, as configured in:

      https://github.com/openshift/cluster-monitoring-operator/blob/release-4.14/assets/prometheus-operator/deployment.yaml#L88-L97

      https://github.com/openshift/cluster-monitoring-operator/blob/release-4.14/assets/prometheus-operator-user-workload/deployment.yaml#L91-L103

      With UWM enabled and topologySpreadConstraints added for both the in-cluster prometheus-operator and the UWM prometheus-operator (topologyKey set to node-role.kubernetes.io/master), the constraints take effect for the in-cluster prometheus-operator but not for the UWM prometheus-operator. The in-cluster configuration:

      apiVersion: v1
      data:
        config.yaml: |
          enableUserWorkload: true
          prometheusOperator:
            topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: node-role.kubernetes.io/master
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: prometheus-operator
      kind: ConfigMap
      metadata:
        name: cluster-monitoring-config
        namespace: openshift-monitoring
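
To illustrate what `maxSkew: 1` with `whenUnsatisfiable: DoNotSchedule` asks of the scheduler, here is a minimal Python sketch of the skew check as the Kubernetes documentation describes it: the number of matching pods in the candidate domain (counting the incoming pod) minus the minimum across eligible domains. It is a simplified model, not the scheduler's code. It ignores `minDomains`, node affinity and taint policies, and the function names and the `zone-a`/`zone-b` domains are hypothetical.

```python
def placement_skew(pod_counts, domain):
    """Skew that would result if one more matching pod landed in `domain`.

    pod_counts maps each eligible topology domain (a value of the
    topologyKey label) to its current number of matching pods.
    """
    global_min = min(pod_counts.values())
    return pod_counts[domain] + 1 - global_min


def can_schedule(pod_counts, domain, max_skew=1):
    """Model of DoNotSchedule: reject the placement when skew exceeds maxSkew."""
    return placement_skew(pod_counts, domain) <= max_skew


# Hypothetical example: two domains, one matching pod already in zone-b.
counts = {"zone-a": 0, "zone-b": 1}
print(can_schedule(counts, "zone-a"))
print(can_schedule(counts, "zone-b"))
```

In this model, if every master node carries `node-role.kubernetes.io/master` with the same value, all of them form a single domain and the check cannot fail. The constraint then only shows whether the setting reaches the pod spec, which is what this report examines.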
      

      For the in-cluster prometheus-operator, the topologySpreadConstraints settings are propagated to the prometheus-operator deployment and pod:

      $ oc -n openshift-monitoring get deploy prometheus-operator -oyaml | grep topologySpreadConstraints -A7
            topologySpreadConstraints:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/name: prometheus-operator
              maxSkew: 1
              topologyKey: node-role.kubernetes.io/master
              whenUnsatisfiable: DoNotSchedule
            volumes:
      
      $ oc -n openshift-monitoring get pod -l app.kubernetes.io/name=prometheus-operator -o wide
      NAME                                   READY   STATUS    RESTARTS   AGE    IP            NODE                                                 NOMINATED NODE   READINESS GATES
      prometheus-operator-65496d5b78-fb9nq   2/2     Running   0          105s   10.128.0.71   juzhao-0813-szb9h-master-0.c.openshift-qe.internal   <none>           <none>
      
      $ oc -n openshift-monitoring get pod prometheus-operator-65496d5b78-fb9nq -oyaml | grep topologySpreadConstraints -A7
          topologySpreadConstraints:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/name: prometheus-operator
            maxSkew: 1
            topologyKey: node-role.kubernetes.io/master
            whenUnsatisfiable: DoNotSchedule
          volumes: 

      However, the topologySpreadConstraints settings are not propagated to the UWM prometheus-operator deployment or pod:

      $ oc -n openshift-user-workload-monitoring get cm user-workload-monitoring-config -oyaml
      apiVersion: v1
      data:
        config.yaml: |
          prometheusOperator:
            topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: node-role.kubernetes.io/master
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: prometheus-operator
      kind: ConfigMap
      metadata:
        creationTimestamp: "2023-08-14T08:10:49Z"
        labels:
          app.kubernetes.io/managed-by: cluster-monitoring-operator
          app.kubernetes.io/part-of: openshift-monitoring
        name: user-workload-monitoring-config
        namespace: openshift-user-workload-monitoring
        resourceVersion: "212490"
        uid: 048f91cb-4da6-4b1b-9e1f-c769096ab88c
      
      $ oc -n openshift-user-workload-monitoring get deploy prometheus-operator -oyaml | grep topologySpreadConstraints -A7
      no result
      
      $ oc -n openshift-user-workload-monitoring get pod -l app.kubernetes.io/name=prometheus-operator
      NAME                                   READY   STATUS    RESTARTS   AGE
      prometheus-operator-77bcdcbd9c-m5x8z   2/2     Running   0          15m
      
      $ oc -n openshift-user-workload-monitoring get pod prometheus-operator-77bcdcbd9c-m5x8z -oyaml | grep topologySpreadConstraints
      no result 
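
The `grep` checks above could also be done structurally. The following is a minimal Python sketch, assuming the Deployment has already been parsed into a dict (for example from `oc ... get deploy prometheus-operator -o json`). The helper names `get_constraints` and `constraints_applied` are hypothetical, and `CONFIGURED` mirrors the constraint from the ConfigMaps in this report.

```python
def get_constraints(deployment):
    """Return the pod template's topologySpreadConstraints, or [] if absent."""
    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    return pod_spec.get("topologySpreadConstraints") or []


def constraints_applied(configured, deployment):
    """True if every configured constraint appears in the Deployment's pod template."""
    actual = get_constraints(deployment)
    return all(constraint in actual for constraint in configured)


# The constraint set in both ConfigMaps in this report.
CONFIGURED = [{
    "maxSkew": 1,
    "topologyKey": "node-role.kubernetes.io/master",
    "whenUnsatisfiable": "DoNotSchedule",
    "labelSelector": {
        "matchLabels": {"app.kubernetes.io/name": "prometheus-operator"},
    },
}]
```

Parsing the JSON output of each `oc get deploy` command with `json.loads` and passing the result to `constraints_applied(CONFIGURED, ...)` should distinguish the two cases reported here: the constraint is present for `openshift-monitoring` and missing for `openshift-user-workload-monitoring`.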

      Version-Release number of selected component (if applicable):

      4.14.0-0.nightly-2023-08-11-055332

      How reproducible:

      always

      Steps to Reproduce:

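      1. Enable UWM and add topologySpreadConstraints under prometheusOperator in the cluster-monitoring-config ConfigMap (see the description).
      2. Add the same topologySpreadConstraints under prometheusOperator in the user-workload-monitoring-config ConfigMap.
      3. Check the prometheus-operator deployment and pod in openshift-monitoring and in openshift-user-workload-monitoring for topologySpreadConstraints.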
      

      Actual results:

      The topologySpreadConstraints settings are not applied to the UWM prometheus-operator deployment or pod.

      Expected results:

      The topologySpreadConstraints settings are applied to the UWM prometheus-operator deployment and pod.

            mariofer@redhat.com Mario Fernandez Herrero
            juzhao@redhat.com Junqi Zhao
            Junqi Zhao
            Brian Burt