OpenShift Bugs / OCPBUGS-13839

The Ansible Automation Platform Operator automation job pods are not evenly distributed over the worker nodes


    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Critical
    • Affects Version/s: 4.10.z
    • Component/s: kube-scheduler

      Description of problem:

      The customer (CU) has the AAP Operator installed in an OCP 4.10.51 cluster.
      When running automation jobs, the job pods are scheduled onto only 3 nodes and are not distributed evenly across the other worker nodes, even though those nodes have low utilization.

      Version-Release number of selected component (if applicable):

      OCP version : 4.10.51

      How reproducible:

      Step 1: Installed the AAP Operator v2.2 in an OCP cluster running version 4.10.51.
      Step 2: Ran the sample job multiple times and observed that the automation job pods were scheduled onto only one worker node.
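      A quick way to watch where the job pods land as the jobs run (standard oc usage, not taken from the case data; adjust the namespace if the operator is installed elsewhere):
      
      $ oc -n ansible-automation-platform get pods -o wide -w | grep automation-job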

      Additional info about the cluster:

       Cluster-specific details:
      
      - Using the default LowNodeUtilization scheduler profile:
      spec:
        mastersSchedulable: false
        policy:
          name: ""
      status: {}
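      For reference, the profile is controlled through the cluster-scoped Scheduler resource shown above; when spec.profile is omitted, the scheduler runs the LowNodeUtilization profile. A minimal sketch of setting it explicitly (illustrative only; this was not applied on the customer cluster):
      
      apiVersion: config.openshift.io/v1
      kind: Scheduler
      metadata:
        name: cluster
      spec:
        mastersSchedulable: false
        # LowNodeUtilization is the default; HighNodeUtilization and NoScoring are the other supported values
        profile: LowNodeUtilization
      
      The same change can also be made in place, for example:
      
      $ oc patch scheduler cluster --type=merge -p '{"spec":{"profile":"LowNodeUtilization"}}'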
      
      $ oc get nodes
      NAME                                                   STATUS   ROLES    AGE    VERSION
      master-0.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   Ready    master   6d7h   v1.23.12+8a6bfe4
      master-1.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   Ready    master   6d7h   v1.23.12+8a6bfe4
      master-2.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   Ready    master   6d7h   v1.23.12+8a6bfe4
      worker-0.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   Ready    worker   6d6h   v1.23.12+8a6bfe4
      worker-1.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   Ready    worker   6d6h   v1.23.12+8a6bfe4
      worker-2.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   Ready    worker   6d6h   v1.23.12+8a6bfe4
      
      
      3 worker machines
      
      Current node utilization (worker nodes):
      NAME                                                   CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
      worker-0.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   857m         24%    4820Mi          70%
      worker-1.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   500m         14%    5793Mi          84%
      worker-2.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   303m         8%     3209Mi          46%
      
      Additional test done on the cluster with a sample deployment:
      
      1. Created a new project:
      
      $ oc new-project dep-test
      Now using project "dep-test" on server "https://api.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com:6443".
      
      You can add applications to this project with the 'new-app' command. For example, try:
      
          oc new-app rails-postgresql-example
      
      to build a new example application in Ruby. Or use kubectl to deploy a simple Kubernetes application:
      
          kubectl create deployment hello-node --image=k8s.gcr.io/e2e-test-images/agnhost:2.33 -- /agnhost serve-hostname
      
      
      2. Created a sample deploy:
      
      [quicklab@upi-0 deploy]$ oc new-app httpd
      --> Found image f339827 (4 weeks old) in image stream "openshift/httpd" under tag "2.4-el8" for "httpd"
      
          Apache httpd 2.4 
          ---------------- 
          Apache httpd 2.4 available as container, is a powerful, efficient, and extensible web server. Apache supports a variety of features, many implemented as compiled modules which extend the core functionality. These can range from server-side programming language support to authentication schemes. Virtual hosting allows one Apache installation to serve many different Web sites.
      
          Tags: builder, httpd, httpd-24
      
      
      --> Creating resources ...
          deployment.apps "httpd" created
          service "httpd" created
      --> Success
          Application is not exposed. You can expose services to the outside world by executing one or more of the commands below:
           'oc expose service/httpd' 
          Run 'oc status' to view your app.
      
      
      3. Get the pods and the nodes they were scheduled onto:
      $ oc get pods -o wide
      NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE                                                   NOMINATED NODE   READINESS GATES
      httpd-795f6dddf9-c8vrj   1/1     Running   0          10s   10.131.0.35   worker-2.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   <none>           <none>
      
      
      4. Now scale the deployment to 3 replicas, since there are three worker nodes:
      
      $ oc scale deploy httpd --replicas=3
      deployment.apps/httpd scaled
      [quicklab@upi-0 deploy]$ oc get pods -o wide
      NAME                     READY   STATUS              RESTARTS   AGE   IP            NODE                                                   NOMINATED NODE   READINESS GATES
      httpd-795f6dddf9-2ldbm   0/1     ContainerCreating   0          3s    <none>        worker-1.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   <none>           <none>
      httpd-795f6dddf9-c44tw   0/1     ContainerCreating   0          3s    <none>        worker-0.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   <none>           <none>
      httpd-795f6dddf9-c8vrj   1/1     Running             0          38s   10.131.0.35   worker-2.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   <none>           <none>
      
      [quicklab@upi-0 deploy]$ oc get pods -o wide
      NAME                     READY   STATUS    RESTARTS   AGE     IP             NODE                                                   NOMINATED NODE   READINESS GATES
      httpd-795f6dddf9-2ldbm   1/1     Running   0          119s    10.129.2.54    worker-1.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   <none>           <none>
      httpd-795f6dddf9-c44tw   1/1     Running   0          119s    10.128.2.186   worker-0.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   <none>           <none>
      httpd-795f6dddf9-c8vrj   1/1     Running   0          2m34s   10.131.0.35    worker-2.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   <none>           <none>
      
      
      Thus, with the default profile, the deployment's pods are scheduled evenly across the three worker nodes.
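      A likely explanation for the difference between the two tests (an interpretation, not something stated in the case data): the scheduler's default spreading only derives a label selector from an owning Service/ReplicationController/ReplicaSet/StatefulSet, so the httpd Deployment replicas get spread across nodes, while the automation-job pods are created by the AAP container group as standalone pods with no such owner and are each scored independently, which lets the LowNodeUtilization scoring keep selecting the same node. The ownership difference can be checked with standard oc commands (pod names taken from the outputs in this report):
      
      # The Deployment-managed pod should report a ReplicaSet owner
      $ oc -n dep-test get pod httpd-795f6dddf9-c8vrj \
          -o jsonpath='{.metadata.ownerReferences[*].kind}{"\n"}'
      
      # The automation job pod is expected to report no controller owner
      $ oc -n ansible-automation-platform get pod automation-job-12-qgpjg \
          -o jsonpath='{.metadata.ownerReferences[*].kind}{"\n"}'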

      Actual results:

      - The automation-job pods are scheduled onto only one node.
      - From the scheduler logs, all the worker nodes are evaluated during scheduling, but the pods keep being placed on the same node (see the log-collection sketch after the pod listing below).
      
      automation-job-12-qgpjg                                           1/1     Running             0             2s    10.129.2.49    worker-1.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   <none>           <none>
      automation-job-16-ftmhz                                           1/1     Running             0             3s    10.129.2.52    worker-1.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   <none>           <none>
      automation-job-17-82fdj                                           1/1     Running             0             4s    10.129.2.51    worker-1.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   <none>           <none>
      automation-job-13-gzclq                                           1/1     Running             0             3s    10.129.2.53    worker-1.sharedocp4upi410ovn.lab.psi.pnq2.redhat.com   <none>           <none>
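      If more scheduler scoring detail is needed, the kube-scheduler log verbosity can be raised through its operator and the logs collected afterwards. A sketch using standard commands (verify the pod label and container name on the cluster, and reset logLevel to Normal when finished):
      
      $ oc patch kubescheduler cluster --type=merge -p '{"spec":{"logLevel":"Debug"}}'
      $ oc -n openshift-kube-scheduler logs -l app=openshift-kube-scheduler -c kube-scheduler | grep automation-job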

      Expected results:

      The AAP automation job pods should also be distributed evenly across the worker nodes.

      Additional info:

      - The default instance group was used in AAP; the resulting job pod spec is similar to the following:
      
      apiVersion: v1
      kind: Pod
      metadata:
        namespace: ansible-automation-platform
      spec:
        serviceAccountName: default
        automountServiceAccountToken: false
        containers:
          - image: >-
              registry.redhat.io/ansible-automation-platform-22/ee-supported-rhel8@sha256:a77ac9d7fd9f73a07aa5f771d546bd50281495f9f39d5a34c4ecf2888a1a70c0
            name: worker
            args:
              - ansible-runner
              - worker
              - '--private-data-dir=/runner'
            resources:
              requests:
                cpu: 250m
                memory: 100Mi
      
      - When pod topology spread constraints are added, the pods are scheduled almost evenly across the nodes (a sketch of such an override is included at the end of this section).
       
      Doc reference for Ansible: https://access.redhat.com/documentation/en-us/red_hat_ansible_automation_platform/2.3/html/red_hat_ansible_automation_platform_performance_considerations_for_operator_based_installations/assembly-specify-dedicted-nodes#doc-wrapper
      
      Doc reference for OCP: https://docs.openshift.com/container-platform/4.11/nodes/scheduling/nodes-scheduler-pod-topology-spread-constraints.html#nodes-scheduler-pod-topology-spread-constraints-about_nodes-scheduler-pod-topology-spread-constraints
      
       - When this was suggested to the customer, they were not satisfied with the solution, as they consider it more of a workaround.
       - Since the behavior is reproducible, please check and confirm whether adding the topology spread constraints is the recommended solution here.
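      A minimal sketch of the kind of container group pod spec override that was suggested, building on the default spec above. The app: automation-job label is hypothetical and is only added so the topology spread constraint has a selector to match; any common label shared by the job pods works:
      
      apiVersion: v1
      kind: Pod
      metadata:
        namespace: ansible-automation-platform
        labels:
          app: automation-job            # hypothetical label shared by all automation job pods
      spec:
        serviceAccountName: default
        automountServiceAccountToken: false
        topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: kubernetes.io/hostname
            whenUnsatisfiable: ScheduleAnyway   # use DoNotSchedule to make the spread a hard requirement
            labelSelector:
              matchLabels:
                app: automation-job
        containers:
          - image: >-
              registry.redhat.io/ansible-automation-platform-22/ee-supported-rhel8@sha256:a77ac9d7fd9f73a07aa5f771d546bd50281495f9f39d5a34c4ecf2888a1a70c0
            name: worker
            args:
              - ansible-runner
              - worker
              - '--private-data-dir=/runner'
            resources:
              requests:
                cpu: 250m
                memory: 100Mi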

       

      Business Impact:

      The few nodes where the jobs are repeatedly scheduled sometimes become over-utilized, which causes the automation job pods to fail and blocks the customer's workflow.

              Assignee: Jan Chaloupka (jchaloup@redhat.com)
              Reporter: Vishnudutt Pavithran (rhn-support-vpavithr) (Inactive)
              QA Contact: Sunil Choudhary