Ansible Automation Platform RFEs / AAPRFE-430

RFE - Make the mesh aware of which execution node can reach which managed nodes


      1. What is the nature and description of the request?

      User story:
      As a customer, I'd like to not worry about where my automation is executed but rather that it is executed.

      Complex environments often contain several segregated networks. These networks can typically only be reached through a jump/bastion host, which is then needed to automate the managed nodes that reside in them.
      Right now, the Automation Mesh is not aware of which managed nodes are reachable through which execution nodes. This requires users of AAP to split inventories and create different instance groups (e.g. one per segregated network segment).
      Each job template then has to be assigned both the instance groups and only those managed nodes that are reachable by all instances of those instance groups.
      This introduces overhead and complexity, which increases operational cost and maintenance effort.

      Ideally, the Automation Mesh would be "smarter": it would be aware of which managed nodes are reachable through which execution nodes and would decide by itself which execution node to use for a particular managed node.
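
To illustrate the requested behavior, here is a minimal sketch of a reachability lookup. It is not how the Automation Mesh or receptor works today; the node names, the networks and the `pick_execution_node` helper are all hypothetical, and the sketch assumes reachability can be described by IP networks.

```python
import ipaddress

# Hypothetical reachability table: execution node name -> networks it can reach.
REACHABILITY = {
    "exec-dmz-1": ["10.10.0.0/16"],
    "exec-lab-1": ["10.20.0.0/16", "192.168.50.0/24"],
}


def pick_execution_node(host_ip, reachability=REACHABILITY):
    """Return the first execution node whose networks contain host_ip, else None."""
    address = ipaddress.ip_address(host_ip)
    for node, networks in reachability.items():
        if any(address in ipaddress.ip_network(net) for net in networks):
            return node
    return None  # no execution node is known to reach this managed node
```

With such a table, a single job could route each managed node to a suitable execution node instead of relying on hand-maintained inventories and instance groups.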

      2. Why does the customer need this? (List the business requirements here)
      With AAP 1.2, bastion hosts were used to access the segregated network segments. With SSH as the communication protocol, the customer set a host_var for every managed node that overrides ssh_extra_args and specifies the bastion host to use.
      With the introduction of AAP 2.x, bastion hosts were replaced with hop nodes, which communicate in the Automation Mesh using receptor. With receptor, the same functionality cannot be achieved without introducing more complexity to the environment.
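
For context, the per-host override used in the SSH-based setup might have looked roughly like the output of the sketch below. The exact SSH options the customer used are not stated in this request; `-o ProxyJump=...` is only one common way to route through a bastion, and the host names and the `bastion_host_vars` helper are hypothetical.

```python
def bastion_host_vars(managed_to_bastion):
    """Build per-host variables that route SSH through the given bastion host."""
    return {
        # ansible_ssh_extra_args is the inventory variable behind ssh_extra_args;
        # using ProxyJump here is an assumption, not taken from the request.
        host: {"ansible_ssh_extra_args": f"-o ProxyJump={bastion}"}
        for host, bastion in managed_to_bastion.items()
    }


host_vars = bastion_host_vars({"db1.lab.example.com": "bastion.lab.example.com"})
```

The key point is that the routing information lived next to each host in the inventory, so a single job could span all network segments.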

      Currently, three possible workarounds have been proposed, all of which have downsides:
      1. Split the inventory into smaller chunks, create a job template per segregated network and assign the correct execution node via instance groups. Depending on the number of segregated networks, this effectively increases the number of inventories, instance groups and job templates by a factor of 10-15 (with 10-15 segregated networks). Of course, the different job templates could be run concurrently with a workflow template.
      2. Modify each job template to be a workflow template that includes a job that splits the inventory on the fly, assigns the correct instance group and executes the original job template as many times as required.
      3. Based on 2., the same can be achieved with a new job template (instead of a workflow) that calls the original job template and spawns the job as many times as required.
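
The inventory split that all three workarounds depend on can be sketched as follows. This is only an illustration under assumed data: the host names, segment names, instance group names and the `split_inventory` helper are hypothetical and not part of AAP.

```python
from collections import defaultdict


def split_inventory(host_segments, segment_instance_groups):
    """Group hosts by the instance group that serves their network segment."""
    chunks = defaultdict(list)
    for host, segment in host_segments.items():
        chunks[segment_instance_groups[segment]].append(host)
    return dict(chunks)


# Hypothetical inventory: managed node -> network segment it lives in.
hosts = {"web1": "dmz", "web2": "dmz", "db1": "lab", "db2": "lab", "app1": "prod"}
# Hypothetical mapping: network segment -> instance group that can reach it.
groups = {"dmz": "ig-dmz", "lab": "ig-lab", "prod": "ig-prod"}

chunks = split_inventory(hosts, groups)
# One job has to be launched (and monitored) per chunk.
jobs_to_launch = len(chunks)
```

Even in this small example, one logical run turns into three jobs, and the count grows with every additional segregated network.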

      All options have the downside that it becomes hard for the user running the automation to keep track of it, as multiple jobs have to be monitored. Further, splitting the jobs into smaller chunks effectively defeats features such as the execution strategy (https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_strategies.html#selecting-a-strategy) or the serial parameter (https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_strategies.html#setting-the-batch-size-with-serial), because execution happens in separate smaller jobs that do not take the bigger picture into account. For instance, consider a serial value that first runs on one host and only continues with the remaining hosts after that host has completed successfully: if the automation fails, one host per spawned job fails instead of only one host overall.
      Additionally, as noted above, all options increase operational cost and maintenance effort.
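
The serial example can be made concrete with a small simulation. It is a simplified model, not Ansible itself: it assumes a `serial: [1, "100%"]` style rollout in which the play stops when the first (canary) host fails, and the host names and the `run_with_canary` helper are hypothetical.

```python
def run_with_canary(hosts, fails_on):
    """Simulate a canary rollout: one host first, the rest only if it succeeds.

    Returns the hosts on which the automation was attempted and failed.
    """
    if not hosts:
        return []
    canary, rest = hosts[0], hosts[1:]
    if canary in fails_on:
        return [canary]  # the play stops, the remaining hosts are never touched
    return [host for host in rest if host in fails_on]


all_hosts = ["a1", "a2", "b1", "b2", "c1", "c2"]
broken_everywhere = set(all_hosts)  # e.g. a faulty change that breaks every host

# One job over the whole inventory: only the single canary host is affected.
single_job = run_with_canary(all_hosts, broken_everywhere)

# Three spawned jobs, one per network segment: each job has its own canary.
split_jobs = [
    run_with_canary(chunk, broken_everywhere)
    for chunk in (["a1", "a2"], ["b1", "b2"], ["c1", "c2"])
]
failed_when_split = [host for failed in split_jobs for host in failed]
```

In this model the single job fails on `a1` only, while the split run fails on `a1`, `b1` and `c1`, which is the "one host per spawned job" effect described above.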

      3. How would you like to achieve this? (List the functional requirements here)
      The Automation Mesh should be "smart" and be aware of which managed nodes are reachable through which execution nodes.
      Ideally, all instance groups can be added to one job template, and Automation Mesh would pick the appropriate one to execute the automation.
      Alternatively, a new variable could be introduced that replicates the behavior of the original ssh_extra_args variable, such as receptor_extra_args, with which one could specify the execution node to use for each individual host (host_vars) or group (group_vars).
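
A rough sketch of how such a variable might be resolved is shown below. `receptor_extra_args` is only the name proposed in this request and does not exist today; that its value names an execution node, that host-level values override group-level ones, and the `resolve_execution_node` helper itself are all assumptions made for illustration.

```python
def resolve_execution_node(host, host_vars, group_vars, host_groups,
                           key="receptor_extra_args"):
    """Return the execution node for a host: host-level value first, then groups."""
    if key in host_vars.get(host, {}):
        return host_vars[host][key]
    for group in host_groups.get(host, []):
        if key in group_vars.get(group, {}):
            return group_vars[group][key]
    return None  # nothing set: fall back to the default instance group selection


# Hypothetical inventory data.
host_groups = {"db1": ["lab"], "db2": ["lab"], "web1": ["dmz"]}
group_vars = {"lab": {"receptor_extra_args": "exec-lab-1"}}
host_vars = {"db2": {"receptor_extra_args": "exec-lab-2"}}

node_db1 = resolve_execution_node("db1", host_vars, group_vars, host_groups)
node_db2 = resolve_execution_node("db2", host_vars, group_vars, host_groups)
node_web1 = resolve_execution_node("web1", host_vars, group_vars, host_groups)
```

Here `db1` inherits the group value, `db2` uses its own host-level override, and `web1` has no value set, which mirrors how ssh_extra_args could be set per host or per group.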

      4. List any affected known dependencies: Doc, UI etc.
      Controller, Mesh, Docs, UI

      5. GitHub link, if any
      N/A
       

              bcoursen@redhat.com Brian Coursen
              rhn-support-sscheib Steffen Scheib
              Votes: 25
              Watchers: 37