Automation Hub / AAH-52

pylint in ansible-test causes importer failures in OCP

    • AAH 4.3.0 Sprint 3, AAH Sprint 4, AAH Sprint 5

      In OCP, `galaxy-importer` runs `ansible-test` inside a container with restricted resources. `ansible-test` invokes `pylint` with `jobs` set to zero https://github.com/ansible/ansible/blob/devel/test/lib/ansible_test/_internal/sanity/pylint.py#L234, which makes `pylint` launch as many processes as it can see cores (8 in OCP testing). Because those processes share the container's limits, each one effectively gets 1/8th of the CPU and memory. The result is slow `pylint` execution; local testing showed processes waiting on IO.
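The mismatch described above, between the cores a process can see and the CPU the container is actually allowed, can be illustrated with a minimal sketch. It assumes a cgroup v2 host where the limit is exposed in `/sys/fs/cgroup/cpu.max`; the helper names `cpus_from_cgroup_v2` and `effective_jobs` are hypothetical and are not part of `ansible-test` or `pylint`.

```python
import math
import os
from typing import Optional


def cpus_from_cgroup_v2(cpu_max_text: str) -> Optional[float]:
    """Parse the contents of a cgroup v2 `cpu.max` file.

    The file holds "<quota> <period>" in microseconds, or "max <period>"
    when no limit is set. Returns the limit in cores, or None if unlimited.
    """
    quota, period = cpu_max_text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def effective_jobs(host_cores: int, cpu_limit: Optional[float]) -> int:
    """Worker count that fits inside the container limit, never below 1."""
    if cpu_limit is None:
        return host_cores
    return max(1, min(host_cores, math.floor(cpu_limit)))


if __name__ == "__main__":
    # os.cpu_count() reports the cores visible on the node,
    # not the share the container is limited to.
    host_cores = os.cpu_count() or 1
    try:
        # Assumption: cgroup v2 layout; on other hosts this file may not exist.
        with open("/sys/fs/cgroup/cpu.max") as f:
            limit = cpus_from_cgroup_v2(f.read())
    except (OSError, ValueError):
        limit = None
    print(host_cores, limit, effective_jobs(host_cores, limit))
```

With 8 visible cores and a limit of one core, `effective_jobs(8, 1.0)` gives a single worker instead of the eight that a core-count-based default would start.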

      Previous issues with our container and `pylint`, [too-small resources](https://github.com/ansible/galaxy_ng/issues/64) causing a broken pipe and an [intermittent OOM](https://github.com/ansible/galaxy_ng/issues/230) error, were resolved by raising the resource limits.

      For the current issue, where slow execution eventually hits the container timeout, possible solutions are:

      1. Restrict the number of processes `pylint` can spawn by setting `jobs` to a positive number.
      2. Increase the CPU resource on the `ansible-test` job container (e.g. to `1000m`). This cannot be set too high, since a high CPU limit can cause the job to take longer to schedule and potentially not start.
      3. Increase the timeout. Note that even with `IMPORTER_JOB_TIMEOUT` https://github.com/ansible/galaxy-importer/blob/master/galaxy_importer/ansible_test/runners/openshift_job.py#L239 set to 15 min, the timeout still occurs at ~10 min, so another setting may be needed to raise the default 10 min timeout.
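Options 1 and 2 interact: a sensible positive `jobs` value depends on the CPU limit given to the job container. The sketch below shows one way to derive it, assuming the limit is available as a Kubernetes-style CPU quantity string such as `1000m`. The helpers `parse_cpu_quantity`, `pylint_jobs`, and `build_pylint_command` are hypothetical; `ansible-test` builds its own `pylint` command line, and the only real interface used here is pylint's `--jobs` option.

```python
import math
from typing import List


def parse_cpu_quantity(quantity: str) -> float:
    """Convert a Kubernetes-style CPU quantity ("500m", "2") to cores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        # Millicores: 1000m equals one full core.
        return int(quantity[:-1]) / 1000
    return float(quantity)


def pylint_jobs(cpu_limit: str) -> int:
    """Pick a positive job count: one per whole core, at least 1."""
    return max(1, math.floor(parse_cpu_quantity(cpu_limit)))


def build_pylint_command(paths: List[str], cpu_limit: str) -> List[str]:
    """Build a pylint invocation with an explicit, positive --jobs value."""
    return ["pylint", "--jobs", str(pylint_jobs(cpu_limit)), *paths]
```

For a container limited to `1000m`, `build_pylint_command(["plugins/"], "1000m")` yields a command with `--jobs 1`, so `pylint` runs a single process rather than one per visible core.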

              cspealma@redhat.com Clara Spealman (Inactive)
              chousekn Chris Houseknecht (Inactive)