OpenShift Bugs / OCPBUGS-77045

Reduce restore status polling interval from 30s to improve IBU upgrade time

    • Bug
    • Resolution: Not a Bug
    • Normal
    • 4.20.z
    • LCA operator

      Description of problem:

      During the post-pivot restore phase of an image-based upgrade (IBU), the Lifecycle Agent (LCA) requeues every 30 seconds (requeueWithShortInterval) to check whether a restore has completed. In practice, most individual restores complete in under 5 seconds (confirmed via Velero logs), so the controller spends most of the restore phase idle rather than processing.
      
      * requeueWithShortInterval (30s): https://github.com/openshift-kni/lifecycle-agent/blob/release-4.20/controllers/ibu_controller.go#L103-L105
      * Called from HandleRestore: https://github.com/openshift-kni/lifecycle-agent/blob/release-4.20/controllers/upgrade_handlers.go#L550-L552
          

      Version-Release number of selected component (if applicable):

      lifecycle-agent v4.20.1 (the value is hardcoded and has not changed across versions)
          

      How reproducible:

      Always. The 30s interval is hardcoded and applies to every IBU upgrade with OADP restores.
      
          

      Steps to Reproduce:

          1. Configure an IBU upgrade with multiple restore waves
          2. Trigger the upgrade
          3. After the reboot, observe the LCA logs during the restore phase
          4. Note the 30s gap between consecutive restore completions
          

      Actual results:

      The LCA waits 30 seconds between restore status checks, even when the restore completes in under 5 seconds.
          

      Expected results:

      The polling interval should be reduced to 5-10 seconds to minimize idle waiting and improve the overall IBU upgrade time.
          

      Additional info:

      Note that requeueWithShortInterval is also used in other parts of the LCA code, so changing its value would affect those callers as well.
          

              jche@redhat.com Jun Chen
              dmunneor1@redhat.com Daniel Munne Ortega
              Yang Liu