OpenShift Hive / HIVE-2419

Use spot instances for AWS e2e CI jobs


    • Type: Story
    • Resolution: Unresolved

      Spot instances are way cheaper. Like 70-90% cheaper. Downside: they can theoretically get yanked at any time. In practice, though, this tends to happen <1%/24h (according to James Russell). So for CI e2e jobs, which last O(2h), the likelihood is very small... and as long as we can distinguish flakes that happen for this reason, even manual /retest-ing makes this worth the savings if it's anywhere close to the above estimates.
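      To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It is not a measurement. It assumes the <1%/24h figure applies per instance, that interruptions are independent and arrive at a constant rate, that losing any one instance fails the whole job, and that a failed job is rerun at full cost until it passes. The instance count of 6 and all function names are hypothetical, picked only for illustration.

```python
# Rough estimate only: every input below is an assumption, not measured data.

def interruption_probability(daily_rate: float, job_hours: float) -> float:
    """Chance one instance is reclaimed during the job.

    Assumes a constant interruption rate, scaled so that the probability
    over a full 24h window equals daily_rate.
    """
    return 1.0 - (1.0 - daily_rate) ** (job_hours / 24.0)


def job_failure_probability(daily_rate: float, job_hours: float, instances: int) -> float:
    """Chance that at least one of `instances` independent instances is reclaimed."""
    p_one = interruption_probability(daily_rate, job_hours)
    return 1.0 - (1.0 - p_one) ** instances


def expected_relative_cost(discount: float, p_fail: float) -> float:
    """Expected spot cost relative to on-demand (on-demand = 1.0).

    Assumes a failed attempt costs as much as a full run and the job is
    retried until it passes, so the expected number of attempts is
    1 / (1 - p_fail).
    """
    return (1.0 - discount) / (1.0 - p_fail)


if __name__ == "__main__":
    # Hypothetical job: 6 instances, 2 hours, 1%/24h interruption rate,
    # and the low end of the quoted discount range (70%).
    p_fail = job_failure_probability(daily_rate=0.01, job_hours=2.0, instances=6)
    print(f"chance a job loses an instance: {p_fail:.4%}")
    print(f"expected cost vs on-demand:     {expected_relative_cost(0.70, p_fail):.3f}")
```

      Under those assumptions a 2h job on 6 instances loses an instance roughly 0.5% of the time, and the expected cost stays at about 0.30 of on-demand even at the low end of the discount range, so the retest overhead barely dents the savings.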

      Links from James:

      Hive already has some accommodation for spot instances in MachinePools and the hibernation controller, so it should be possible to request spot instances through... the install-config?

      Problem for ClusterPools!

      I believe we terminate (delete) spot instances on hibernation, and let MAPI recreate them when we resume the cluster. But when we create the default MachinePool to go with a ClusterPool cluster, we're not copying out (the relevant portions of) the install-config. So upon resume, I believe we'll end up creating the wrong instance types. See HIVE-2256 for more background on this. But I think what it means is that spot instances + clusterpools is dead in the water until we can address that card. Which by extension means:

      • We can't reasonably cut our OSCI clusterpools over to spot instances until (at least this part of) HIVE-2256 is addressed.
      • e2e-pool. If we use spot instances for the pool we create inside the test... it might run. It might even "succeed". But at some point in there it'll end up recreating on-demand instances. This would at best reduce our cost savings.

              efried.openshift Eric Fried