Type: Task
Resolution: Unresolved
Priority: Minor
Fix Version/s: None
Affects Version/s: None
Component/s: CI
Labels:
None

Activity Type:
Quality / Stability / Reliability
Epic Link:
ROX-25640
Blocked:
False
Blocked Reason:

None
Ready:
False
Color Status:
Not Selected
Intelligence Requested:
Market:

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Overview:

Currently the timeout for image prefetching is purely heuristic and identical for all images. It follows a very simple exponential pattern: the first download attempt has a 30-second timeout, the next one 1 minute, then 2 minutes, and so on.
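For illustration, the schedule described above can be sketched as follows. This is a minimal sketch, not the prefetcher's actual code: the function name `backoff_timeouts` and its parameters are hypothetical, and only the 30-second start and the doubling come from the description.

```python
def backoff_timeouts(attempts, initial=30.0, factor=2.0):
    """Return the timeout in seconds for each download attempt.

    Hypothetical helper: the first attempt gets `initial` seconds and
    every following attempt gets `factor` times the previous timeout.
    """
    return [initial * factor**i for i in range(attempts)]


if __name__ == "__main__":
    # First four attempts: 30s, 1 minute, 2 minutes, 4 minutes.
    print(backoff_timeouts(4))  # [30.0, 60.0, 120.0, 240.0]
```

Note that the schedule depends only on the attempt number, never on the image being fetched.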

The problem is that some images are tiny, so we waste time waiting too long on stalled downloads, while others are large and cannot possibly be fetched in under 30 seconds, so the first one or two attempts are wasted.

One way to make prefetching more robust would be to add peer-to-peer communication between prefetcher pods, so that they can exchange information on how long a successful prefetch of a given image took.

This way, each prefetcher pod would be able to fine-tune the timeout for each image separately, and thus make more download attempts rather than sit idle while a stalled download waits out an unrealistically high deadline calculated by exponential backoff.
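A rough sketch of the idea, under stated assumptions: the helper names, the 1.5x safety margin, the 600-second budget, and the sample durations are all hypothetical and are not taken from the ticket or the prefetcher code. The sketch derives a per-image timeout from durations reported by peers, then compares how many attempts that each run to their full timeout fit into a fixed time budget.

```python
from itertools import repeat


def tuned_timeout(observed_seconds, safety_factor=1.5, fallback=30.0):
    """Per-image timeout from peer-reported successful download durations.

    Assumption: take the slowest observed success plus a safety margin;
    with no reports yet, fall back to a default timeout.
    """
    if not observed_seconds:
        return fallback
    return max(observed_seconds) * safety_factor


def attempts_within(budget, timeouts):
    """Count attempts that fit in `budget` seconds if every one times out."""
    used = 0.0
    count = 0
    for timeout in timeouts:
        if used + timeout > budget:
            break
        used += timeout
        count += 1
    return count


if __name__ == "__main__":
    budget = 600.0  # hypothetical total time available for one image
    exponential = (30.0 * 2**i for i in range(64))  # 30s, 60s, 120s, ...
    per_image = tuned_timeout([40.0, 50.0])  # peers took 40s and 50s
    print(attempts_within(budget, exponential))  # 4
    print(attempts_within(budget, repeat(per_image)))  # 8
```

With these made-up numbers the tuned timeout is 75 seconds, so eight stalled attempts fit in the budget where exponential backoff allows four. Real gains would depend on the actual image sizes and on the time budget of the job.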

In the failed CI job for the linked ticket, this could have even doubled the number of attempts, increasing the probability of success.

Acceptance Criteria:

A list of specific needs or objectives that this task must deliver in order to be considered complete. Complete during Refinement status.

Assignee: Unassigned

Reporter: Marcin Owsiany

Team: ACS Install

Votes: 0

Watchers: 2

Created: 2025/11/17 8:11 AM

Updated: 2025/11/26 9:39 AM
