Type: Bug
Resolution: Unresolved
Description
CMMO 4.2.0 is vulnerable to the source creation race condition described in FLPATH-3064. When a new cluster registers, koku's source creation POST takes longer than 30 seconds, so the CMMO HTTP client (30-second timeout) times out. CMMO then proceeds to upload in the same reconciliation cycle without confirming that source creation succeeded. The koku listener receives the upload before the provider is committed, treats it as an "unexpected OCP report", and silently discards the payload. The data in that first upload is permanently lost, and the operator does not upload again until the next upload cycle (default 6 hours).
CMMO 4.3.0 is protected from this race condition because it introduced a code change requiring a source to be defined before accepting optimization reports (the fix for FLPATH-2934). CMMO 4.2.0 does not have this client-side protection.
Since modifying CMMO 4.2.0 is not an option (see FLPATH-3064: "We don't want to change CMMO code"), a server-side fix in koku is needed to protect CMMO 4.2.0 from this race condition.
Problem
- CMMO 4.2.0 uploads data immediately after attempting source creation, even if the source creation timed out
- Koku listener discards the upload because the provider is not yet committed
- First upload data is permanently lost; next retry is 6 hours later
- CMMO 4.3.0 is not affected due to its client-side source validation guard
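The sequence above can be illustrated with a small, deterministic model. This is only a sketch under stated assumptions: `Listener` and `first_reconcile` are hypothetical names, not koku or CMMO code, and the provider commit is assumed to land at the moment the source creation POST finishes on the server.

```python
from dataclasses import dataclass, field

CLIENT_TIMEOUT_S = 30.0  # CMMO HTTP client timeout, as stated in the description


@dataclass
class Listener:
    """Hypothetical stand-in for the koku listener (not koku's real code)."""
    committed_at_s: float  # assumed time at which the provider commit lands
    processed: list = field(default_factory=list)
    discarded: list = field(default_factory=list)

    def receive(self, payload: str, now_s: float) -> None:
        if now_s >= self.committed_at_s:
            self.processed.append(payload)
        else:
            # Provider not committed yet: the upload is dropped.
            self.discarded.append(payload)


def first_reconcile(listener: Listener, source_post_duration_s: float) -> float:
    """Model CMMO 4.2.0: attempt source creation, then upload regardless."""
    # The client stops waiting at its timeout even if the server is still working.
    upload_time_s = min(source_post_duration_s, CLIENT_TIMEOUT_S)
    listener.receive("first-upload", upload_time_s)
    return upload_time_s


slow = Listener(committed_at_s=45.0)
first_reconcile(slow, source_post_duration_s=45.0)
print(slow.discarded)  # the first upload is lost when the POST outlasts the timeout
```

In this model the upload is only safe when source creation finishes within the client timeout; any slower POST reproduces the discard.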
Proposed Fix Direction
Implement a koku-side mechanism analogous to the CMMO 4.3.0 client-side protection. Possible approaches:
- Queue and retry: When the koku listener receives an upload for an unknown cluster, queue the payload and retry processing after source creation completes, rather than discarding it
- Hold and wait: If source creation is in progress for a given cluster ID, have the listener wait for completion before processing the upload
- Graceful rejection: Return an HTTP error status (e.g. 503 Service Unavailable with a Retry-After header) instead of 202, signaling CMMO to retry sooner than the default 6-hour cycle
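As a rough illustration of the queue-and-retry approach, the sketch below parks payloads for unknown clusters and processes them once the provider exists. Everything here is an assumption made for illustration: `PendingUploads`, `provider_exists`, `process`, and `max_age_s` are hypothetical names, not koku APIs, and the queue is in-memory, whereas a real fix would likely need durable storage so parked payloads survive a listener restart.

```python
import time
from collections import defaultdict
from typing import Callable


class PendingUploads:
    """Hypothetical holding area for uploads whose provider is not committed yet."""

    def __init__(
        self,
        provider_exists: Callable[[str], bool],  # assumed lookup by cluster ID
        process: Callable[[str, bytes], None],   # assumed normal processing path
        max_age_s: float = 900.0,                # illustrative expiry, not a koku setting
        clock: Callable[[], float] = time.monotonic,
    ):
        self._provider_exists = provider_exists
        self._process = process
        self._max_age_s = max_age_s
        self._clock = clock
        self._pending = defaultdict(list)  # cluster_id -> [(queued_at, payload)]
        self.expired = []                  # kept visible instead of silently dropped

    def handle(self, cluster_id: str, payload: bytes) -> str:
        """Process now if the provider is known, otherwise park the payload."""
        if self._provider_exists(cluster_id):
            self._process(cluster_id, payload)
            return "processed"
        self._pending[cluster_id].append((self._clock(), payload))
        return "queued"

    def retry(self) -> int:
        """Re-check parked payloads; returns how many were processed."""
        done = 0
        now = self._clock()
        for cluster_id in list(self._pending):
            entries = self._pending.pop(cluster_id)
            if self._provider_exists(cluster_id):
                for _, payload in entries:
                    self._process(cluster_id, payload)
                    done += 1
                continue
            for queued_at, payload in entries:
                if now - queued_at <= self._max_age_s:
                    self._pending[cluster_id].append((queued_at, payload))
                else:
                    self.expired.append((cluster_id, payload))
        return done
```

`retry()` could be called on a timer or when a source is created. The expiry bound keeps uploads for clusters that never register from accumulating indefinitely.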
Related Issues
- FLPATH-3064 – CMMO gets a timeout when it attempts to create a new source (root cause of the race condition)
- FLPATH-2934 – CMMO 4.3.0 CSVs not processed by Insights On-Prem (resolved; contains the client-side fix that protects 4.3.0)
Version Information
- Affected CMMO version: 4.2.0
- Not affected: CMMO 4.3.0 (has client-side source validation guard)
- Koku image: koku:sources (2026-02-01 build, sources integrated into koku)
relates to
- FLPATH-3064 CMMO gets a timeout when it attempts to create a new source (status: Backlog)
- FLPATH-2934 Cost Management Operator 4.3.0 CSVs not processed by Insights On-Prem (status: Closed)