-
Story
-
Resolution: Unresolved
-
Normal
Description:
OLMv0 periodically polls CatalogSources to check for catalog upgrades. Currently,
this polling mechanism always creates a CatalogSource pod, regardless of whether
the catalog image has actually changed.
OLMv0 performs the following actions on every polling interval (10 minutes historically, 15 minutes by default today):
- Pull the CatalogSource image
- Create a CatalogSource pod
- Compare it with the existing catalog pod using the pod hash
- Switch traffic if a real update is detected
When there is no upgrade (i.e., no catalog image change), OLMv0 still performs
the following operations:
- GET ServiceAccount
- GET NetworkPolicy (x2)
- GET Service
- LIST Pods
- CREATE Pod (writes to etcd)
- Wait for the Pod to become Ready
- DELETE Pod (writes to etcd)
- PATCH CatalogSource/status (writes to etcd)
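With 4 default CatalogSources, 3 etcd writes per poll (CREATE Pod, DELETE Pod,
PATCH CatalogSource/status), and a 10-minute polling interval, this behavior
results in approximately: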
- 72 etcd writes per hour
- 1,728 etcd writes per day
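The figures above can be reproduced with a quick calculation. This is only a sketch: it assumes that the three operations marked "(writes to etcd)" are the only writes per poll and that the 72/hour figure corresponds to the historical 10-minute interval. The function and constant names are illustrative and do not come from the OLM code base.

```python
# Writes per poll per CatalogSource, taken from the operations listed above:
# CREATE Pod, DELETE Pod, PATCH CatalogSource/status.
WRITES_PER_POLL = 3


def etcd_writes_per_hour(catalog_sources: int, interval_minutes: int) -> float:
    """Estimate no-op etcd writes per hour caused by catalog polling."""
    polls_per_hour = 60 / interval_minutes
    return catalog_sources * WRITES_PER_POLL * polls_per_hour


def etcd_writes_per_day(catalog_sources: int, interval_minutes: int) -> float:
    """Same estimate, scaled to 24 hours."""
    return etcd_writes_per_hour(catalog_sources, interval_minutes) * 24


if __name__ == "__main__":
    for interval in (10, 15):
        print(
            interval,
            etcd_writes_per_hour(4, interval),
            etcd_writes_per_day(4, interval),
        )
```

Under the same assumptions, a 15-minute interval works out to 48 writes per hour and 1,152 per day, so a longer interval reduces the load but does not remove it.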
As part of OCPBUGS-69441, the default polling interval is being increased to
mitigate the impact of this behavior. However, this is fundamentally a design
issue rather than a tuning problem.
A potential long-term improvement would be to query the container registry
(e.g., via manifest digest) to determine whether the catalog image has changed
before creating a CatalogSource pod, and only proceed with pod creation when a
real update is detected.
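A minimal sketch of what such a pre-check could look like follows. It is not a proposed implementation for OLMv0 (which is written in Go). It assumes a registry that implements the OCI distribution API, allows anonymous access, and returns the `Docker-Content-Digest` header on a HEAD request to the manifest endpoint; real registries usually require a token exchange first. The names `fetch_manifest_digest` and `needs_catalog_pod` are hypothetical. Falling back to pod creation when the registry query fails is also an assumption, chosen so that the existing failure modes are not hidden.

```python
import urllib.request
from typing import Callable, Optional

# Manifest media types to accept, so the registry returns the digest of the
# manifest (or manifest list) the tag currently points at.
MANIFEST_TYPES = ", ".join([
    "application/vnd.oci.image.index.v1+json",
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.docker.distribution.manifest.v2+json",
])


def fetch_manifest_digest(registry: str, repository: str, reference: str) -> str:
    """HEAD the manifest and return its digest (anonymous access assumed)."""
    url = f"https://{registry}/v2/{repository}/manifests/{reference}"
    req = urllib.request.Request(
        url, method="HEAD", headers={"Accept": MANIFEST_TYPES}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        digest = resp.headers.get("Docker-Content-Digest")
    if not digest:
        raise RuntimeError("registry did not return Docker-Content-Digest")
    return digest


def needs_catalog_pod(last_digest: Optional[str], fetch: Callable[[], str]) -> bool:
    """Return True when the pod-based update check should still run.

    last_digest is the digest recorded at the previous successful check,
    or None if no digest has been recorded yet.
    """
    try:
        current = fetch()
    except Exception:
        # Assumption: on any registry error, fall back to today's behavior.
        return True
    return current != last_digest
```

A caller would invoke `needs_catalog_pod(stored_digest, lambda: fetch_manifest_digest(registry, repository, tag))` on each polling interval and skip the CREATE/DELETE/PATCH sequence when it returns `False`. Where the last-seen digest would be stored is left open here.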
Given that OLMv0 is in maintenance mode, this work is explicitly captured as
tech debt and is out of scope for the current bug fix.
For now, a short-term workaround has landed via operator-marketplace through PR https://github.com/operator-framework/operator-marketplace/pull/695
Acceptance Criteria:
- A design is identified that allows OLMv0 to determine whether a catalog image
has changed prior to creating a CatalogSource pod.
- When no catalog image change is detected, OLMv0 avoids pod creation and the
associated etcd writes.
- Any proposed change preserves existing CatalogSource upgrade behavior and
failure modes.