Type: Bug
Resolution: Done
Priority: Medium
Fix Version/s: OSC 1.6.0
Affects Version/s: OSC 1.5.2
Component/s: Operator
Labels:
None

Blocked:
False
Blocked Reason:
None
Ready:
False
Release Note Text:

.`controller-manager` pod fails with `out of memory` errors

Previously, when the {osc-operator} was deployed on a single-node, bare-metal cluster running {openshift} 4.14.12, the `controller-manager` pod failed with `out of memory` errors. In the current release, the issue has been fixed by increasing the pod's resources.

Release Note Type:
Bug Fix
Release Note Status:
Done
Intelligence Requested:
Market:

Cost of Delay:
0
WSJF:
0

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description

I deployed the OpenShift Sandboxed Containers Operator onto a single-node bare-metal cluster running OCP 4.14.12 (x86_64). The controller-manager pod was continually OOMKilled until I manually raised the resource limits in the controller-manager deployment.

Steps to reproduce

1. Deploy the OSC operator on a single-node bare-metal cluster.
2. Watch the controller-manager pod be OOMKilled repeatedly.
3. Manually change the resource limits in the controller-manager deployment to a higher setting.
4. Observe that the controller-manager pod is no longer OOMKilled.
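
To confirm step 2 without relying on the pod logs, the termination reason can be read from the pod status. Kubernetes records it under `status.containerStatuses[].lastState.terminated.reason`. The following is a minimal sketch, assuming the pod JSON was obtained with something like `oc get pod <pod-name> -n <namespace> -o json`. The function name `oom_killed_containers`, the container name `manager`, and the sample status are hypothetical and only for illustration.

```python
import json


def oom_killed_containers(pod: dict) -> list[str]:
    """Return names of containers whose current or previous
    termination reason is OOMKilled."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # Check both the current state and the state before the last restart.
        for key in ("state", "lastState"):
            terminated = (status.get(key) or {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                names.append(status["name"])
                break
    return names


# Hypothetical, trimmed pod status (not taken from the attached logs).
sample_pod = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "manager",
        "restartCount": 5,
        "state": {"waiting": {"reason": "CrashLoopBackOff"}},
        "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
      }
    ]
  }
}
""")

print(oom_killed_containers(sample_pod))  # prints ['manager']
```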

Expected result

The controller-manager pod is not OOMKilled when it runs with the default resource limits that the operator sets on the controller-manager deployment.

Actual result

The controller-manager pod is continually OOMKilled.

Impact

As far as I know, OSC containers cannot be deployed without a functioning controller manager. This error would appear to block the usage of OSC on a single node cluster until the workaround is applied.

Env

OCP 4.14.12

OSC version 1.5.2

Single-node OCP on bare metal with 64 CPU cores and 128 GB of memory

Additional helpful info

The error was resolved when I manually edited the controller manager deployment with the following limits.

```
resources:
  limits:
    cpu: 999m
    memory: 999Mi
  requests:
    cpu: 999m
    memory: 999Mi
```

The numbers above were chosen arbitrarily, just to see whether raising the limits would work. I did not test what the minimum increase would be to avoid the problem. I have also attached the controller manager logs to this ticket. The logs don't show the OOMKill messages, only the last message from the pod before it is killed. The last message was always the same in my testing: "Creating sandboxed containers dashboard in the OpenShift console".
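
The manual edit described above could also be expressed as a JSON Patch (RFC 6902) instead of editing the deployment by hand. The sketch below only builds the patch document. The helper name `resources_patch` is hypothetical, and it assumes the manager container is the first container in the pod template and that it already has a `resources` field, neither of which is confirmed in this ticket.

```python
import json


def resources_patch(cpu: str, memory: str, container_index: int = 0) -> str:
    """Build a JSON Patch that sets identical requests and limits
    on one container of a deployment's pod template."""
    resources = {
        "limits": {"cpu": cpu, "memory": memory},
        "requests": {"cpu": cpu, "memory": memory},
    }
    ops = [
        {
            # "replace" requires that the resources field already exists.
            "op": "replace",
            "path": f"/spec/template/spec/containers/{container_index}/resources",
            "value": resources,
        }
    ]
    return json.dumps(ops)


# Same arbitrary values as in the workaround above.
print(resources_patch("999m", "999Mi"))
```

The printed string could then be passed to `oc patch deployment <deployment-name> -n <namespace> --type=json -p '<patch>'`, with the placeholders filled in for the cluster in question.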

manager.logs


manager.logs
7 kB
2024/02/29 12:44 AM

is related to

KATA-3000 controller-manager hits OOM at image-creation time

Closed

links to

RHBA-2024:127642 RHBA: sandboxed-containers bug fix and enhancement update

mentioned on

Merge request - Increase memory limit to avoid OOMKilled

Assignee: Cameron Meadors

Reporter: Dan Clark

Need Info From: Dan Clark

Doc Contact: Avital Pinnick

Votes: 0

Watchers: 7

Created: 2024/02/29 12:45 AM

Updated: 2024/06/19 5:53 AM

Resolved: 2024/05/16 8:16 PM
