Bug
Resolution: Done-Errata
Critical
4.18
Description of problem:
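The HyperShift ClusterSizingConfiguration (CSC) validation for kasGoMemLimit does not accept the unit suffixes that the Go runtime requires, so KAS pods crash-loop when the field is used.
According to https://pkg.go.dev/runtime#hdr-Environment_Variables, GOMEMLIMIT is a value in bytes with an optional unit suffix such as KiB, MiB, or GiB.
The CSC validation only accepts Kubernetes-style suffixes such as Ki, Mi, or Gi, and rejects the Go form: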
# clustersizingconfigurations.scheduling.hypershift.openshift.io "cluster" was not valid:
# * spec.sizes[0].effects.kasGoMemLimit: Invalid value: "24576MiB": spec.sizes[0].effects.kasGoMemLimit in body should match '^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$'
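The mismatch can be checked offline with a minimal sketch. It compares the CRD pattern, copied from the error above, with an approximation of the form the Go runtime accepts for GOMEMLIMIT. The second pattern and both helper names are assumptions made for illustration. They are not taken from HyperShift, and the pattern ignores special values such as "off".

```go
package main

import (
	"fmt"
	"regexp"
)

// crdPattern is copied verbatim from the validation error in this report.
var crdPattern = regexp.MustCompile(`^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$`)

// goMemLimitPattern is an ASSUMPTION: it approximates the documented
// GOMEMLIMIT form (bytes with an optional B, KiB, MiB, GiB or TiB suffix).
var goMemLimitPattern = regexp.MustCompile(`^[0-9]+(([KMGT]i)?B)?$`)

// acceptedByCRD reports whether v would pass the CSC validation pattern.
func acceptedByCRD(v string) bool { return crdPattern.MatchString(v) }

// acceptedByGoRuntime reports whether v matches the approximated GOMEMLIMIT form.
func acceptedByGoRuntime(v string) bool { return goMemLimitPattern.MatchString(v) }

func main() {
	// 25769803776 is 24576 MiB expressed as a plain byte count.
	for _, v := range []string{"24576Mi", "24576MiB", "25769803776"} {
		fmt.Printf("%s crd=%t runtime=%t\n", v, acceptedByCRD(v), acceptedByGoRuntime(v))
	}
}
```

Under these two patterns, only a plain byte count satisfies both sides. The report does not confirm whether that is a usable workaround.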
KAS crashes when GOMEMLIMIT is set with the wrong unit suffix:
GOMEMLIMIT=24576Mi
fatal error: malformed GOMEMLIMIT; see `go doc runtime/debug.SetMemoryLimit`
runtime stack:
runtime.throw({0x401b7aa?, 0xa?})
runtime/panic.go:1023 +0x5c fp=0x7fffdebe4378 sp=0x7fffdebe4348 pc=0x4400fc
runtime.readGOMEMLIMIT()
runtime/mgcpacer.go:1331 +0xaf fp=0x7fffdebe43a8 sp=0x7fffdebe4378 pc=0x429bcf
runtime.gcinit()
runtime/mgc.go:187 +0x2e fp=0x7fffdebe43d8 sp=0x7fffdebe43a8 pc=0x42110e
runtime.schedinit()
runtime/proc.go:805 +0x1af fp=0x7fffdebe4450 sp=0x7fffdebe43d8 pc=0x44406f
runtime.rt0_go()
runtime/asm_amd64.s:349 +0x15a fp=0x7fffdebe4458 sp=0x7fffdebe4450 pc=0x47919a
It works if I set the environment variable to GOMEMLIMIT=24576MiB instead.
Version-Release number of selected component (if applicable):
How reproducible:
Always
Steps to Reproduce:
1. Install HO with size tagging.
2. Set the CSC to use KasGoMemLimit: https://github.com/openshift/hypershift/blob/main/docs/content/how-to/azure/scheduler.md#effects
3. Create an HC and watch the KAS pod crash.
Actual results:
GOMEMLIMIT=24576Mi
fatal error: malformed GOMEMLIMIT; see `go doc runtime/debug.SetMemoryLimit`
Expected results:
The CSC validation should allow the unit suffix that the Go runtime expects, so that KAS runs with GOMEMLIMIT=24576MiB.
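One way to reach the expected result is to translate the Kubernetes-style suffix before the value is written to the environment. The sketch below is only an illustration of that idea. `toGoMemLimit` is a hypothetical helper, not HyperShift code, and the report does not say how the actual fix was implemented.

```go
package main

import "fmt"

// toGoMemLimit is a HYPOTHETICAL helper (not part of HyperShift).
// It converts a binary quantity such as "24576Mi" into the form the
// Go runtime accepts for GOMEMLIMIT, such as "24576MiB".
func toGoMemLimit(q string) (string, error) {
	// Split the leading run of decimal digits from the suffix.
	i := 0
	for i < len(q) && q[i] >= '0' && q[i] <= '9' {
		i++
	}
	if i == 0 {
		return "", fmt.Errorf("quantity %q does not start with digits", q)
	}
	num, suffix := q[:i], q[i:]

	switch suffix {
	case "", "B", "KiB", "MiB", "GiB", "TiB":
		// Already in a form the runtime documentation lists.
		return q, nil
	case "Ki", "Mi", "Gi", "Ti":
		// Kubernetes-style binary suffix: append the missing "B".
		return num + suffix + "B", nil
	default:
		// Decimal, fractional, and exponent forms are out of scope here.
		return "", fmt.Errorf("unsupported suffix %q in %q", suffix, q)
	}
}

func main() {
	for _, q := range []string{"24576Mi", "24576MiB", "24G"} {
		v, err := toGoMemLimit(q)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("GOMEMLIMIT=%s\n", v)
	}
}
```

The helper deliberately rejects anything it cannot map exactly, such as "24G" or "1.5Gi", rather than guessing a byte count.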
Additional info:
Links to:
RHEA-2024:11038 OpenShift Container Platform 4.19.z bug fix update