-
Bug
-
Resolution: Duplicate
-
Normal
-
None
-
4.13, 4.12, 4.11, 4.10
-
Important
-
None
-
Build + Jenkins Sprint 231, Build + Jenkins Sprint 232, Build + Jenkins Sprint 233
-
3
-
Rejected
-
False
-
-
N/A
-
Bug Fix
-
Done
Discovered in the must gather kubelet_service.log from https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.12-upgrade-from-stable-4.11-e2e-gcp-sdn-upgrade/1586093220087992320
It appears the guard pod names are too long, and being truncated down to where they will collide with those from the other masters.
From kubelet logs in this run:
❯ grep openshift-kube-scheduler-guard-ci-op-3hj6pnwf-4f6ab-lv57z-maste kubelet_service.log Oct 28 23:58:55.693391 ci-op-3hj6pnwf-4f6ab-lv57z-master-1 kubenswrapper[1657]: E1028 23:58:55.693346 1657 kubelet_pods.go:413] "Hostname for pod was too long, truncated it" podName="openshift-kube-scheduler-guard-ci-op-3hj6pnwf-4f6ab-lv57z-master-1" hostnameMaxLen=63 truncatedHostname="openshift-kube-scheduler-guard-ci-op-3hj6pnwf-4f6ab-lv57z-maste" Oct 28 23:59:03.735726 ci-op-3hj6pnwf-4f6ab-lv57z-master-0 kubenswrapper[1670]: E1028 23:59:03.735671 1670 kubelet_pods.go:413] "Hostname for pod was too long, truncated it" podName="openshift-kube-scheduler-guard-ci-op-3hj6pnwf-4f6ab-lv57z-master-0" hostnameMaxLen=63 truncatedHostname="openshift-kube-scheduler-guard-ci-op-3hj6pnwf-4f6ab-lv57z-maste" Oct 28 23:59:11.168082 ci-op-3hj6pnwf-4f6ab-lv57z-master-2 kubenswrapper[1667]: E1028 23:59:11.168041 1667 kubelet_pods.go:413] "Hostname for pod was too long, truncated it" podName="openshift-kube-scheduler-guard-ci-op-3hj6pnwf-4f6ab-lv57z-master-2" hostnameMaxLen=63 truncatedHostname="openshift-kube-scheduler-guard-ci-op-3hj6pnwf-4f6ab-lv57z-maste"
This also looks to be happening for openshift-kube-scheduler-guard, kube-controller-manager-guard, possibly others.
Looks like they should be truncated further to make room for random suffixes in https://github.com/openshift/library-go/blame/bd9b0e19121022561dcd1d9823407cd58b2265d0/pkg/operator/staticpod/controller/guard/guard_controller.go#L97-L98
Unsure of the implications here, it looks a little scary.
- clones
-
OCPBUGS-4551 Backport 4.11: Guard Pod Hostnames Too Long and Truncated Down Into Collisions With Other Masters
- Closed
- depends on
-
OCPBUGS-4551 Backport 4.11: Guard Pod Hostnames Too Long and Truncated Down Into Collisions With Other Masters
- Closed
- links to