-
Bug
-
Resolution: Done
-
Major
-
4.11
-
None
-
Moderate
-
None
-
Proposed
-
False
-
-
-
Bug Fix
-
Done
Description of problem:
Installing a single node cluster on AWS and then enabling TechPreview puts the cluster into an error state. The CMA (machine-approver) and the CAPI CMA (machine-approver-capi) shouldn't be on the same port.
Version-Release number of selected component (if applicable):
4.11.9
How reproducible:
always
Steps to Reproduce:
1. Launch a 4.11.9 single node cluster on AWS

liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.9    True        False         34m     Cluster version is 4.11.9

liuhuali@Lius-MacBook-Pro huali-test % oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.11.9    True        False         False      31m
baremetal                                  4.11.9    True        False         False      49m
cloud-controller-manager                   4.11.9    True        False         False      52m
cloud-credential                           4.11.9    True        False         False      53m
cluster-autoscaler                         4.11.9    True        False         False      48m
config-operator                            4.11.9    True        False         False      50m
console                                    4.11.9    True        False         False      37m
csi-snapshot-controller                    4.11.9    True        False         False      49m
dns                                        4.11.9    True        False         False      48m
etcd                                       4.11.9    True        False         False      47m
image-registry                             4.11.9    True        False         False      43m
ingress                                    4.11.9    True        False         False      86s
insights                                   4.11.9    True        False         False      43m
kube-apiserver                             4.11.9    True        False         False      43m
kube-controller-manager                    4.11.9    True        False         False      47m
kube-scheduler                             4.11.9    True        False         False      44m
kube-storage-version-migrator              4.11.9    True        False         False      50m
machine-api                                4.11.9    True        False         False      44m
machine-approver                           4.11.9    True        False         False      49m
machine-config                             4.11.9    True        False         False      49m
marketplace                                4.11.9    True        False         False      48m
monitoring                                 4.11.9    True        False         False      56s
network                                    4.11.9    True        False         False      52m
node-tuning                                4.11.9    True        False         False      49m
openshift-apiserver                        4.11.9    True        False         False      72s
openshift-controller-manager               4.11.9    True        False         False      39m
openshift-samples                          4.11.9    True        False         False      43m
operator-lifecycle-manager                 4.11.9    True        False         False      49m
operator-lifecycle-manager-catalog         4.11.9    True        False         False      49m
operator-lifecycle-manager-packageserver   4.11.9    True        False         False      104s
service-ca                                 4.11.9    True        False         False      50m
storage                                    4.11.9    True        False         False      49m

liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME                                         STATUS   ROLES           AGE   VERSION
ip-10-0-137-222.us-east-2.compute.internal   Ready    master,worker   53m   v1.24.0+dc5a2fd

2. Enable TechPreview

spec:
  featureSet: TechPreviewNoUpgrade

liuhuali@Lius-MacBook-Pro huali-test % oc edit featuregate
featuregate.config.openshift.io/cluster edited

3. Check the cluster

liuhuali@Lius-MacBook-Pro huali-test % oc get pod -n openshift-cloud-controller-manager
NAME                                            READY   STATUS    RESTARTS       AGE
aws-cloud-controller-manager-5888c85fc6-28tgt   1/1     Running   12 (10m ago)   55m

liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.9    True        False         111m    Error while reconciling 4.11.9: the workload openshift-cluster-machine-approver/machine-approver-capi has not yet successfully rolled out

liuhuali@Lius-MacBook-Pro huali-test % oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.11.9    False       False         False      9m44s   OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.huliu-aws411arn2.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)...
baremetal                                  4.11.9    True        False         False      128m
cloud-controller-manager                   4.11.9    True        False         False      131m
cloud-credential                           4.11.9    True        False         False      133m
cluster-api                                4.11.9    True        False         False      41m
cluster-autoscaler                         4.11.9    True        False         False      128m
config-operator                            4.11.9    True        False         False      129m
console                                    4.11.9    False       True          False      10m     DeploymentAvailable: 0 replicas available for console deployment...
csi-snapshot-controller                    4.11.9    True        False         False      4m52s
dns                                        4.11.9    True        False         False      128m
etcd                                       4.11.9    True        False         False      127m
image-registry                             4.11.9    True        False         False      123m
ingress                                    4.11.9    True        False         False      3m15s
insights                                   4.11.9    True        False         False      122m
kube-apiserver                             4.11.9    True        False         False      123m
kube-controller-manager                    4.11.9    True        False         False      126m
kube-scheduler                             4.11.9    True        False         False      124m
kube-storage-version-migrator              4.11.9    True        False         False      129m
machine-api                                4.11.9    True        False         False      124m
machine-approver                           4.11.9    True        False         False      128m
machine-config                             4.11.9    True        False         False      129m
marketplace                                4.11.9    True        False         False      128m
monitoring                                 4.11.9    True        False         False      5m1s
network                                    4.11.9    True        False         False      131m
node-tuning                                4.11.9    True        False         False      128m
openshift-apiserver                        4.11.9    True        False         False      23s
openshift-controller-manager               4.11.9    True        False         False      118m
openshift-samples                          4.11.9    True        False         False      122m
operator-lifecycle-manager                 4.11.9    True        False         False      128m
operator-lifecycle-manager-catalog         4.11.9    True        False         False      128m
operator-lifecycle-manager-packageserver   4.11.9    True        False         False      2m43s
service-ca                                 4.11.9    True        False         False      129m
storage                                    4.11.9    True        False         False      69m
Actual results:
The cluster is broken. The CMA is complaining with the message: "0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports."
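To illustrate why the scheduler reports no free ports on a single node, here is a minimal Python sketch of a host-port collision check. It is an assumption-laden simplification, not the scheduler's actual logic: it ignores host IPs, the pod dictionaries are hand-written stand-ins rather than the real manifests, and port 12345 is a made-up value (the report does not state which port the two approver pods share). The only Kubernetes behavior relied on is that a pod with hostNetwork: true claims its containerPort on the host.

```python
from collections import defaultdict


def host_port_conflicts(pods):
    """Return {(protocol, host_port): [pod names]} for host ports claimed by more than one pod."""
    claims = defaultdict(list)
    for pod in pods:
        spec = pod["spec"]
        host_network = spec.get("hostNetwork", False)
        name = pod["metadata"]["name"]
        for container in spec.get("containers", []):
            for port in container.get("ports", []):
                host_port = port.get("hostPort")
                # With hostNetwork, the container port is bound directly on the node.
                if host_port is None and host_network:
                    host_port = port["containerPort"]
                if host_port is None:
                    continue  # pod-network port only, cannot collide on the host
                key = (port.get("protocol", "TCP"), host_port)
                if name not in claims[key]:
                    claims[key].append(name)
    return {key: names for key, names in claims.items() if len(names) > 1}


def make_pod(name, container_port, host_network):
    """Build a hypothetical, heavily trimmed pod object for the example."""
    return {
        "metadata": {"name": name},
        "spec": {
            "hostNetwork": host_network,
            "containers": [{"ports": [{"containerPort": container_port}]}],
        },
    }


# Hypothetical pods: the names echo the report, the port number is invented.
pods = [
    make_pod("machine-approver", 12345, host_network=True),
    make_pod("machine-approver-capi", 12345, host_network=True),
    make_pod("console", 8443, host_network=False),
]

conflicts = host_port_conflicts(pods)
for (protocol, port), names in conflicts.items():
    print(f"{protocol}/{port} requested by: {', '.join(names)}")
```

On a multi-node cluster the two pods could land on different nodes; with only one node available, the second pod has nowhere to go, which matches the "0/1 nodes are available" message.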
Expected results:
Cluster should be healthy
Additional info:
Talked with dev here: https://coreos.slack.com/archives/GE2HQ9QP4/p1666178083034159?thread_ts=1666176493.224399&cid=GE2HQ9QP4
Must-Gather: https://drive.google.com/file/d/1Q7Ddnhbg3Cq4ptBA2ycJnGKK01As1JcF/view?usp=sharing
If TechPreview is enabled during installation on a single node cluster, the cluster installation fails.