- Bug
- Resolution: Done-Errata
- Minor
- None
- 4.11
- Moderate
- None
- Unspecified
Description of problem:
'oc get node' does not return a node that is missing the AWS DNS suffix on a cluster created with the TechPreviewNoUpgrade feature gate
Version-Release number of selected component (if applicable):
4.11.0-0.nightly-2022-05-11-054135
How reproducible:
Always
Steps to Reproduce:
1. Create a dhcp-options-set
liuhuali@Lius-MacBook-Pro huali-test % aws ec2 create-dhcp-options --dhcp-configurations '[
]'
DHCPOPTIONS dopt-0c9dfcde919f49105 301721915996
DHCPCONFIGURATIONS domain-name-servers
VALUES AmazonProvidedDNS
liuhuali@Lius-MacBook-Pro huali-test %
2. Install a cluster with the feature gate enabled, like this:
./openshift-install create install-config --log-level=debug --dir=cluster1
./openshift-install create manifests --log-level=debug --dir=cluster1
vi cluster1/manifests/manifest_feature_gate.yaml
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  name: cluster
spec:
  featureSet: TechPreviewNoUpgrade
./openshift-install create cluster --log-level=debug --dir=cluster1
3. After installation, check that the cluster is OK; 'oc get node' returns 6 nodes
liuhuali@Lius-MacBook-Pro huali-test % oc get machines.machine.openshift.io -o wide
NAME PHASE TYPE REGION ZONE AGE NODE PROVIDERID STATE
huliu-aws411ccm-ktgjv-master-0 Running m6i.xlarge us-east-2 us-east-2a 59m ip-10-0-142-194.us-east-2.compute.internal aws:///us-east-2a/i-05d96395caa887d8a running
huliu-aws411ccm-ktgjv-master-1 Running m6i.xlarge us-east-2 us-east-2b 59m ip-10-0-188-250.us-east-2.compute.internal aws:///us-east-2b/i-062357b65874125d0 running
huliu-aws411ccm-ktgjv-master-2 Running m6i.xlarge us-east-2 us-east-2c 59m ip-10-0-193-79.us-east-2.compute.internal aws:///us-east-2c/i-0a220248387b666a8 running
huliu-aws411ccm-ktgjv-worker-us-east-2a-68lcj Running m6i.large us-east-2 us-east-2a 55m ip-10-0-137-131.us-east-2.compute.internal aws:///us-east-2a/i-07835e479d27914ea running
huliu-aws411ccm-ktgjv-worker-us-east-2b-wsdr9 Running m6i.large us-east-2 us-east-2b 55m ip-10-0-190-236.us-east-2.compute.internal aws:///us-east-2b/i-0ff467ae0b64f5e97 running
huliu-aws411ccm-ktgjv-worker-us-east-2c-mhf4h Running m6i.large us-east-2 us-east-2c 55m ip-10-0-193-47.us-east-2.compute.internal aws:///us-east-2c/i-0cda097a70aca5373 running
liuhuali@Lius-MacBook-Pro huali-test % oc get machines.machine.openshift.io
NAME PHASE TYPE REGION ZONE AGE
huliu-aws411ccm-ktgjv-master-0 Running m6i.xlarge us-east-2 us-east-2a 60m
huliu-aws411ccm-ktgjv-master-1 Running m6i.xlarge us-east-2 us-east-2b 60m
huliu-aws411ccm-ktgjv-master-2 Running m6i.xlarge us-east-2 us-east-2c 60m
huliu-aws411ccm-ktgjv-worker-us-east-2a-68lcj Running m6i.large us-east-2 us-east-2a 56m
huliu-aws411ccm-ktgjv-worker-us-east-2b-wsdr9 Running m6i.large us-east-2 us-east-2b 56m
huliu-aws411ccm-ktgjv-worker-us-east-2c-mhf4h Running m6i.large us-east-2 us-east-2c 56m
liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-137-131.us-east-2.compute.internal Ready worker 54m v1.23.3+69213f8
ip-10-0-142-194.us-east-2.compute.internal Ready master 59m v1.23.3+69213f8
ip-10-0-188-250.us-east-2.compute.internal Ready master 60m v1.23.3+69213f8
ip-10-0-190-236.us-east-2.compute.internal Ready worker 54m v1.23.3+69213f8
ip-10-0-193-47.us-east-2.compute.internal Ready worker 54m v1.23.3+69213f8
ip-10-0-193-79.us-east-2.compute.internal Ready master 59m v1.23.3+69213f8
4. Swap the dhcp-options-set for the VPC with the one created above
5. Delete a worker machine backed by a machineset, allowing MAPI to recreate the machine
liuhuali@Lius-MacBook-Pro huali-test % oc delete machines.machine.openshift.io huliu-aws411ccm-ktgjv-worker-us-east-2c-mhf4h
machine.machine.openshift.io "huliu-aws411ccm-ktgjv-worker-us-east-2c-mhf4h" deleted
liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-137-131.us-east-2.compute.internal Ready worker 63m v1.23.3+69213f8
ip-10-0-142-194.us-east-2.compute.internal Ready master 68m v1.23.3+69213f8
ip-10-0-188-250.us-east-2.compute.internal Ready master 69m v1.23.3+69213f8
ip-10-0-190-236.us-east-2.compute.internal Ready worker 63m v1.23.3+69213f8
ip-10-0-193-79.us-east-2.compute.internal Ready master 68m v1.23.3+69213f8
liuhuali@Lius-MacBook-Pro huali-test % oc get machines.machine.openshift.io -o wide
NAME PHASE TYPE REGION ZONE AGE NODE PROVIDERID STATE
huliu-aws411ccm-ktgjv-master-0 Running m6i.xlarge us-east-2 us-east-2a 70m ip-10-0-142-194.us-east-2.compute.internal aws:///us-east-2a/i-05d96395caa887d8a running
huliu-aws411ccm-ktgjv-master-1 Running m6i.xlarge us-east-2 us-east-2b 70m ip-10-0-188-250.us-east-2.compute.internal aws:///us-east-2b/i-062357b65874125d0 running
huliu-aws411ccm-ktgjv-master-2 Running m6i.xlarge us-east-2 us-east-2c 70m ip-10-0-193-79.us-east-2.compute.internal aws:///us-east-2c/i-0a220248387b666a8 running
huliu-aws411ccm-ktgjv-worker-us-east-2a-68lcj Running m6i.large us-east-2 us-east-2a 66m ip-10-0-137-131.us-east-2.compute.internal aws:///us-east-2a/i-07835e479d27914ea running
huliu-aws411ccm-ktgjv-worker-us-east-2b-wsdr9 Running m6i.large us-east-2 us-east-2b 66m ip-10-0-190-236.us-east-2.compute.internal aws:///us-east-2b/i-0ff467ae0b64f5e97 running
huliu-aws411ccm-ktgjv-worker-us-east-2c-58cml Running m6i.large us-east-2 us-east-2c 8m44s ip-10-0-200-145 aws:///us-east-2c/i-00c3c1b8ac9e27704 running
liuhuali@Lius-MacBook-Pro huali-test %
liuhuali@Lius-MacBook-Pro huali-test % oc get pod -n openshift-cluster-machine-approver
NAME READY STATUS RESTARTS AGE
machine-approver-5955745c76-5z6rq 2/2 Running 0 73m
machine-approver-capi-687b57b66d-lpv2q 2/2 Running 0 73m
liuhuali@Lius-MacBook-Pro huali-test % oc -n openshift-cluster-machine-approver logs -f machine-approver-5955745c76-5z6rq -c machine-approver-controller
…
I0512 02:23:42.526825 1 controller.go:121] Reconciling CSR: csr-9n978
I0512 02:23:42.545659 1 csr_check.go:157] csr-9n978: CSR does not appear to be client csr
I0512 02:23:42.552248 1 csr_check.go:545] retrieving serving cert from ip-10-0-200-145 (10.0.200.145:10250)
I0512 02:23:42.553087 1 csr_check.go:182] Failed to retrieve current serving cert: remote error: tls: internal error
I0512 02:23:42.553099 1 csr_check.go:202] Falling back to machine-api authorization for ip-10-0-200-145
I0512 02:23:42.558665 1 controller.go:240] CSR csr-9n978 approved
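In the log excerpt above, the approver could not retrieve the serving cert and fell back to machine-api authorization for the short name ip-10-0-200-145. When triaging, a small filter such as the one below may help pull those node names out of a longer log. This is only a sketch: the function name fallback_nodes is made up for illustration, and it assumes the log message wording stays exactly as shown in this report.

```shell
# Hypothetical helper (name is illustrative, not part of any tool).
# Reads machine-approver log lines on stdin and prints the node name from
# every "Falling back to machine-api authorization for <node>" line.
# Assumes the message text matches the excerpt in this report.
fallback_nodes() {
  sed -n 's/.*Falling back to machine-api authorization for \(.*\)$/\1/p'
}
```

Possible usage: pipe the output of the 'oc -n openshift-cluster-machine-approver logs ... -c machine-approver-controller' command shown above into fallback_nodes.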
Actual results:
'oc get machines.machine.openshift.io -o wide' shows that the newly created node (ip-10-0-200-145) is missing the AWS DNS suffix;
'oc get node' returns only 5 nodes; the newly created one is missing.
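To spot the affected machine without reading the table by eye, something like the following could be used. It is a rough sketch, not a supported check: the function name machines_without_dns_suffix is invented here, and it assumes the column layout shown in this report (NODE is the 7th column) and that every machine already has a node name assigned.

```shell
# Hypothetical helper (name is illustrative).
# Reads `oc get machines.machine.openshift.io -o wide` output on stdin and
# prints "<machine> <node>" for every row whose NODE value contains no dot,
# i.e. has no DNS suffix.
# Assumption: NODE is column 7, as in the tables in this report; a machine
# with an empty NODE column would shift the columns and be reported too.
machines_without_dns_suffix() {
  awk 'NR > 1 && $7 !~ /\./ { print $1, $7 }'
}
```

Possible usage: oc get machines.machine.openshift.io -o wide | machines_without_dns_suffix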
Expected results:
'oc get machines.machine.openshift.io -o wide' should show all nodes with the AWS DNS suffix;
'oc get node' should return 6 nodes.
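The expected result can also be checked by comparing the two listings: every NODE referenced by a machine should appear in 'oc get node'. A minimal sketch follows, assuming both outputs were saved to files first. The function name machine_nodes_not_registered and the file arguments are hypothetical, and the column positions are taken from the tables in this report.

```shell
# Hypothetical helper (name and arguments are illustrative).
# $1: file with `oc get machines.machine.openshift.io -o wide` output
# $2: file with `oc get node` output
# Prints every NODE value (column 7 of the machine table) that does not
# appear as a NAME (column 1) in the node table. Header rows are skipped.
machine_nodes_not_registered() {
  machines_file=$1
  nodes_file=$2
  awk 'FILENAME == ARGV[1] { if (FNR > 1) seen[$1] = 1; next }
       FNR > 1 && !($7 in seen) { print $7 }' "$nodes_file" "$machines_file"
}
```

With the outputs from step 5 saved to two files, this should print only ip-10-0-200-145; on a healthy cluster it should print nothing.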
Additional info:
Seems related to https://bugzilla.redhat.com/show_bug.cgi?id=2072195
Some other cases:
Case 1:
Repeat the above steps, but change step 4 to 'Swap the dhcp-options-set for the VPC with one that has a domain-name'.
'oc get node' returns the newly created node.
liuhuali@Lius-MacBook-Pro huali-test % oc delete machines.machine.openshift.io huliu-aws411ccm-ktgjv-worker-us-east-2a-68lcj
machine.machine.openshift.io "huliu-aws411ccm-ktgjv-worker-us-east-2a-68lcj" deleted
liuhuali@Lius-MacBook-Pro huali-test % oc get machines.machine.openshift.io -o wide
NAME PHASE TYPE REGION ZONE AGE NODE PROVIDERID STATE
huliu-aws411ccm-ktgjv-master-0 Running m6i.xlarge us-east-2 us-east-2a 123m ip-10-0-142-194.us-east-2.compute.internal aws:///us-east-2a/i-05d96395caa887d8a running
huliu-aws411ccm-ktgjv-master-1 Running m6i.xlarge us-east-2 us-east-2b 123m ip-10-0-188-250.us-east-2.compute.internal aws:///us-east-2b/i-062357b65874125d0 running
huliu-aws411ccm-ktgjv-master-2 Running m6i.xlarge us-east-2 us-east-2c 123m ip-10-0-193-79.us-east-2.compute.internal aws:///us-east-2c/i-0a220248387b666a8 running
huliu-aws411ccm-ktgjv-worker-us-east-2a-q6gwt Running m6i.large us-east-2 us-east-2a 11m ip-10-0-128-73.us-east-2.compute.internal aws:///us-east-2a/i-079457d03825b1a8e running
huliu-aws411ccm-ktgjv-worker-us-east-2b-wsdr9 Running m6i.large us-east-2 us-east-2b 119m ip-10-0-190-236.us-east-2.compute.internal aws:///us-east-2b/i-0ff467ae0b64f5e97 running
huliu-aws411ccm-ktgjv-worker-us-east-2c-58cml Running m6i.large us-east-2 us-east-2c 61m ip-10-0-200-145 aws:///us-east-2c/i-00c3c1b8ac9e27704 running
liuhuali@Lius-MacBook-Pro huali-test %
liuhuali@Lius-MacBook-Pro huali-test %
liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-128-73.us-east-2.compute.internal Ready worker 8m25s v1.23.3+69213f8
ip-10-0-142-194.us-east-2.compute.internal Ready master 122m v1.23.3+69213f8
ip-10-0-188-250.us-east-2.compute.internal Ready master 123m v1.23.3+69213f8
ip-10-0-190-236.us-east-2.compute.internal Ready worker 117m v1.23.3+69213f8
ip-10-0-193-79.us-east-2.compute.internal Ready master 122m v1.23.3+69213f8
Case 2:
Repeat the above steps, but change step 2 to 'Install a cluster without the feature gate'.
'oc get node' returns the newly created node;
'oc get machine -o wide' shows all nodes with the AWS DNS suffix.
liuhuali@Lius-MacBook-Pro huali-test % oc delete machine huliu-aws411org-n9znk-worker-us-east-2c-g6png
machine.machine.openshift.io "huliu-aws411org-n9znk-worker-us-east-2c-g6png" deleted
liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-143-96.us-east-2.compute.internal Ready worker 64m v1.23.3+69213f8
ip-10-0-158-115.us-east-2.compute.internal Ready master 69m v1.23.3+69213f8
ip-10-0-161-97.us-east-2.compute.internal Ready worker 64m v1.23.3+69213f8
ip-10-0-183-83.us-east-2.compute.internal Ready master 67m v1.23.3+69213f8
ip-10-0-207-171.us-east-2.compute.internal Ready master 68m v1.23.3+69213f8
ip-10-0-211-24.us-east-2.compute.internal Ready worker 4m28s v1.23.3+69213f8
liuhuali@Lius-MacBook-Pro huali-test % oc get machine -o wide
NAME PHASE TYPE REGION ZONE AGE NODE PROVIDERID STATE
huliu-aws411org-n9znk-master-0 Running m6i.xlarge us-east-2 us-east-2a 69m ip-10-0-158-115.us-east-2.compute.internal aws:///us-east-2a/i-015848c984c27f208 running
huliu-aws411org-n9znk-master-1 Running m6i.xlarge us-east-2 us-east-2b 69m ip-10-0-183-83.us-east-2.compute.internal aws:///us-east-2b/i-05d5e5f3928e1f0cc running
huliu-aws411org-n9znk-master-2 Running m6i.xlarge us-east-2 us-east-2c 69m ip-10-0-207-171.us-east-2.compute.internal aws:///us-east-2c/i-0b3e2d804b47bb401 running
huliu-aws411org-n9znk-worker-us-east-2a-6595z Running m6i.large us-east-2 us-east-2a 66m ip-10-0-143-96.us-east-2.compute.internal aws:///us-east-2a/i-0caef8be0317db87c running
huliu-aws411org-n9znk-worker-us-east-2b-nnprl Running m6i.large us-east-2 us-east-2b 66m ip-10-0-161-97.us-east-2.compute.internal aws:///us-east-2b/i-0315216923c19c195 running
huliu-aws411org-n9znk-worker-us-east-2c-kfpwh Running m6i.large us-east-2 us-east-2c 8m23s ip-10-0-211-24.us-east-2.compute.internal aws:///us-east-2c/i-09981802d2b381bdf running
Then enable the feature gate:
liuhuali@Lius-MacBook-Pro huali-test % oc edit featuregate cluster
featuregate.config.openshift.io/cluster edited
After waiting more than four hours, the node is still NotReady:
liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-143-96.us-east-2.compute.internal Ready worker 6h11m v1.23.3+69213f8
ip-10-0-158-115.us-east-2.compute.internal Ready master 6h16m v1.23.3+69213f8
ip-10-0-161-97.us-east-2.compute.internal Ready worker 6h11m v1.23.3+69213f8
ip-10-0-183-83.us-east-2.compute.internal Ready master 6h14m v1.23.3+69213f8
ip-10-0-207-171.us-east-2.compute.internal Ready master 6h15m v1.23.3+69213f8
ip-10-0-211-24.us-east-2.compute.internal NotReady,SchedulingDisabled worker 5h12m v1.23.3+69213f8
liuhuali@Lius-MacBook-Pro huali-test % oc get machines.machine.openshift.io -o wide
NAME PHASE TYPE REGION ZONE AGE NODE PROVIDERID STATE
huliu-aws411org-n9znk-master-0 Running m6i.xlarge us-east-2 us-east-2a 6h20m ip-10-0-158-115.us-east-2.compute.internal aws:///us-east-2a/i-015848c984c27f208 running
huliu-aws411org-n9znk-master-1 Running m6i.xlarge us-east-2 us-east-2b 6h20m ip-10-0-183-83.us-east-2.compute.internal aws:///us-east-2b/i-05d5e5f3928e1f0cc running
huliu-aws411org-n9znk-master-2 Running m6i.xlarge us-east-2 us-east-2c 6h20m ip-10-0-207-171.us-east-2.compute.internal aws:///us-east-2c/i-0b3e2d804b47bb401 running
huliu-aws411org-n9znk-worker-us-east-2a-6595z Running m6i.large us-east-2 us-east-2a 6h17m ip-10-0-143-96.us-east-2.compute.internal aws:///us-east-2a/i-0caef8be0317db87c running
huliu-aws411org-n9znk-worker-us-east-2b-nnprl Running m6i.large us-east-2 us-east-2b 6h17m ip-10-0-161-97.us-east-2.compute.internal aws:///us-east-2b/i-0315216923c19c195 running
huliu-aws411org-n9znk-worker-us-east-2c-kfpwh Running m6i.large us-east-2 us-east-2c 5h19m ip-10-0-211-24.us-east-2.compute.internal aws:///us-east-2c/i-09981802d2b381bdf running
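For the Case 2 check (node stuck in NotReady after the feature gate is enabled), a filter along these lines might make re-verification quicker. This is a sketch only: the function name not_ready_nodes is made up, and it assumes the 'oc get node' column layout shown in this report (STATUS is the 2nd column).

```shell
# Hypothetical helper (name is illustrative).
# Reads `oc get node` output on stdin and prints "<node> <status>" for each
# node whose STATUS is not "Ready" or "Ready,<something>".
# Assumption: STATUS is column 2, as in the listings in this report.
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready(,|$)/ { print $1, $2 }'
}
```

Possible usage: oc get node | not_ready_nodes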
- links to
- RHBA-2023:4603 OpenShift Container Platform 4.13.z bug fix update