- Bug
- Resolution: Done-Errata
- Major
- 4.16
- None
Description of problem:
In a 4.16.0-ec.1 cluster, scaling up a MachineSet with publicIP:true fails with:
$ oc -n openshift-machine-api get -o json machines.machine.openshift.io | jq -r '.items[] | select(.status.phase == "Failed") | .status.providerStatus.conditions[].message' | sort | uniq -c
      1 googleapi: Error 403: Required 'compute.subnetworks.useExternalIp' permission for 'projects/openshift-gce-devel-ci-2/regions/us-central1/subnetworks/ci-ln-q4d8y8t-72292-msmgw-worker-subnet', forbidden
Version-Release number of selected component
Seen in 4.16.0-ec.1. Not noticed in 4.15.0-ec.3. Fix likely needs a backport to 4.15 to catch up with OCPBUGS-26406.
How reproducible
Seen in the wild in a cluster after updating from 4.15.0-ec.3 to 4.16.0-ec.1. Reproduced in Cluster Bot on the first attempt, so likely very reproducible.
Steps to Reproduce
Launch a GCP cluster in Cluster Bot with: launch 4.16.0-ec.1 gcp (logs).
$ oc adm upgrade
Cluster version is 4.16.0-ec.1
Upstream: https://api.integration.openshift.com/api/upgrades_info/graph
Channel: candidate-4.16 (available channels: candidate-4.16)
No updates available. You may still upgrade to a specific release image with --to-image or wait for new updates to be available.
$ oc -n openshift-machine-api get machinesets
NAME                                 DESIRED   CURRENT   READY   AVAILABLE   AGE
ci-ln-q4d8y8t-72292-msmgw-worker-a   1         1         1       1           60m
ci-ln-q4d8y8t-72292-msmgw-worker-b   1         1         1       1           60m
ci-ln-q4d8y8t-72292-msmgw-worker-c   1         1         1       1           60m
ci-ln-q4d8y8t-72292-msmgw-worker-f   0         0                             60m
$ oc -n openshift-machine-api get -o json machinesets | jq -c '.items[].spec.template.spec.providerSpec.value.networkInterfaces' | sort | uniq -c
      4 [{"network":"ci-ln-q4d8y8t-72292-msmgw-network","subnetwork":"ci-ln-q4d8y8t-72292-msmgw-worker-subnet"}]
$ oc -n openshift-machine-api edit machineset ci-ln-q4d8y8t-72292-msmgw-worker-f  # add publicIP
$ oc -n openshift-machine-api get -o json machineset ci-ln-q4d8y8t-72292-msmgw-worker-f | jq -c '.spec.template.spec.providerSpec.value.networkInterfaces'
[{"network":"ci-ln-q4d8y8t-72292-msmgw-network","publicIP":true,"subnetwork":"ci-ln-q4d8y8t-72292-msmgw-worker-subnet"}]
$ oc -n openshift-machine-api scale --replicas 1 machineset ci-ln-q4d8y8t-72292-msmgw-worker-f
$ sleep 300
$ oc -n openshift-machine-api get -o json machines.machine.openshift.io | jq -r '.items[] | select(.status.phase == "Failed") | .status.providerStatus.conditions[].message' | sort | uniq -c
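The interactive edit in the steps above could probably be replaced by a patch, which makes the reproducer easier to script. This is a minimal sketch, assuming the interface to change is the first entry in networkInterfaces; the helper name public_ip_patch is hypothetical and only builds the JSON Patch document.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): print a JSON Patch that sets
# publicIP to true on the network interface at the given index
# (default 0) of a MachineSet's machine template.
public_ip_patch() {
  idx="${1:-0}"
  printf '[{"op":"add","path":"/spec/template/spec/providerSpec/value/networkInterfaces/%s/publicIP","value":true}]\n' "$idx"
}

# Usage (needs a logged-in oc session; shown as a comment, not run here):
#   oc -n openshift-machine-api patch machineset ci-ln-q4d8y8t-72292-msmgw-worker-f \
#     --type json -p "$(public_ip_patch 0)"
```

The "add" operation on an object member sets the value whether or not the key already exists, so the same patch should be safe to reapply.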
Actual results
1 googleapi: Error 403: Required 'compute.subnetworks.useExternalIp' permission for 'projects/openshift-gce-devel-ci-2/regions/us-central1/subnetworks/ci-ln-q4d8y8t-72292-msmgw-worker-subnet', forbidden
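To see which permissions are being denied across many failed machines, the permission name can be pulled out of each message. This is a rough sketch that assumes every message follows the "Required '<permission>' permission" wording seen above; the helper name required_permissions is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read provider error messages on stdin and print the
# unique IAM permission names found in "Required '<permission>' permission".
# Lines that do not match that wording are silently skipped.
required_permissions() {
  sed -n "s/.*Required '\([^']*\)' permission.*/\1/p" | sort -u
}

# Usage (needs a logged-in oc session; shown as a comment, not run here):
#   oc -n openshift-machine-api get -o json machines.machine.openshift.io \
#     | jq -r '.items[] | select(.status.phase == "Failed") | .status.providerStatus.conditions[].message' \
#     | required_permissions
```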
Expected results
Successfully created machines.
Additional info
I would expect the CredentialsRequest to ask for this permission, but it doesn't seem to. The old roles/compute.admin includes it, and it probably just needs to be added explicitly. Not clear how many other permissions might also need explicit listing.
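One way to check the suspicion above is to compare the permissions the CredentialsRequest lists against the ones the provider needs. The sketch below is only the comparison step; the helper name missing_permissions is hypothetical, and the CredentialsRequest name and field path in the usage comment are assumptions that should be verified against the cluster.

```shell
#!/bin/sh
# Hypothetical helper: granted permissions arrive on stdin, one per line;
# required permissions are passed as arguments. Prints every required
# permission that is not granted and returns non-zero if any is missing.
missing_permissions() {
  granted="$(cat)"
  rc=0
  for perm in "$@"; do
    # -F: fixed string, -x: whole-line match, so "compute.subnetworks.use"
    # does not count as granting "compute.subnetworks.useExternalIp".
    if ! printf '%s\n' "$granted" | grep -Fxq -- "$perm"; then
      printf '%s\n' "$perm"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (assumed resource name and field path; not run here):
#   oc -n openshift-cloud-credential-operator get credentialsrequest openshift-machine-api-gcp -o json \
#     | jq -r '.spec.providerSpec.permissions[]?' \
#     | missing_permissions compute.subnetworks.useExternalIp
```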
- blocks
  - OCPBUGS-27405 GCP machine-API provider permissions should support publicIP (Closed)
- is cloned by
  - OCPBUGS-27405 GCP machine-API provider permissions should support publicIP (Closed)
- links to
  - RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update