-
Bug
-
Resolution: Done-Errata
-
Critical
-
4.14.z
-
+
-
Important
-
Yes
-
CLOUD Sprint 247, CLOUD Sprint 248
-
2
-
Proposed
-
False
-
-
-
Bug Fix
-
Done
-
-
-
-
Description of problem:
In ROSA/OCP 4.14.z, attaching the AmazonEC2ContainerRegistryReadOnly policy to the worker nodes has no effect on ECR image pulls. (In ROSA's case, the policy was attached to the ManagedOpenShift-Worker-Role, which the installer assigns to all worker nodes.) The user gets an authentication error. Attaching the policy should remove the need for an image-pull-secret, but the error is resolved only if the user also provides one. This works correctly in 4.12.z, so something appears to have changed in recent OCP versions.
Version-Release number of selected component (if applicable):
4.14.2 (ROSA)
How reproducible:
The issue is reproducible using the steps below.
Steps to Reproduce:
1. Create a deployment in ROSA or OCP on AWS that points at an image in a private ECR repository.
2. The image pull fails with "Error: ErrImagePull" and "authentication required" errors.
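When reproducing, it can help to confirm that the deployment's image reference really points at a private ECR registry and to see which account and region it targets. The following is a minimal sketch, not part of the original report: the `parse_ecr_image` helper and the `ECR_HOST` pattern are hypothetical names, and the pattern assumes the standard commercial-region host form `<account>.dkr.ecr.<region>.amazonaws.com`.

```python
import re
from typing import NamedTuple, Optional

# Assumption: private ECR registries use hosts of the form
# <12-digit account id>.dkr.ecr.<region>.amazonaws.com
ECR_HOST = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com$"
)


class EcrImage(NamedTuple):
    account: str
    region: str
    repository: str


def parse_ecr_image(image: str) -> Optional[EcrImage]:
    """Return account, region, and repository for an ECR image, else None."""
    host, sep, rest = image.partition("/")
    match = ECR_HOST.match(host)
    if not sep or match is None:
        return None
    # Strip an optional digest (@sha256:...) and an optional tag (:tag).
    repository = rest.split("@", 1)[0].split(":", 1)[0]
    return EcrImage(match.group("account"), match.group("region"), repository)


if __name__ == "__main__":
    # Placeholder image reference, for illustration only.
    print(parse_ecr_image("123456789012.dkr.ecr.us-east-1.amazonaws.com/team/app:v1"))
    print(parse_ecr_image("quay.io/example/app:latest"))
```

A `None` result suggests the image is not served from a private ECR host in the assumed form, in which case the instance profile would not be expected to apply.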
Actual results:
The image pull fails with "Error: ErrImagePull" and "authentication required" errors. The pull succeeds only if the user provides an image-pull-secret to the deployment.
Expected results:
The image should be pulled successfully by virtue of the ECR read-only policy attached to the worker node role, without needing an image-pull-secret.
Additional info:
In other words:
In OCP 4.13 and earlier, if a user adds the ECR:* permissions to the worker instance profile, the user can specify ECR images, and the worker node authenticates to ECR using the instance profile. In 4.14 this no longer works.
Providing a pull secret in the deployment is not a sufficient alternative, because ECR authorization tokens expire after 12 hours. That is not viable for customers who, up to OCP 4.13, did not have to rotate pull secrets constantly.
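To illustrate why a static pull secret is impractical, the sketch below shows what an ECR authorization token contains and when it would have to be refreshed. It assumes the token is the base64 encoding of `AWS:<password>`, as returned by `ecr:GetAuthorizationToken`, and a 12-hour lifetime. The helper names `decode_ecr_token` and `needs_refresh`, and the 30-minute safety margin, are hypothetical.

```python
import base64
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Assumption: tokens are valid for 12 hours from issuance.
TOKEN_LIFETIME = timedelta(hours=12)


def decode_ecr_token(authorization_token: str) -> Tuple[str, str]:
    """Split a base64 'user:password' token into (username, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


def needs_refresh(
    issued_at: datetime,
    now: datetime,
    margin: timedelta = timedelta(minutes=30),
) -> bool:
    """True once the token is within `margin` of its assumed expiry."""
    return now >= issued_at + TOKEN_LIFETIME - margin


if __name__ == "__main__":
    # Synthetic token for illustration; a real one comes from the ECR API.
    sample = base64.b64encode(b"AWS:example-password").decode("ascii")
    print(decode_ecr_token(sample))

    issued = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
    print(needs_refresh(issued, issued + timedelta(hours=6)))
    print(needs_refresh(issued, issued + timedelta(hours=11, minutes=45)))
```

Any pull secret built from such a token would need to be regenerated on roughly this schedule, which is the operational burden that instance-profile authentication avoids.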
The experience in 4.14 should be the same as in 4.13 with ECR.
The current AWS policy that's used is this one: `arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly`
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetRepositoryPolicy",
        "ecr:DescribeRepositories",
        "ecr:ListImages",
        "ecr:DescribeImages",
        "ecr:BatchGetImage",
        "ecr:GetLifecyclePolicy",
        "ecr:GetLifecyclePolicyPreview",
        "ecr:ListTagsForResource",
        "ecr:DescribeImageScanFindings"
      ],
      "Resource": "*"
    }
  ]
}
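As a quick sanity check that a policy document like the one above grants what an image pull needs, a rough sketch follows. The `PULL_ACTIONS` list reflects the actions commonly cited as required for pulling from ECR and is an assumption here. The `missing_pull_actions` helper is hypothetical and deliberately simplified: it only inspects `Allow` statements and ignores `Deny`, `NotAction`, `Resource`, and `Condition`.

```python
import json
from fnmatch import fnmatchcase
from typing import List

# Assumption: these four actions are what a node needs to pull an image.
PULL_ACTIONS = (
    "ecr:GetAuthorizationToken",
    "ecr:BatchCheckLayerAvailability",
    "ecr:GetDownloadUrlForLayer",
    "ecr:BatchGetImage",
)


def missing_pull_actions(policy_json: str) -> List[str]:
    """Return the pull actions not covered by any Allow statement."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    patterns: List[str] = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # IAM action names are case-insensitive, so compare in lowercase.
        patterns.extend(action.lower() for action in actions)

    return [
        action
        for action in PULL_ACTIONS
        if not any(fnmatchcase(action.lower(), pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    example = json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["ecr:GetAuthorizationToken"], "Resource": "*"}
        ],
    })
    print(missing_pull_actions(example))
```

An empty result for the policy in this report would indicate that the IAM side is sufficient, which is consistent with the failure lying in how the node authenticates rather than in the policy itself.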
- blocks
-
OCPBUGS-27486 ECR Image pull fails in-spite of attaching AmazonEC2ContainerRegistryReadOnly policy to the worker nodes.
- Closed
- is blocked by
-
OCPCLOUD-2434 Impact ECR Image pull fails in-spite of attaching AmazonEC2ContainerRegistryReadOnly policy to the worker nodes
- Closed
- is cloned by
-
OCPBUGS-27486 ECR Image pull fails in-spite of attaching AmazonEC2ContainerRegistryReadOnly policy to the worker nodes.
- Closed
- is related to
-
OCPBUGS-26602 Windows Nodes unable to pull from ECR
- Closed
- relates to
-
OCPCLOUD-2379 Support external cloud authentication providers
- Closed
- links to
-
RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update