Description of problem:
The Linux uid is not mapped to a user when the upload container runs in the Job created by the must-gather-operator.
Version-Release number of selected component (if applicable):
Reproducible on OpenShift 4.19; the behaviour should be the same on other OpenShift versions. Operator version: built locally from either https://github.com/swghosh/must-gather-operator/commit/96486ff188daf5a2ab4156183dc106b89190ab5c or master (https://github.com/openshift/must-gather-operator/tree/master)
How reproducible:
Always
Steps to Reproduce:
1. On an OpenShift 4.19 cluster, install the must-gather-operator using steps listed in https://github.com/openshift/must-gather-operator/pull/246/files
2. Create a MustGather CR to perform must-gather collection and upload,
$ oc create -f - << EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mg-admin
  namespace: must-gather-operator
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: mg-admin-cluster-admin-binding
subjects:
- kind: ServiceAccount
  name: mg-admin
  namespace: must-gather-operator
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: Secret
metadata:
  name: sftp-access-rh-creds
  namespace: must-gather-operator
type: Opaque
stringData:
  username: some-username
  password: a-password
---
apiVersion: managed.openshift.io/v1alpha1
kind: MustGather
metadata:
  name: example
  namespace: must-gather-operator
spec:
  caseID: '12345678'
  caseManagementAccountSecretRef:
    name: sftp-access-rh-creds
  serviceAccountRef:
    name: mg-admin
EOF
3. Wait for the operator to create the k8s Job, then watch the logs of the Pod that the Job creates. Once the Pod finishes the first (collect) container, watch the logs of the upload container to spot the Linux uid issue.
Actual results:
# from the upload container logs
No user exists for uid 1000740000
Connection closed
Expected results:
The upload container should not hit the uid issue, and sftp should work.
Additional info:
The script at https://github.com/openshift/must-gather-operator/blob/master/build/bin/user_setup, which is run from the operator's Dockerfile, is likely the cause of this issue.
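A common mitigation for images that run under an arbitrary uid is to append a passwd entry for that uid when the container starts. The sketch below illustrates the idea only; it is not the operator's user_setup script and has not been verified against this image. The function name ensure_passwd_entry and the USER_NAME variable are hypothetical, and the approach assumes the passwd file is writable by gid 0.

```shell
# Hypothetical helper: append a passwd entry for a uid if none exists yet.
# Usage: ensure_passwd_entry [passwd_file] [uid]
# Assumes the passwd file is writable by the running user (for example,
# group-writable for gid 0); otherwise the uid stays unmapped.
ensure_passwd_entry() {
  passwd_file="${1:-/etc/passwd}"
  uid="${2:-$(id -u)}"
  name="${USER_NAME:-mustgather}"   # USER_NAME is an illustrative variable

  # Nothing to do when an entry with this uid is already present.
  if awk -F: -v uid="$uid" '$3 == uid { found = 1 } END { exit !found }' "$passwd_file"; then
    return 0
  fi

  if [ -w "$passwd_file" ]; then
    printf '%s:x:%s:0:%s user:%s:/sbin/nologin\n' \
      "$name" "$uid" "$name" "${HOME:-/tmp}" >> "$passwd_file"
  else
    echo "cannot write $passwd_file; uid $uid stays unmapped" >&2
    return 1
  fi
}
```

Calling ensure_passwd_entry with no arguments at the start of the upload entrypoint would target /etc/passwd and the current uid; calling it twice adds the entry only once.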
More info found upon debugging,
# running a shell inside the upload container
sh-4.4$ ssh sftp.access.redhat.com
No user exists for uid 1000740000
sh-4.4$ bash
bash-4.4$ export SSHPASS=xyz
bash-4.4$ sshpass -e sftp -o BatchMode=no -o StrictHostKeyChecking=no -b - rh-ee-smuley@sftp.access.redhat.com
No user exists for uid 1000740000
Connection closed
bash-4.4$
bash-4.4$ ssh sftp.access.redhat.com
No user exists for uid 1000740000
bash-4.4$ id
uid=1000740000 gid=0(root) groups=0(root),1000740000
bash-4.4$ whoami
whoami: cannot find name for user ID 1000740000
bash-4.4$ getent passwd 1000740000 # should return an output when user exists.
Running a fresh pod with oc debug, by contrast, does not hit the issue, because the uid gets mapped correctly:
λ oc debug -n must-gather-operator --image=quay.io/rh-ee-smuley/mg-op-test:july24
Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "debug" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "debug" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "debug" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "debug" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Starting pod/image-debug-chvjp ...
Pod IP: 10.131.0.23
If you don't see a command prompt, try pressing enter.
sh-4.4$ export SSHPASS=xyz
sh-4.4$ sshpass -e sftp ${SFTP_OPTIONS} - ${username}@${FTP_HOST}
hostname contains invalid characters
Connection closed
sh-4.4$ sshpass -e sftp -o BatchMode=no -o StrictHostKeyChecking=no -b - rh-ee-smuley@sftp.access.redhat.com
ssh: connect to host sftp.access.redhat.com port 22: Connection timed out
Connection closed
Connection closed.
sh-4.4$ id
uid=1001(mustgather) gid=0(root) groups=0(root)
sh-4.4$ getent passwd 1001 # expected non-empty
mustgather:x:1001:0::/home/mustgather:/bin/bash
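The two sessions differ in whether the running uid has a passwd entry; the ssh client prints "No user exists for uid" when that lookup fails. A minimal check that could be run in either container, assuming getent is available in the image (as it is in the sessions above); the function name uid_is_mapped is hypothetical:

```shell
# Hypothetical helper: succeed only if the uid resolves to a passwd entry.
# Defaults to the uid of the current process.
uid_is_mapped() {
  getent passwd "${1:-$(id -u)}" > /dev/null 2>&1
}

if uid_is_mapped; then
  echo "uid $(id -u) is mapped to $(id -un)"
else
  echo "uid $(id -u) has no passwd entry" >&2
fi
```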