Bug
Resolution: Done
Major
Description of problem:
The PyTorch CI pipeline fails when different GitLab runners (running under different user accounts) attempt to create directories in the shared temporary storage location /tmp/pytorch-ci-shared/. This results in permission denied errors and pipeline failures.
Root Cause:
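Multiple GitLab runners registered under different users (rrathaur, gitlab-runner, root) write to the same shared directory path. Whichever runner creates /tmp/pytorch-ci-shared/ first owns it, and runners under other accounts then lack permission to create subdirectories in it.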
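One way to confirm the ownership conflict on an affected runner host is to inspect the shared directory directly. The following is a minimal diagnostic sketch, not part of the original pipeline: the variable SHARED_ROOT and the helper names describe_dir and can_write are hypothetical, and the stat format flags assume GNU coreutils.

```shell
# Hypothetical diagnostic; SHARED_ROOT is an assumed name for the parent
# directory mentioned in this report.
SHARED_ROOT="${SHARED_ROOT:-/tmp/pytorch-ci-shared}"

describe_dir() {
    # Print "owner:group octal-mode path" for a directory (GNU stat).
    stat -c '%U:%G %a %n' "$1"
}

can_write() {
    # Succeed only if the current user may create entries in the directory.
    [ -d "$1" ] && [ -w "$1" ] && [ -x "$1" ]
}

if [ -d "$SHARED_ROOT" ]; then
    describe_dir "$SHARED_ROOT"
    if can_write "$SHARED_ROOT"; then
        echo "writable by $(id -un)"
    else
        echo "NOT writable by $(id -un)"
    fi
fi
```

Running this as each runner user should show the same owner in every case, with only the owning account (and root) reporting the directory as writable.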
Steps to Reproduce:
1. Run a GitLab CI pipeline on a runner registered under user A (e.g., rrathaur)
2. The pipeline creates /tmp/pytorch-ci-shared/ owned by user A
3. Run another pipeline on a runner registered under user B (e.g., gitlab-runner)
4. Pipeline fails when trying to create subdirectories with mkdir: cannot create directory '/tmp/pytorch-ci-shared/13106808': Permission denied
Actual results:
$ mkdir -p ${SHARED_DIR}
mkdir: cannot create directory '/tmp/pytorch-ci-shared/13106808': Permission denied
ERROR: Job failed: exit status 1
Expected results:
All GitLab runners should be able to create and access shared directories regardless of which user account they're running under.
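One possible way to meet this expectation, sketched here as an assumption rather than the fix that was actually applied, is to give the shared parent directory the same mode as /tmp itself (1777: world-writable with the sticky bit), so that any runner user can create its own subdirectory. The names SHARED_ROOT, ensure_shared_root, and make_job_dir are hypothetical.

```shell
# Hypothetical sketch; SHARED_ROOT is an assumed name for the parent directory.
SHARED_ROOT="${SHARED_ROOT:-/tmp/pytorch-ci-shared}"

ensure_shared_root() {
    # $1 = shared parent directory.
    mkdir -p "$1" || return 1
    # Only the owner can change the mode, so skip chmod when another
    # user already created the directory.
    if [ -O "$1" ]; then
        # 1777: any user may create entries; the sticky bit restricts
        # deletion and renaming of entries to their owners.
        chmod 1777 "$1" || return 1
    fi
}

make_job_dir() {
    # $1 = shared parent directory, $2 = per-job subdirectory name.
    ensure_shared_root "$1" || return 1
    mkdir -p "$1/$2"
}

# Example (commented out): make_job_dir "$SHARED_ROOT" 13106808
```

A directory that already exists with restrictive permissions under another account still needs a one-time chmod by its owner or an administrator before this helper can succeed. A world-writable directory also lets every local user write there, so a per-user or group-owned location may be preferable on shared hosts.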
Error Details:
Running with gitlab-runner 18.3.1 (5a021a1c)
  on intel-eaglestream-spr-16.khw.eng.rdu2.dc.redhat.com yhkyQxg2S
Executing "step_script" stage of the job script
$ echo "Create shared directory"
$ mkdir -p ${SHARED_DIR}
mkdir: cannot create directory '/tmp/pytorch-ci-shared/13106808': Permission denied
ERROR: Job failed: exit status 1