Container Tools / RUN-2479

[containers/podman] Fails to create rootless netns


    • rhel-container-tools
    • RUN 266

      [2816256047] Upstream Reporter: GeorgFleig
      Upstream issue status: Closed
      Upstream description:

      Issue Description

      I'm running podman inside a VirtualBox VM (Rocky Linux) on my Ubuntu host system.

      Seven containers are running, all in rootless mode and started using Quadlet. From time to time, when I boot up the VM again, the containers fail to start.

      The logs indicate that setting up the rootless netns fails. The containers go through a loop of restarts by systemd, until eventually another error comes up regarding IP address allocation (looks very similar to https://github.com/containers/podman/issues/18615, which is supposedly fixed in 4.8, while I am running 5.2.2).

      Steps to reproduce the issue

      Unfortunately I don't know how to reproduce this. Podman runs without problems for a while, and VM reboots and power-offs cause no issue, until all of a sudden things go wrong again. I wish I could provide precise steps.

      Describe the results you received

      Systemd service (generated by Quadlet) fails to start:

      Jan 22 16:47:03 vbox systemd[694]: Starting traefik.service...
      Jan 22 16:47:04 vbox podman[1357]: 2025-01-22 16:47:04.145663341 +0000 UTC m=+0.201975522 container create e5d4df364d473fc917ee87137554c78ea69b032b8c3d81cc32c0f028823f966e (image=dockerhub.internal/traefik:3.1, name=traef>
      Jan 22 16:47:04 vbox podman[1357]: 2025-01-22 16:47:04.09747131 +0000 UTC m=+0.153783507 image pull 075808f3fdf72baa7b647b63631bf5fee7d143164049ebfc40a54d9f238d4b83 dockerhub.internal/traefik:3.1
      Jan 22 16:47:04 vbox podman[1357]: 2025-01-22 16:47:04.278834145 +0000 UTC m=+0.335146331 container remove e5d4df364d473fc917ee87137554c78ea69b032b8c3d81cc32c0f028823f966e (image=dockerhub.internal/traefik:3.1, name=traef>
      Jan 22 16:47:04 vbox traefik[1357]: Error: rootless netns: create netns: open /tmp/containers-user-1002/containers/networks/rootless-netns/rootless-netns: file exists
      Jan 22 16:47:04 vbox systemd[694]: traefik.service: Main process exited, code=exited, status=126/n/a
      Jan 22 16:47:04 vbox systemd[694]: traefik.service: Failed with result 'exit-code'.
      Jan 22 16:47:04 vbox systemd[694]: Failed to start traefik.service.
      Jan 22 16:47:04 vbox systemd[694]: traefik.service: Scheduled restart job, restart counter is at 1.
      Jan 22 16:47:04 vbox systemd[694]: Stopped traefik.service.

      The file does exist, as does the corresponding pid file, but no process with that pid is running:

      [pulp@vbox rootless-netns]$ ls -la
      total 12
      drwx------ 3 pulp pulp 106 Dec 16 11:23 .
      drwx------ 4 pulp pulp  84 Jan 23 16:50 ..
      -rw------- 1 pulp pulp   1 Dec 16 11:24 ref-count
      -rw-r--r-- 1 pulp pulp  60 Dec 16 11:23 resolv.conf
      -rw------- 1 pulp pulp   0 Dec 16 11:23 rootless-netns
      -rw------- 1 pulp pulp   6 Dec 16 11:23 rootless-netns-conn.pid
      drwx------ 4 pulp pulp  33 Dec 16 11:23 run
      [pulp@vbox rootless-netns]$ cat rootless-netns-conn.pid 
      16522
      [pulp@vbox rootless-netns]$ ps aux | grep 16522
      pulp       36794  0.0  0.0   6408  2176 pts/1    S+   10:43   0:00 grep --color=auto 16522
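The manual check above (read the pid file, then look for the process) can be sketched in Python. This is only an illustration of the check the reporter performed by hand, not how podman itself validates the file; the function names `pid_is_alive` and `pid_file_is_stale` are hypothetical.

```python
import os
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this pid currently exists."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not one process
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pid_file_is_stale(pid_file: Path) -> bool:
    """Return True if pid_file does not name a running process."""
    text = pid_file.read_text().strip()
    if not text.isdigit():
        return True  # an empty or garbled pid file cannot refer to a live process
    return not pid_is_alive(int(text))
```

For the listing above, `pid_file_is_stale(Path("rootless-netns-conn.pid"))` would be the equivalent of the `cat` plus `ps aux | grep` steps. A pid that has been reused by an unrelated process would still count as alive, so this is a heuristic.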

      Then a loop of restarts with the same error follows. At some point the IP address pool is exhausted and the errors look like this:

      Jan 22 16:52:11 vbox systemd[694]: Starting traefik.service...
      Jan 22 16:52:11 vbox podman[59059]: 2025-01-23 02:55:11.622213283 +0000 UTC m=+0.041797034 container create 9381bfa69bc1755035aa1942ed4884aeabdd4b70fa3ee48d5958daa50f147096 (image=dockerhub.internal/traefik:3.3, name=trae>
      Jan 22 16:52:11 vbox podman[59059]: 2025-01-23 02:55:11.670531125 +0000 UTC m=+0.090114884 container remove 9381bfa69bc1755035aa1942ed4884aeabdd4b70fa3ee48d5958daa50f147096 (image=dockerhub.internal/traefik:3.3, name=trae>
      Jan 22 16:52:11 vbox podman[59059]: 2025-01-23 02:55:11.607151849 +0000 UTC m=+0.026735633 image pull 88eafdd76c933a76798a389d994b4fdd6b5edb89d702aae10c4350ecaa3febb9 dockerhub.internal/traefik:3.3
      Jan 22 16:52:11 vbox traefik[59059]: Error: IPAM error: failed to find free IP in range: 10.89.0.1 - 10.89.0.254
      Jan 22 16:52:11 vbox systemd[694]: traefik.service: Main process exited, code=exited, status=126/n/a
      Jan 22 16:52:11 vbox systemd[694]: traefik.service: Failed with result 'exit-code'.
      Jan 22 16:52:11 vbox systemd[694]: Failed to start traefik.service.
      Jan 22 16:52:11 vbox systemd[694]: traefik.service: Scheduled restart job, restart counter is at 37.
      Jan 22 16:52:11 vbox systemd[694]: Stopped traefik.service.
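For scale, the size of the range named in the IPAM error can be computed with the standard library. This only counts the addresses in the pool; the report does not say how many addresses each failed start leaves allocated, so no leak rate is implied. The helper name `range_size` is hypothetical.

```python
import ipaddress


def range_size(first: str, last: str) -> int:
    """Number of addresses in the inclusive IPv4 range first..last."""
    lo = int(ipaddress.IPv4Address(first))
    hi = int(ipaddress.IPv4Address(last))
    if hi < lo:
        raise ValueError("last address precedes first address")
    return hi - lo + 1


# Range copied from the error message above.
print(range_size("10.89.0.1", "10.89.0.254"))  # 254
```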

      This seems similar to what is described in https://github.com/containers/podman/issues/18615, which was supposedly fixed in 4.8, yet I see the same error message with 5.2.2.

      Once I delete rootless-netns (the file named in the first error) and its pid file, rootless-netns-conn.pid, and restart the service, the container starts without problems.
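A minimal sketch of that manual workaround, assuming the file names from the directory listing (`rootless-netns` and `rootless-netns-conn.pid`) and the directory from the first error message. It is not podman's own cleanup logic, the function name `remove_stale_netns` is hypothetical, and it is assumed to be run only while no rootless containers are running.

```python
import os
from pathlib import Path


def remove_stale_netns(netns_dir: Path) -> list[str]:
    """Delete the netns file and its pid file if the recorded pid is not running.

    Returns the names of the files removed; empty if the pid is still alive.
    """
    pid_file = netns_dir / "rootless-netns-conn.pid"
    netns_file = netns_dir / "rootless-netns"
    try:
        pid = int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        pid = -1  # missing or unreadable pid file: treat as no live owner
    if pid > 0:
        try:
            os.kill(pid, 0)  # signal 0: existence check only
            return []  # owner process is alive, leave everything in place
        except PermissionError:
            return []  # process exists under another user, leave in place
        except ProcessLookupError:
            pass  # no such process, the files are stale
    removed = []
    for f in (netns_file, pid_file):
        if f.exists():
            f.unlink()
            removed.append(f.name)
    return removed
```

With the path from this report, the call would be `remove_stale_netns(Path("/tmp/containers-user-1002/containers/networks/rootless-netns"))`, followed by restarting the affected service. The directory depends on the user id and the configured runRoot, so it will differ on other systems.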

      Describe the results you expected

      • Rootless netns parts are cleaned up properly in case they failed previously
      • IP address pool is not exhausted when containers fail to start

      podman info output

      host:
        arch: amd64
        buildahVersion: 1.37.5
        cgroupControllers:
        - memory
        - pids
        cgroupManager: systemd
        cgroupVersion: v2
        conmon:
          package: conmon-2.1.12-1.el9.x86_64
          path: /usr/bin/conmon
          version: 'conmon version 2.1.12, commit: 5859d6167f22954414ce804d3f2ae9cf6208f929'
        cpuUtilization:
          idlePercent: 99.79
          systemPercent: 0.1
          userPercent: 0.11
        cpus: 2
        databaseBackend: sqlite
        distribution:
          distribution: rocky
          version: "9.5"
        eventLogger: journald
        freeLocks: 2012
        hostname: vbox
        idMappings:
          gidmap:
          - container_id: 0
            host_id: 1002
            size: 1
          - container_id: 1
            host_id: 231072
            size: 65536
          uidmap:
          - container_id: 0
            host_id: 1002
            size: 1
          - container_id: 1
            host_id: 231072
            size: 65536
        kernel: 5.14.0-503.15.1.el9_5.x86_64
        linkmode: dynamic
        logDriver: journald
        memFree: 5903810560
        memTotal: 8057950208
        networkBackend: netavark
        networkBackendInfo:
          backend: netavark
          dns:
            package: aardvark-dns-1.12.1-1.el9.x86_64
            path: /usr/libexec/podman/aardvark-dns
            version: aardvark-dns 1.12.1
          package: netavark-1.12.2-1.el9.x86_64
          path: /usr/libexec/podman/netavark
          version: netavark 1.12.2
        ociRuntime:
          name: crun
          package: crun-1.16.1-1.el9.x86_64
          path: /usr/bin/crun
          version: |-
            crun version 1.16.1
            commit: afa829ca0122bd5e1d67f1f38e6cc348027e3c32
            rundir: /run/user/1002/crun
            spec: 1.0.0
            +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
        os: linux
        pasta:
          executable: /usr/bin/pasta
          package: passt-0^20240806.gee36266-2.el9.x86_64
          version: |
            pasta 0^20240806.gee36266-2.el9.x86_64
            Copyright Red Hat
            GNU General Public License, version 2 or later
              <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
            This is free software: you are free to change and redistribute it.
            There is NO WARRANTY, to the extent permitted by law.
        remoteSocket:
          exists: false
          path: /run/user/1002/podman/podman.sock
        rootlessNetworkCmd: pasta
        security:
          apparmorEnabled: false
          capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
          rootless: true
          seccompEnabled: true
          seccompProfilePath: /usr/share/containers/seccomp.json
          selinuxEnabled: false
        serviceIsRemote: false
        slirp4netns:
          executable: /usr/bin/slirp4netns
          package: slirp4netns-1.3.1-1.el9.x86_64
          version: |-
            slirp4netns version 1.3.1
            commit: e5e368c4f5db6ae75c2fce786e31eef9da6bf236
            libslirp: 4.4.0
            SLIRP_CONFIG_VERSION_MAX: 3
            libseccomp: 2.5.2
        swapFree: 2147479552
        swapTotal: 2147479552
        uptime: 20h 40m 45.00s (Approximately 0.83 days)
        variant: ""
      plugins:
        authorization: null
        log:
        - k8s-file
        - none
        - passthrough
        - journald
        network:
        - bridge
        - macvlan
        - ipvlan
        volume:
        - local
      registries:
        search:
        - registry.access.redhat.com
        - registry.redhat.io
        - docker.io
      store:
        configFile: /home/pulp/.config/containers/storage.conf
        containerStore:
          number: 7
          paused: 0
          running: 7
          stopped: 0
        graphDriverName: overlay
        graphOptions: {}
        graphRoot: /mnt/container_storage/pulp
        graphRootAllocated: 52517371904
        graphRootUsed: 7091998720
        graphStatus:
          Backing Filesystem: extfs
          Native Overlay Diff: "true"
          Supports d_type: "true"
          Supports shifting: "false"
          Supports volatile: "true"
          Using metacopy: "false"
        imageCopyTmpDir: /var/tmp
        imageStore:
          number: 12
        runRoot: /tmp/containers-user-1002/containers
        transientStore: false
        volumePath: /mnt/container_storage/pulp/volumes
      version:
        APIVersion: 5.2.2
        Built: 1731414899
        BuiltTime: Tue Nov 12 12:34:59 2024
        GitCommit: ""
        GoVersion: go1.22.7 (Red Hat 1.22.7-2.el9_5)
        Os: linux
        OsArch: linux/amd64
        Version: 5.2.2
      

      Podman in a container

      No

      Privileged Or Rootless

      Rootless

      Upstream Latest Release

      No

      Additional environment details

      Additional environment details

      Additional information

      No response


      Upstream URL: https://github.com/containers/podman/issues/25144

              pholzing@redhat.com Paul Holzinger
              upstream-sync Upstream Sync