RHEL-128600

Allow starting protected services with MS_PRIVATE mount option


    • Story
    • Resolution: Unresolved
    • systemd
    • rhel-systemd
    • Systemd sprint_6

      When a service unit uses Protect* directives, systemd usually creates a new mount namespace and remounts the root file system with MS_SLAVE:

      int setup_namespace(
       :
                      char **error_path) {
       :
              /* Remount / as SLAVE so that nothing now mounted in the namespace
               * shows up in the parent */
              if (mount(NULL, "/", NULL, MS_SLAVE|MS_REC, NULL) < 0) {
                      r = log_debug_errno(errno, "Failed to remount '/' as SLAVE: %m");
                      goto finish;
              }
       :
      

      As a result, the system ends up with a large number of mounts and may reach the upper limit (fs.mount-max = 100000 by default).
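To gauge how close a namespace is to that limit, the mount count can be compared against fs.mount-max. The following is a minimal sketch, not part of systemd: the helper names count_mounts and read_mount_max are hypothetical, and it assumes a Linux system where /proc/self/mountinfo lists one mount per line for the calling process's mount namespace.

```c
#include <stdio.h>

/* Hypothetical helper: count the lines of a mountinfo-style file.
 * /proc/self/mountinfo has one line per mount visible in the calling
 * process's mount namespace. Returns -1 if the file cannot be opened. */
long count_mounts(const char *path) {
    FILE *f = fopen(path, "r");
    long n = 0;
    int c, last = '\n';

    if (!f)
        return -1;

    while ((c = fgetc(f)) != EOF) {
        if (c == '\n')
            n++;
        last = c;
    }
    if (last != '\n')
        n++; /* count a final line that lacks a trailing newline */

    fclose(f);
    return n;
}

/* Hypothetical helper: read a single integer such as the value stored in
 * /proc/sys/fs/mount-max. Returns -1 if the file is missing or malformed. */
long read_mount_max(const char *path) {
    FILE *f = fopen(path, "r");
    long v = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &v) != 1)
        v = -1;

    fclose(f);
    return v;
}
```

Calling count_mounts("/proc/self/mountinfo") and read_mount_max("/proc/sys/fs/mount-max") from a small main() gives the two numbers to compare. Note that this only reports the namespace of the process that runs it; each protected service has its own namespace and would need to be inspected through its own /proc/<pid>/mountinfo.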

      If the limit is reached, other services that perform mounts internally, such as systemd-tmpfiles-clean.service when setting up credentials, fail to start:

      [...] systemd[1]: Starting Cleanup of Temporary Directories...
      [...] systemd[209076]: systemd-tmpfiles-clean.service: Failed to set up credentials: Protocol error
      [...] systemd[209076]: systemd-tmpfiles-clean.service: Failed at step CREDENTIALS spawning systemd-tmpfiles: Protoco>
      [...] systemd[1]: systemd-tmpfiles-clean.service: Main process exited, code=exited, status=243/CREDENTIALS
      [...] systemd[1]: systemd-tmpfiles-clean.service: Failed with result 'exit-code'.
      [...] systemd[1]: Failed to start Cleanup of Temporary Directories.
      

      An strace shows that spawning the service fails when a mount is performed:

      1400559 [...] mount(NULL, "/proc/self/fd/3", NULL, MS_REC|MS_SLAVE, NULL) = 0 <...>
       :
      1400559 [...] mount("/dev/shm", "/proc/self/fd/3", NULL, MS_MOVE, NULL) = -1 ENOSPC (No space left on device) <...>
      

      I believe that, most of the time, the number of mounts used on the system could be reduced by mounting with MS_PRIVATE instead of MS_SLAVE.
      This could apply, for example, to most protected services, which do not care about seeing mounts propagated into their namespace (e.g. chronyd, rsyslogd and more).

      MS_PRIVATE should very probably even be the default, with MS_SLAVE configured only for the specific services that require mount propagation.
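The requested behaviour could be sketched roughly as follows. This is an illustration only, not systemd's implementation: the function names and the wants_host_mounts parameter are hypothetical stand-ins for whatever per-unit setting would select the propagation mode.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mount.h>

/* Hypothetical helper: choose the propagation type for "/" in the service's
 * namespace. MS_PRIVATE stops host mounts from propagating in; MS_SLAVE
 * keeps receiving them (the current behaviour shown in setup_namespace()). */
unsigned long root_propagation_flags(bool wants_host_mounts) {
    return (wants_host_mounts ? MS_SLAVE : MS_PRIVATE) | MS_REC;
}

/* Hypothetical helper: apply the chosen propagation to "/".
 * Must only be called after the process has entered its own mount
 * namespace (e.g. via unshare(CLONE_NEWNS)); calling it in the host
 * namespace would change propagation for the whole system.
 * Requires CAP_SYS_ADMIN. Returns 0 on success or -errno on failure. */
int remount_root(bool wants_host_mounts) {
    if (mount(NULL, "/", NULL, root_propagation_flags(wants_host_mounts), NULL) < 0)
        return -errno;
    return 0;
}
```

With this shape, services such as chronyd or rsyslogd would take the MS_PRIVATE path by default, and only units that explicitly need to see later host mounts would opt into MS_SLAVE.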

              msekleta@redhat.com Michal Sekletar
              rhn-support-rmetrich Renaud Métrich
              systemd maint mailing list
              Frantisek Sumsal