  1. OpenShift Bugs
  2. OCPBUGS-31035

Enabling swap as a day-2 operation doesn't work after the node is shut down and started again


    • Important
    • No
    • 3
    • OCPEDGE Sprint 256
    • 1
    • False

      Description of problem:

      We (the OpenShift Local team) are trying out swap enablement on the node, which is a Tech Preview feature, following the documentation at https://docs.openshift.com/container-platform/4.15/nodes/nodes/nodes-nodes-managing.html#nodes-nodes-swap-memory_nodes-nodes-managing. Following those steps works as expected, but as soon as the node is shut down (manually) and started again, it no longer consumes swap as it should.

      Version-Release number of selected component (if applicable):

       - 4.15.3

      How reproducible:

       - Create a single-node OpenShift cluster and perform the following operations to enable swap.
      
      $ oc edit featuregates <= update the spec as follows to enable the NodeSwap feature gate
        spec:
          customNoUpgrade:
            enabled:
            - NodeSwap
            - BuildCSIVolumes
          featureSet: CustomNoUpgrade
      
      $ cat <<EOF | oc apply -f -
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        labels:
          machineconfiguration.openshift.io/role: master
        name: 99-kernel-swapcount-arg
      spec:
        config:
          ignition:
            version: 3.4.0
        kernelArguments:
          - swapaccount=1
      EOF
      
      $ cat <<EOF | oc apply -f -
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        labels:
          machineconfiguration.openshift.io/role: master
        name: 99-openshift-machineconfig-master-swap-service
      spec:
        config:
          ignition:
            version: 3.4.0
          systemd:
            units:
            - contents: |
                [Unit]
                Description=Create 4GB swap partition
                Before=var-crc-swapfile1.swap
                
                [Service]
                Type=oneshot
                RemainAfterExit=yes
                ExecStart=/usr/bin/mkdir -p /var/crc
                ExecStart=/usr/bin/fallocate -l 4096m /var/crc/swapfile1
                ExecStart=/usr/bin/chmod 600 /var/crc/swapfile1
                ExecStart=/usr/sbin/mkswap /var/crc/swapfile1
                
                [Install]
                WantedBy=multi-user.target
              enabled: true
              name: swap-partition.service
            - contents: |
                [Unit]
                Description=Turn on the swap
                
                [Swap]
                What=/var/crc/swapfile1
                
                [Install]
                WantedBy=multi-user.target
              enabled: true
              name: var-crc-swapfile1.swap
      EOF
      $ cat <<EOF | oc apply -f -
      apiVersion: machineconfiguration.openshift.io/v1
      kind: KubeletConfig
      metadata:
        name: swap-config
      spec:
        machineConfigPoolSelector:
          matchLabels:
            pools.operator.machineconfiguration.openshift.io/master: ""
        kubeletConfig:
          failSwapOn: false 
          memorySwap:
            swapBehavior: LimitedSwap
      EOF
      
      
      Wait until all the operators are available again, then shut down the node manually and start it again.
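
To confirm on the node itself whether the swap file is active, both after the day-2 change and after the restart, a small helper can inspect `/proc/swaps`. This is only a minimal sketch: the function name `swap_active` is hypothetical, and it assumes the usual `/proc/swaps` layout (one header line followed by one line per active swap area).

```shell
# Hypothetical helper: exit 0 if the /proc/swaps-style text on stdin lists
# at least one active swap area, exit 1 otherwise.
# Assumption: the first line is the column header and is skipped.
swap_active() {
  awk 'NR > 1 && NF > 0 { found = 1 } END { exit(found ? 0 : 1) }'
}
```

One possible way to use it, with `<node-name>` as a placeholder, is `oc debug node/<node-name> -- chroot /host cat /proc/swaps | swap_active && echo "swap is on"`, run once before and once after the restart.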

      Actual results:

       After the restart, swap is no longer consumed as it was before.

      Expected results:

        A restart shouldn't affect how swap is consumed.

      Additional info:

          I will attach must-gather logs taken just after the day-2 operation and again after the reboot.
      
      Node description (just after day-2 operation)
      ```
      $ oc describe node 
      ---
      Allocated resources:
        (Total limits may be over 100 percent, i.e., overcommitted.)
        Resource           Requests       Limits
        --------           --------       ------
        cpu                2161m (56%)    0 (0%)
        memory             9111Mi (107%)  0 (0%)   <= You can see it is overcommitted
        ephemeral-storage  0 (0%)         0 (0%)
        hugepages-1Gi      0 (0%)         0 (0%)
        hugepages-2Mi      0 (0%)         0 (0%)
      ```
      
      Node description (just after reboot)
      ```
      $ oc describe node
      Allocated resources:
        (Total limits may be over 100 percent, i.e., overcommitted.)
        Resource           Requests      Limits
        --------           --------      ------
        cpu                2091m (55%)   0 (0%)
        memory             8451Mi (99%)  0 (0%) <= not overcommitted, and it stays the same
        ephemeral-storage  0 (0%)        0 (0%)
        hugepages-1Gi      0 (0%)        0 (0%)
        hugepages-2Mi      0 (0%)        0 (0%)
      ```
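
The two readings above can also be compared mechanically. The following is a minimal sketch that assumes the `Allocated resources` table layout shown in the output above; the function name `mem_request_pct` is hypothetical.

```shell
# Hypothetical helper: print the percentage shown for memory requests in the
# "Allocated resources" table of `oc describe node` output read from stdin.
# Only the first matching row is used, which is enough on a single-node cluster.
mem_request_pct() {
  awk '$1 == "memory" { gsub(/[()%]/, "", $3); print $3; exit }'
}
```

For example, `oc describe node | mem_request_pct` would print 107 for the first output above and 99 for the second. A value above 100 means memory requests exceed the node's allocatable memory.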

       

       

       

              bzamalut@redhat.com Bulat Zamalutdinov
              prkumar@redhat.com Praveen Kumar
              Pedro Jose Amoedo Martinez Pedro Jose Amoedo Martinez