- Epic
- Resolution: Done
- Critical
- None
- None
- None
- KEP/alpha: calculation of ephemeral storage of node when /var/lib/containers is mounted separately
- BU Product Work
- False
- False
- To Do
- OCPSTRAT-188 - Split filesystem and make each partition first class citizen for kubelet
- 0% To Do, 0% In Progress, 100% Done
- Undefined
Epic Goal
- As a day-2 task, if /var/lib/containers is mounted separately on a different device via a MachineConfig, the node object does not reflect the additional storage in Capacity.ephemeral-storage and Allocatable.ephemeral-storage.
- When /var/lib/containers is mounted on a different device while the root filesystem (which includes /var/lib/kubelet) is kept separate, so that /var/lib/containers and /var/lib/kubelet cannot fill each other up, the ephemeral-storage quota applied at the project level does not account for growth in /var/lib/containers: a pod is only evicted when it crosses its ephemeral-storage limit in /var/lib/kubelet.
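For reference, the day-2 mount described above is typically applied with a MachineConfig carrying a systemd mount unit along these lines. This is a sketch of the usual pattern, not the exact object from this report; the MachineConfig name, device, and filesystem type here are illustrative:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 98-var-lib-containers        # illustrative name
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
        - name: var-lib-containers.mount
          enabled: true
          contents: |
            [Unit]
            Before=local-fs.target
            [Mount]
            What=/dev/sdb              # illustrative device
            Where=/var/lib/containers
            Type=xfs
            [Install]
            WantedBy=local-fs.target
```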
Why is this important?
- The upstream Kubernetes documentation describes a two-filesystems configuration for local ephemeral storage, so running a node with local ephemeral storage split across two filesystems is supported.
- This is important when a container writes into a path such as /tmp inside its own filesystem (not an emptyDir mounted from the node): that data consumes /var/lib/containers, and the separate partition can reach 100% full if the node is not monitoring the usage and evicting pods. In that case, it is also unknown whether the node goes NotReady once the separately mounted /var/lib/containers is 100% full.
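For illustration, a pod like the following (the pod name and command are made up for this sketch; the image is a generic UBI image) writes into the container's own /tmp, so the data lands in the writable layer under /var/lib/containers rather than in an emptyDir under /var/lib/kubelet:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tmp-writer                   # illustrative name
spec:
  containers:
    - name: writer
      image: registry.access.redhat.com/ubi9/ubi-minimal
      # Writes 1 GiB into the container's writable layer, not an emptyDir.
      command: ["sh", "-c", "dd if=/dev/zero of=/tmp/fill.bin bs=1M count=1024; sleep infinity"]
      resources:
        limits:
          ephemeral-storage: 1Gi
```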
Scenarios
- After mounting /var/lib/containers separately by following the KCS doc, I checked the ephemeral-storage capacity of the node:
# lsblk
sda4 8:4 0 119.5G 0 part /sysroot
sdb x:x 40G part /var/lib/containers
The additional capacity is not reflected in the Node resource:
# oc describe node node-name | grep storage
  ephemeral-storage:  125293548Ki
  ephemeral-storage:  114396791822
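The reported capacity lines up with the rootfs device alone. Rough arithmetic with the sizes from the lsblk output above (treated as approximate) shows the 40G disk is excluded:

```shell
# Approximate sizes from the lsblk output above, in KiB.
sda4_kib=$(( 1195 * 1024 * 1024 / 10 ))   # 119.5G rootfs (sda4)
sdb_kib=$(( 40 * 1024 * 1024 ))           # 40G disk for /var/lib/containers (sdb)
reported_kib=125293548                    # Capacity.ephemeral-storage from the node

# The reported value is within filesystem overhead of sda4 alone...
echo "sda4 minus reported: $(( sda4_kib - reported_kib ))Ki"
# ...while a capacity that included sdb would be about 40G larger.
echo "expected with sdb:   $(( sda4_kib + sdb_kib ))Ki"
```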
I then applied a ResourceQuota for ephemeral-storage in one project and deployed a pod on the same node:
# oc describe quota
Name:       compute-resources
Namespace:  testdd
Resource                    Used  Hard
--------                    ----  ----
limits.ephemeral-storage    1Gi   1Gi
requests.ephemeral-storage  1Gi   1Gi

mysql-2-2krf4   1/1   Running   0   5m10s   10.129.2.8   node-name   <none>   <none>
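The quota shown above corresponds to a ResourceQuota along these lines (reconstructed from the oc describe output; this is the standard object shape, not a copy from the cluster):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
  namespace: testdd
spec:
  hard:
    limits.ephemeral-storage: 1Gi
    requests.ephemeral-storage: 1Gi
```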
I tried creating a 1Gi file, going beyond the quota limits, with dd inside the pod's emptyDir location on the node:
# cd /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~empty-dir/mysql-data/
# dd if=/dev/zero of=1g.bin bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.04763 s, 1.0 GB/s
# du -h
160K    ./#innodb_temp
32K     ./mysql
1.6M    ./performance_schema
80K     ./sys
0       ./sampledb
1.7G    .
After this, the pod got evicted as expected:
mysql-2-2krf4 0/1 Evicted 0 5m49s <none> node-name <none> <none>
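The eviction is consistent with the numbers above: du reports about 1.7G in the emptyDir against a 1Gi limit. A quick check of that arithmetic:

```shell
limit_kib=$(( 1024 * 1024 ))              # 1Gi pod ephemeral-storage limit
usage_kib=$(( 17 * 1024 * 1024 / 10 ))    # ~1.7G total reported by du above

if [ "$usage_kib" -gt "$limit_kib" ]; then
  echo "usage ${usage_kib}Ki exceeds limit ${limit_kib}Ki: eviction expected"
fi
```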
To check whether the pod gets evicted after creating a 1G file at /var/lib/containers:
On the node, find the root filesystem of the container:
# crictl inspect <container-id> | grep -i "root" -A 2
    "root": {
      "path": "/var/lib/containers/storage/overlay/e5ce4dfe909922ec65dabb86cbc84521d5e0dec21a547d31272330cade09e5af/merged"
    }
On the node:
# cd /var/lib/containers/storage/overlay/e5ce4dfe909922ec65dabb86cbc84521d5e0dec21a547d31272330cade09e5af/merged
# ls
bin  boot  dev  etc  help.1  home  lib  lib64  lost+found  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
# ls tmp/
11-sample.txt  ks-script-1ivkqzo2  ks-script-heymndnb
# df /var/lib/containers/
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sdb        41922560 3902184  38020376  10% /var/lib/containers
# dd if=/dev/zero of=1g.bin bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.2062 s, 890 MB/s
# df /var/lib/containers/
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sdb        41922560 4950760  36971800  12% /var/lib/containers
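To quantify how much a given write grows a filesystem (the kind of check the kubelet would need to make against /var/lib/containers), a small probe like the following can be used. On the node it would be pointed at /var/lib/containers; the default path here is /tmp purely so the sketch runs anywhere:

```shell
#!/bin/sh
# Probe: measure filesystem growth caused by a write.
# Pass /var/lib/containers as $1 on the node; /tmp is a stand-in default.
mnt="${1:-/tmp}"

before=$(df --output=used -k "$mnt" | tail -1)
dd if=/dev/zero of="$mnt/probe.bin" bs=1M count=16 2>/dev/null
sync
after=$(df --output=used -k "$mnt" | tail -1)
rm -f "$mnt/probe.bin"

echo "grew by $(( after - before )) KiB"
```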
In this case the pod did not get evicted; it stayed in the Running state without respecting the quota limits.
So, from these observations, it seems that the node's ephemeral-storage does not consider the combined size of the root filesystem and the disk added for /var/lib/containers. Likewise, the ephemeral-storage limit specified for the pod does not account for the increase in size of /var/lib/containers, so the pod is not evicted.
Note: deleting the node object and adding it back made no difference to the reported ephemeral-storage size.
Acceptance Criteria
- CI - MUST be running successfully with tests automated
- Release Technical Enablement - Provide necessary release enablement details and documents.
Dependencies (internal and external)
Previous Work (Optional):
Open questions:
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Downstream documentation merged: <link to meaningful PR>
1. Docs Tracker | Closed | Unassigned
2. QE Tracker | Closed | Unassigned
3. TE Tracker | Closed | Derrick Ornelas