Reproduction of https://bugzilla.redhat.com/show_bug.cgi?id=1816475
Description of problem:
Request to change the default handling of allowed (whitelisted) capabilities in SecurityContextConstraints (SCC) definitions.
Version-Release number of selected component (if applicable):
OCP 4.3
How reproducible:
Always
Steps to Reproduce:
1. Log on to an OCP cluster.
2. Write a Pod spec whose container SecurityContext drops ALL capabilities and adds back only the specific capabilities the workload requires.
3. Assign a _native_ SecurityContextConstraint to the ServiceAccount that the Pod will run under.
4. The relevant part of the Pod spec drops ALL capabilities from the SecurityContext and adds the required ones back in:
    containers:
      - securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
            add:
              - KILL
              - CHOWN
5. Schedule the Pod for deployment using:
$ oc apply -f
6. The Pod fails to schedule, because none of the OpenShift native SCCs (other than privileged) lists any capabilities in its `allowedCapabilities` field.
7. As a workaround in the above example, I would instead need to drop, one by one, every default capability other than `KILL` and `CHOWN`.
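The workaround in step 7 can be sketched as a small script that derives the explicit drop list. This is only an illustration: `ASSUMED_DEFAULTS` is the default set that Docker documents, and the container runtime on an OCP node may grant a different set. The helper name `explicit_drop_list` is hypothetical.

```python
# Assumed default capability set (the one Docker documents). The runtime on an
# OCP node may grant a different set, so treat this list as illustrative only.
ASSUMED_DEFAULTS = [
    "AUDIT_WRITE", "CHOWN", "DAC_OVERRIDE", "FOWNER", "FSETID", "KILL",
    "MKNOD", "NET_BIND_SERVICE", "NET_RAW", "SETFCAP", "SETGID", "SETPCAP",
    "SETUID", "SYS_CHROOT",
]


def explicit_drop_list(defaults, keep):
    """Return the capabilities that must be dropped so only `keep` remains."""
    keep = set(keep)
    return sorted(cap for cap in defaults if cap not in keep)


drops = explicit_drop_list(ASSUMED_DEFAULTS, ["KILL", "CHOWN"])
for cap in drops:
    # Each entry would go under `capabilities.drop` in the container spec.
    print(f"- {cap}")
```

The list has to be regenerated whenever the runtime's default set changes, which is the maintenance burden this report asks to avoid.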
Actual results:
The Pod fails to schedule, because none of the OpenShift native SCCs (other than privileged) lists any capabilities in its `allowedCapabilities` field.
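The rejection can be modelled roughly as follows. This is a simplified sketch of the capability check, not the actual admission code. It assumes that a requested `add` entry is accepted only if it appears in the SCC's `allowedCapabilities` or `defaultAddCapabilities`, or if `allowedCapabilities` contains the wildcard `*`. The function name and the dictionary layout are hypothetical.

```python
def rejected_capabilities(scc, requested_add):
    """Return the requested capabilities that this (modelled) SCC would refuse."""
    # Assumption: allowed set = allowedCapabilities + defaultAddCapabilities.
    allowed = set(scc.get("allowedCapabilities") or [])
    allowed |= set(scc.get("defaultAddCapabilities") or [])
    if "*" in allowed:
        # Wildcard permits any capability to be added.
        return []
    return [cap for cap in requested_add if cap not in allowed]


# Modelled on the report: native SCCs with an empty allowedCapabilities field.
restricted_like = {"allowedCapabilities": None, "defaultAddCapabilities": None}
privileged_like = {"allowedCapabilities": ["*"], "defaultAddCapabilities": None}

print(rejected_capabilities(restricted_like, ["KILL", "CHOWN"]))
print(rejected_capabilities(privileged_like, ["KILL", "CHOWN"]))
```

Under this model the first call reports both `KILL` and `CHOWN` as refused, even though both would have been present had nothing been dropped, while the privileged-like SCC refuses nothing.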
Expected results:
The `allowedCapabilities` field should be populated with a sensible default set, so that the Pod schedules as expected after applying any allowed customisation with `$ oc apply -f`.
—
I would expect SecurityContextConstraints and the Pod admission controller to have some way of determining the default capabilities granted by the kubelet on the Node where the Pod is to be scheduled. This would make it possible to drop ALL capabilities and add back only the ones the workload needs.
Alternatively, all of the namespace-safe capabilities could be added to the default SCCs shipped with Red Hat OpenShift, so that the drop functionality can be achieved with the platform defaults.
Benefits of Enhancement:
Adding the default capabilities to the whitelist of allowed capabilities out of the box would give the following benefits:
1. Without Drop ALL, any new capability added to the default set becomes a threat vector for the workload, because the container silently gains capabilities it was never meant to have. Dropping ALL closes that gap.
2. Dropping ALL capabilities and adding back in the ones needed is more explicit. It is the most declarative way to define the security context of a container with respect to the capabilities it has
3. If a default capability were to change, the change would be picked up and enforced by the platform automatically, instead of requiring changes to the container spec of every Pod in the workload and a new product release.
Dropping all capabilities and adding back only those required to run the workload is the most prescriptive way to declare what capabilities your container needs. Following the principle of least privilege as a best practice requires declaring container capabilities in this form. Otherwise there is a risk that new capabilities are opened up in the container, which exposes a threat vector.
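The argument above can be illustrated with a toy model of how the effective capability set is derived. It assumes the runtime starts from its defaults, removes the `drop` entries (with `ALL` clearing everything), and then applies `add`. The capability `NEW_CAP` and both default sets are hypothetical and exist only to show the effect of a default changing.

```python
def effective_caps(runtime_defaults, drop, add):
    """Toy model: start from the defaults, remove drops, then apply adds."""
    if "ALL" in drop:
        base = set()
    else:
        base = set(runtime_defaults) - set(drop)
    return base | set(add)


# Hypothetical default sets before and after a platform update.
defaults_old = {"CHOWN", "KILL", "SETUID"}
defaults_new = defaults_old | {"NEW_CAP"}

# Explicit drop list written against the old defaults: NEW_CAP leaks through.
print(sorted(effective_caps(defaults_old, ["SETUID"], [])))
print(sorted(effective_caps(defaults_new, ["SETUID"], [])))

# Drop ALL and add back: the result does not depend on the defaults at all.
print(sorted(effective_caps(defaults_old, ["ALL"], ["KILL", "CHOWN"])))
print(sorted(effective_caps(defaults_new, ["ALL"], ["KILL", "CHOWN"])))
```

In this model only the drop-ALL form yields the same set for both default sets, which is why it is the more declarative of the two.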
IMPACT:
This impacts XX Helm operators currently in service.