Story
Resolution: Done-Errata
Priority: Normal
Epic: ACM-2707 - ACM Gatekeeper Enhancements
Sprints: GRC Sprint 2023-16, GRC Sprint 2023-17, GRC Sprint 2023-18, GRC Sprint 2023-19, GRC Sprint 2023-20, GRC Sprint 2023-21, GRC Sprint 2023-22, GRC Sprint 2024-03
Description of problem:
The Gatekeeper operator exposes a setting in the CRD under audit named auditFromCache. By default the cache is Disabled. Setting it to Enabled breaks your constraints, because the cache also requires the sync details to be configured in the configs.config.gatekeeper.sh CRD.
Through some basic experimentation I validated that enabling the cache without defining the sync, or without adding the appropriate resources to the sync, results in an audit that discovers no violations even when a violation exists.
The problem (in my opinion) is that if we expose the ability to enable the cache in the operator, we must also expose the ability to configure the cache with the sync details.
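For reference, enabling the cache is a one-line change in the Gatekeeper CR; a minimal sketch (field names per the operator CRD, with the rest of the spec elided):

```yaml
apiVersion: operator.gatekeeper.sh/v1alpha1
kind: Gatekeeper
metadata:
  name: gatekeeper
spec:
  audit:
    # Enabling the cache without a matching Config sync section
    # is what silently breaks audit results.
    auditFromCache: Enabled
```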
Version-Release number of selected component (if applicable):
I tested with the 0.2.6 operator. I also did quick validation against the community 3.11.1 release to confirm the behavior didn't change in newer releases.
How reproducible:
It's easy to reproduce this.
Steps to Reproduce:
- Deploy your favorite constraint and make sure there's a violation that audit detects
- Enable the cache – the violation will no longer be reported
- Add the sync for the proper resources and the violation will be reported again
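The sync referenced in the last step lives in a separate Config resource; a sketch assuming the constraint targets Ingress (adjust group/version/kind to whatever resources your constraints actually cover):

```yaml
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: gatekeeper-system
spec:
  sync:
    syncOnly:
    # Each entry tells the audit cache which resources to replicate;
    # constraints over unlisted kinds report no violations from the cache.
    - group: networking.k8s.io
      version: v1
      kind: Ingress
```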
Actual results:
Enabling the cache without the sync configuration silently hides violations. This can prevent customers from seeing violations and is a poor experience with the operator.
Expected results:
"It would be nice" if we could figure out what needs to be synced for the customer. Otherwise the customer has to make sure their configuration is set up correctly in all of their different environments; if they deploy different constraints to different environments, this could be cumbersome.
Additional info:
Gatekeeper always uses a lot of memory – watch out, I think we have more to understand here. Also, with the cache disabled Gatekeeper uses a lot of CPU, likely from querying the API server and processing those results, so I can understand the desire to use the cache given how much CPU is in use without caching.
Test cases
- Gatekeeper with auditFromCache=Automatic creates the syncOnly config.
- Apply the Gatekeeper resource:
  apiVersion: operator.gatekeeper.sh/v1alpha1
  kind: Gatekeeper
  metadata:
    name: gatekeeper
  spec:
    audit:
      auditFromCache: Automatic
- Create a ConstraintTemplate
- Create a Constraint for any resource kind, such as Ingress, Pod, and so on
- Check whether the Config resource is created
- Check that the Config has Ingress under spec.sync.syncOnly
  apiVersion: config.gatekeeper.sh/v1alpha1
  kind: Config
  metadata:
    name: config
    namespace: "gatekeeper-system"
  spec:
    sync:
      syncOnly:
      - group: "networking.k8s.io"
        version: "v1"
        kind: "Ingress"
- Deleting the Constraint (Ingress) should remove the syncOnly element (Ingress)
- If two ConstraintTemplates each have a Constraint (Pod), deleting one of the ConstraintTemplates and its related Constraint should leave the shared entry (Pod) in syncOnly, since the remaining Constraint still requires it
E2E tests
https://github.com/stolostron/gatekeeper-operator/blob/main/test/e2e/case1_audit_from_cache_test.go
- relates to
ACM-8899 [Doc] Gatekeeper operator exposes a cache setting but not the sync settings - Closed
- links to
RHEA-2023:125635 Gatekeeper v3.14.0