  1. OPCT - OpenShift Provider Compatibility Tool
  2. OPCT-31

[backend][sonobuoy] Server is not setting securityContext correctly by default


      By default, the Sonobuoy server uses "SecurityContextMode" with the value "nonroot". This mode adds several Kubernetes securityContext settings to the podSpec.

      In k8s 1.24 (OCP 4.11), those settings cause errors in the aggregator logs and prevent the aggregator from patching the pods.

       

      $ KUBECONFIG=$PWD/.opct-410t411/clusters/opct-410t411/auth/kubeconfig oc  logs -n openshift-provider-certification sonobuoy |grep error= |head -n1
      time="2023-01-18T14:27:32Z" level=info msg="couldn't annotate sonobuoy pod" 
      error="couldn't patch pod annotation: 
      pods \"sonobuoy\" is forbidden: 
      unable to validate against any security context constraint: 
      [provider restricted-v2: .spec.securityContext.fsGroup: Invalid value: []int64{2000}: 2000 is not an allowed group, 
      spec.containers[0].securityContext.runAsUser: Invalid value: 1000: must be in the ranges: [1000650000, 1000659999], 
      provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{2000}: 2000 is not an allowed group, 
      provider machine-api-termination-handler: .spec.securityContext.fsGroup: Invalid value: []int64{2000}: 2000 is not an allowed group, 
      spec.volumes[0]: Invalid value: \"configMap\": configMap volumes are not allowed to be used, 
      spec.volumes[1]: Invalid value: \"configMap\": configMap volumes are not allowed to be used, 
      spec.volumes[2]: Invalid value: \"emptyDir\": emptyDir volumes are not allowed to be used, 
      spec.volumes[3]: Invalid value: \"projected\": projected volumes are not allowed to be used, 
      provider hostnetwork-v2: .spec.securityContext.fsGroup: Invalid value: []int64{2000}: 2000 is not an allowed group, 
      provider hostnetwork: .spec.securityContext.fsGroup: Invalid value: []int64{2000}: 2000 is not an allowed group, 
      provider hostaccess: .spec.securityContext.fsGroup: Invalid value: []int64{2000}: 2000 is not an allowed group]"
      
      $ KUBECONFIG=$PWD/.opct-410t411/clusters/opct-410t411/auth/kubeconfig oc  logs -n openshift-provider-certification sonobuoy |grep error= |wc -l
      405 
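The runAsUser rejection in the log above comes down to a range check. Below is a minimal sketch of that check, not the actual SCC admission code: `uidRange` and `validateRunAsUser` are hypothetical names introduced for illustration, and the range is the one printed in the log message.

```go
package main

import "fmt"

// uidRange is a hypothetical helper type modelling an inclusive UID range,
// such as the [1000650000, 1000659999] range reported in the log.
type uidRange struct {
	Min, Max int64
}

// contains reports whether uid falls inside the inclusive range.
func (r uidRange) contains(uid int64) bool {
	return uid >= r.Min && uid <= r.Max
}

// validateRunAsUser is an illustrative stand-in for the admission check.
// It returns an error worded like the one in the aggregator log when the
// UID is outside the allowed range, and nil otherwise.
func validateRunAsUser(uid int64, allowed uidRange) error {
	if allowed.contains(uid) {
		return nil
	}
	return fmt.Errorf(
		"spec.containers[0].securityContext.runAsUser: Invalid value: %d: must be in the ranges: [%d, %d]",
		uid, allowed.Min, allowed.Max)
}

func main() {
	// Range taken from the log message; in a real cluster it is per namespace.
	allowed := uidRange{Min: 1000650000, Max: 1000659999}

	// 1000 is the UID from the rejected pod; 1000650000 is the first UID in the range.
	for _, uid := range []int64{1000, 1000650000} {
		if err := validateRunAsUser(uid, allowed); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("accepted: runAsUser %d\n", uid)
	}
}
```

The sketch covers only the runAsUser message. The fsGroup rejections follow the same pattern, but the log does not print the allowed group range, so it is left out here.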

      Reference

      We found this while implementing the upgrade feature, when the aggregator stopped working.

      My theory is that after upgrading the cluster to 4.11 (4.10->4.11), the plugin jobs, previously created with a valid securityContext, are no longer valid and should be updated; otherwise the KAS (pod security APIs) refuses these statements.

      When running OPCT with SecurityContextMode=none[1], Sonobuoy does not set any SecurityContext value on pods and works normally.

       

      aggConfig.SecurityContextMode = "none" 
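To illustrate the difference between the two modes, here is a rough sketch under stated assumptions. It is not Sonobuoy's implementation: `podSecurityContext` and `securityContextFor` are hypothetical names, and the "nonroot" values are only the two that appear in the error log (runAsUser 1000, fsGroup 2000).

```go
package main

import "fmt"

// podSecurityContext is a local, hypothetical stand-in holding only the
// two securityContext fields that the aggregator log mentions.
type podSecurityContext struct {
	RunAsUser int64
	FSGroup   int64
}

// securityContextFor sketches how a mode string could map to pod settings.
// "nonroot" returns the fixed IDs seen in the log; "none" returns nil so
// that nothing is set on the pod and the cluster applies its own defaults.
func securityContextFor(mode string) (*podSecurityContext, error) {
	switch mode {
	case "nonroot":
		return &podSecurityContext{RunAsUser: 1000, FSGroup: 2000}, nil
	case "none":
		return nil, nil
	default:
		return nil, fmt.Errorf("unknown SecurityContextMode %q", mode)
	}
}

func main() {
	for _, mode := range []string{"nonroot", "none"} {
		sc, err := securityContextFor(mode)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		if sc == nil {
			fmt.Printf("%s: no securityContext set\n", mode)
			continue
		}
		fmt.Printf("%s: runAsUser=%d fsGroup=%d\n", mode, sc.RunAsUser, sc.FSGroup)
	}
}
```

With "none", the pod carries no fixed IDs, so there is nothing for the restricted-v2 checks on runAsUser and fsGroup to reject.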

      The solution for OPCT is defined in the PR: https://github.com/redhat-openshift-ecosystem/provider-certification-tool/pull/39

       

      The long-term solution in upstream Sonobuoy should be evaluated on 1.24+; there is an open issue tracking it: https://github.com/vmware-tanzu/sonobuoy/issues/1858


              Unassigned Unassigned
              rhn-support-mrbraga Marco Braga