OpenShift Bugs / OCPBUGS-22706

Targeting a namespace with a second AdminPolicyBasedExternalRoute isn't blocked/reported as an error.


      Description of problem:

      While reviewing the documentation for the prerelease version, we state:

      "Each namespace that a AdminPolicyBasedExternalRoute CR targets cannot be selected by any other AdminPolicyBasedExternalRoute CR. A namespace cannot have concurrent secondary external gateways."

      Later, in Table 1, the description of spec.from states:

      A namespace can be targeted by only one AdminPolicyBasedExternalRoute CR. If a namespace is selected by more than one AdminPolicyBasedExternalRoute CR, a failed error status occurs on the second and subsequent CRs targeting the same namespace.

      Confirmed this sort of happens: only one policy can effectively target a namespace at any given time. However, if we try to apply/override the first policy with a second policy targeting the same namespace, we get the usual vanilla "policy applied" message; in reality, a silent failure is masked here.

      The above behavior is misleading and may confuse customers, as they might think the second policy was indeed applied, given that no error or "failed error status" was reported as feedback while applying said policy. Upon checking, they will find that the first policy remains in effect while the second policy doesn't induce any of the expected route changes.

      This isn't a doc bug about a missing error notification; unfortunately, the doc is correct. Either the feature lacks the blocking/notification ability the doc describes, or, if a failed error status is indeed reported, I have no idea where/how to view it.
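
      For reference, if the controller did report a failed status, the natural place to see it would be the CR's .status block (via oc get -o yaml or oc describe). The status layout and the "Fail" value in the sketch below are assumptions for illustration, not output observed on this cluster:

```shell
# On a live cluster, one would inspect the CR status with something like:
#   oc get adminpolicybasedexternalroutes.k8s.ovn.org second-policy -o yaml
#   oc describe adminpolicybasedexternalroutes.k8s.ovn.org second-policy
# Below, a *hypothetical* .status block is parsed to show what a reported
# failure could look like; field names and values here are assumptions.
sample_status='status:
  status: Fail
  messages:
  - namespace bar is already targeted by first-policy'
got=$(echo "$sample_status" | awk '$1 == "status:" && NR > 1 { print $2 }')
echo "$got"
```

      On this cluster, nothing of the sort was populated — the STATUS column in oc get stayed empty.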

       

      Version-Release number of selected component (if applicable):

      4.14.0-0.nightly-2023-10-04-143709

      How reproducible:

      Every time

      Steps to Reproduce:

      1. Deploy a cluster, create a pod inside a namespace:
      $ oc get pod -n bar -o wide
      NAME     READY   STATUS    RESTARTS   AGE   IP             NODE         NOMINATED NODE   READINESS GATES
      dummy1   1/1     Running   0          7d    10.128.3.229   worker-0-1   <none>           <none>
      
      
      2. Create a simple static route YAML, apply it, and check the resulting routes:
      $ cat static_bar1.yaml 
      apiVersion: k8s.ovn.org/v1
      kind: AdminPolicyBasedExternalRoute
      metadata:
        name: first-policy
      spec:
      ## gateway example
        from:
          namespaceSelector:
            matchLabels:
                kubernetes.io/metadata.name: bar
        nextHops:       
          static:
            - ip: "173.20.0.8"
            - ip: "173.20.0.9"
      
      $ oc get adminpolicybasedexternalroutes.k8s.ovn.org 
      NAME            LAST UPDATE   STATUS
      first-policy  
      
      $ POD=$(kubectl get pod -n openshift-ovn-kubernetes --field-selector spec.nodeName=worker-0-1 -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' | grep ovnkube-node-) ; kubectl exec -ti $POD -n openshift-ovn-kubernetes -c nbdb -- ovn-nbctl lr-route-list GR_worker-0-1
      
      IPv4 Routes
      Route Table <main>:
                    10.128.3.229                173.20.0.8 src-ip ..
                    10.128.3.229                173.20.0.9 src-ip ..
               169.254.169.0/29             169.254.169.4 dst-ip ..
                  10.128.0.0/14                100.64.0.1 dst-ip
                      0.0.0.0/0             192.168.123.1 dst-ip ..
      
      Great the two first lines are added as expected.
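
      (Side note: the check above can be scripted. The helper below is illustrative only, not a product command; it counts the src-ip routes for a pod IP in lr-route-list output, here fed with the sample table captured above.)

```shell
# Count src-ip routes installed for a pod IP in ovn-nbctl lr-route-list
# output. The sample text is the table captured above; pod_routes is a
# hypothetical helper for this bug report, not a supported tool.
routes='IPv4 Routes
Route Table <main>:
              10.128.3.229                173.20.0.8 src-ip
              10.128.3.229                173.20.0.9 src-ip
           169.254.169.0/29             169.254.169.4 dst-ip
              10.128.0.0/14                100.64.0.1 dst-ip
                  0.0.0.0/0             192.168.123.1 dst-ip'
pod_routes() {  # $1 = pod IP
  echo "$routes" | awk -v ip="$1" '$1 == ip && $3 == "src-ip"' | wc -l
}
pod_routes 10.128.3.229
```

      One route per next hop in the policy, so two here.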
      
      3. Now clone the YAML, change the policy name and IPs, apply the new policy, and recheck the resulting routes:
      
      $ cat static_bar2.yaml 
      apiVersion: k8s.ovn.org/v1
      kind: AdminPolicyBasedExternalRoute
      metadata:
        name: second-policy
      spec:
      ## gateway example
        from:
          namespaceSelector:
            matchLabels:
                kubernetes.io/metadata.name: bar
        nextHops:       
          static:
            - ip: "173.30.0.7"
            - ip: "173.30.0.6"
      
      
      $ oc get adminpolicybasedexternalroutes.k8s.ovn.org 
      NAME            LAST UPDATE   STATUS
      first-policy                  
      second-policy                 ^maybe error should show up here?
      
      When we re-check the routes we get the same routes as before; notice that the second policy's IP addresses aren't added or updated. In effect, the first policy remains active and the second policy does nothing, which is fine, but we have to notify the user about the silent "failed" application of the second policy.
      
      IPv4 Routes
      Route Table <main>:
            10.128.3.229                173.20.0.8 ..
            10.128.3.229                173.20.0.9 ..
            169.254.169.0/29             169.254.169.4 
            10.128.0.0/14                100.64.0.1 
            0.0.0.0/0             192.168.123.1 
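
      The silent no-op can be made explicit by comparing the table before and after applying second-policy; a sketch using the captured outputs (next-hop columns only, for brevity):

```shell
# The route table is byte-for-byte identical before and after applying
# second-policy, even though oc reported the apply as successful.
before='10.128.3.229 173.20.0.8
10.128.3.229 173.20.0.9
169.254.169.0/29 169.254.169.4
10.128.0.0/14 100.64.0.1
0.0.0.0/0 192.168.123.1'
after="$before"   # observed in step 3: applying second-policy changed nothing
if [ "$before" = "$after" ]; then
  echo "second-policy had no visible effect"
fi
```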
      
      
      The following steps are just further investigation around the subject.
      
      4. Out of sheer curiosity, I then deleted the first policy, wondering what would happen:
       
      $ oc delete adminpolicybasedexternalroutes.k8s.ovn.org first-policy 
      adminpolicybasedexternalroute.k8s.ovn.org "first-policy" deleted
      $ oc get adminpolicybasedexternalroutes.k8s.ovn.org 
      NAME            LAST UPDATE   STATUS
      second-policy 
      
      The second policy still doesn't take effect; the route table we get is the default one, without any policy:
      
      IPv4 Routes
      Route Table <main>:
               169.254.169.0/29             169.254.169.4 ..
                  10.128.0.0/14                100.64.0.1 ..
                      0.0.0.0/0             192.168.123.1 ..
      
      5. Reapplying the second policy again had no effect; we still see the default route table.
      
      6. Delete the second policy, reapply it:
      $ oc delete adminpolicybasedexternalroutes.k8s.ovn.org second-policy 
      adminpolicybasedexternalroute.k8s.ovn.org "second-policy" deleted
      
      $ oc get adminpolicybasedexternalroutes.k8s.ovn.org 
      No resources found
      
      Reapply the second policy again; this now works, and the policy affects the route table as expected:
      IPv4 Routes
      Route Table <main>:
                   10.128.3.229                173.30.0.6 ..
                   10.128.3.229                173.30.0.7 ..
                  169.254.169.0/29             169.254.169.4 
                 10.128.0.0/14                100.64.0.1 
                  0.0.0.0/0             192.168.123.1 
       
      The last steps, 4-6, effectively confirm that only a single policy can target a namespace concurrently; the first policy attached is the only one that is relevant. If you wish to switch to a new policy, you have to delete the first policy prior to enabling the next one. Maybe add code/logic that would somehow enforce this better.
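
      A workaround sketch for that switch-over (the oc delete/get/apply subcommands are standard; the function name and wait loop are illustrative, not a supported procedure — and per step 6, re-applying without the wait can leave the new policy inert):

```shell
# Hypothetical helper: delete the old CR and confirm it is gone before
# applying the replacement, since the first CR targeting a namespace wins.
switch_apbr_policy() {
  old_name="$1"; new_yaml="$2"
  oc delete adminpolicybasedexternalroutes.k8s.ovn.org "$old_name" || return 1
  # Wait until the old CR is really gone before applying the replacement.
  while oc get adminpolicybasedexternalroutes.k8s.ovn.org "$old_name" >/dev/null 2>&1; do
    sleep 1
  done
  oc apply -f "$new_yaml"
}
# Usage on a live cluster: switch_apbr_policy first-policy static_bar2.yaml
```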

      Actual results:

      We can "successfully" apply a second policy to a namespace that is already targeted by another policy; no error status is reported or raised.

      Expected results:

      As we state that only a single policy can be set per namespace, I'd expect the system to either block any further policy attachments, or at a minimum raise a red flag warning the user about the double-booking of policies on the same target namespace.

      Additional info:

       

              Jordi Gil (jgil@redhat.com)
              Tzach Shefi (tshefi@redhat.com)
              Anurag Saxena