OpenShift Logging / LOG-7775

Loki operator 6.3.1 not spinning up the loki-gateway pods due to new Rego policy requirements

    • Bug
    • Resolution: Duplicate
    • Undefined
    • None
    • Logging 6.3.1
    • Log Storage
    • None
    • Incidents & Support
    • False
    • False
    • NEW
    • NEW
    • Bug Fix

      Description of problem:

      When upgrading the Loki operator from 6.3.0 to 6.3.1, a new `replicaset` is created, but none of its `pods` reach the `Running` state. The `loki-gateway` deployment shows 3 replicas (2 from the old `replicaset` and 1 from the new one):

      $ oc get pods
      NAME                                                READY   STATUS             RESTARTS   AGE
      logging-loki-gateway-6c6b7d4dcf-8f5gt               1/1     Running            0          6d
      logging-loki-gateway-6c6b7d4dcf-pbr5v               1/1     Running            0          6d
      logging-loki-gateway-7d6ffd68bd-tkkxl               0/1     CrashLoopBackOff   1352       4d
      
      $ oc get replicaset
      NAME                                          DESIRED   CURRENT   READY   AGE
      logging-loki-gateway-6c6b7d4dcf               2         2         2       10d
      logging-loki-gateway-7d6ffd68bd               1         1         0       4d
      
      $ oc get deployments
      NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
      logging-loki-gateway               2/2     1            2           17d
      
      $ oc get deployment logging-loki-gateway -o yaml
      ...
      spec:
      ...
              volumeMounts:
      ...
              - mountPath: /etc/lokistack-gateway/lokistack-gateway.rego
                name: lokistack-gateway
                readOnly: true
                subPath: lokistack-gateway.rego
      ...
            volumes:
            - configMap:
                defaultMode: 420
                name: logging-loki-gateway
              name: lokistack-gateway
      ...
      status:
        availableReplicas: 2
        conditions:
        - lastTransitionTime: "2025-09-17T19:08:39Z"
          lastUpdateTime: "2025-09-17T19:08:39Z"
          message: ReplicaSet "logging-loki-gateway-7d6ffd68bd" has timed out progressing.
          reason: ProgressDeadlineExceeded
          status: "False"
          type: Progressing
        observedGeneration: 23
        readyReplicas: 2
        replicas: 3
        unavailableReplicas: 1
        updatedReplicas: 1

      The new `pod` shows the following error:

      2025/09/17 19:40:16 (version=, branch=, revision=unknown) level=warn name=lokistack-gateway ts=2025-09-17T19:40:16.047739618Z caller=main.go:417 msg="skipping invalid tenant" tenant=platform-non-prod err="failed to create in-process OPA authorizer: failed to prepare OPA query: 5 errors occurred during loading:\n/etc/lokistack-gateway/lokistack-gateway.rego:9: rego_parse_error: `if` keyword is required before rule body\n/etc/lokistack-gateway/lokistack-gateway.rego:18: rego_parse_error: `if` keyword is required before rule body\n/etc/lokistack-gateway/lokistack-gateway.rego:18: rego_parse_error: `contains` keyword is required for partial set rules\n/etc/lokistack-gateway/lokistack-gateway.rego:22: rego_parse_error: `if` keyword is required before rule body\n/etc/lokistack-gateway/lokistack-gateway.rego:22: rego_parse_error: `contains` keyword is required for partial set rules\n/etc/lokistack-gateway/lokistack-gateway.rego

      Looking at the `configmap` we can see the current Rego policy:

      $ oc get cm logging-loki-gateway
      NAME                   DATA   AGE
      logging-loki-gateway   2      17d
      $ echo <lokistack-gateway.rego> | base64 -d
      package lokistack
       
      import input
      import data.roles
      import data.roleBindings
      default allow = false
      allow {
        some roleNames
        roleNames = roleBindings[matched_role_binding[_]].roles
        roles[i].name == roleNames[_]
        roles[i].resources[_] = input.resource
        roles[i].permissions[_] = input.permission
        roles[i].tenants[_] = input.tenant
      }
      matched_role_binding[i] {
        roleBindings[i].subjects[_] == {"name": input.subject, "kind": "user"}
      }
      matched_role_binding[i] {
        roleBindings[i].subjects[_] == {"name": input.groups[_], "kind": "group"}
      }

      I did some research, and it seems that the new OPA version [1] requires the `if` keyword before every rule body and the `contains` keyword for partial set rules. The policy above uses neither, which matches the parse errors in the log.

      [1] https://www.openpolicyagent.org/docs/v0-upgrade#enforce-use-of-if-and-contains-keywords-in-rule-head-declarations 
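
      For illustration, here is a minimal sketch of how the policy above might look with the required keywords. It assumes only the rule heads need to change, since every parse error visible in the log concerns `if` or `contains`. It is not taken from the operator's actual fix and has not been tested against the gateway.

      ```rego
      package lokistack

      import input
      import data.roles
      import data.roleBindings

      default allow = false

      # Complete rule: `if` is now required before the rule body.
      allow if {
        some roleNames
        roleNames = roleBindings[matched_role_binding[_]].roles
        roles[i].name == roleNames[_]
        roles[i].resources[_] = input.resource
        roles[i].permissions[_] = input.permission
        roles[i].tenants[_] = input.tenant
      }

      # Partial set rules: `contains` replaces the bracket form `matched_role_binding[i]`,
      # and `if` precedes the body.
      matched_role_binding contains i if {
        roleBindings[i].subjects[_] == {"name": input.subject, "kind": "user"}
      }

      matched_role_binding contains i if {
        roleBindings[i].subjects[_] == {"name": input.groups[_], "kind": "group"}
      }
      ```

      The rule bodies are unchanged. Because the `logging-loki-gateway` configmap is generated by the operator, a manual edit like this would likely be reverted, so the change probably has to come from the operator itself.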

      Version-Release number of selected component (if applicable):

      Loki operator 6.3.1

      How reproducible:

      Upgrading Loki operator to version 6.3.1

      Steps to Reproduce:

      1. Install Loki operator version 6.3.0
      2. Upgrade the operator to version 6.3.1

      Actual results:

      `logging-loki-gateway` pod in `CrashLoopBackOff` state.

      Expected results:

      `logging-loki-gateway` pod in `Running` state.

      Additional info:

              rojacob@redhat.com Robert Jacob
              rhn-support-faldana Fabio Aldana