OpenShift Logging / LOG-8083

NetworkPolicy generated by the Loki Operator does not permit egress to the OpenStack Swift bucket


    • Quality / Stability / Reliability
    • False
    • False
    • NEW
    • NEW
    • Bug Fix
    • Logging - Sprint 280

      Description:
      The loki-allow-bucket-egress NetworkPolicy generated by the LokiStack controller does not appear to allow egress to the OpenStack Swift endpoint, which causes ingestion and flush failures. The default policies created by the operator should include rules for OpenStack Swift, since it is a supported storage backend.

      Error:

      level=error ts=2025-11-06T18:44:57.935983386Z caller=flush.go:261 component=ingester loop=1 org_id=infrastructure msg="failed to flush" retries=5 err="failed to flush chunks: store put chunk: Timeout when reading or writing data, num_chunks: 1, labels: {__stream_shard__=\"1\", k8s_node_name=\"kbhartiosp2-gtrlv-master-0\", kubernetes_host=\"kbhartiosp2-gtrlv-master-0\", log_type=\"infrastructure\", openshift_log_type=\"infrastructure\"}"

      The compactor and ingester enter a crash loop with the following errors:

      level=error ts=2025-11-06T18:50:00.645923306Z caller=log.go:223 msg="error running loki" err="failed to create object client: Timeout when reading or writing data\nerror initialising module: compactor\ngithub.com/grafana/dskit/modules.(*Manager).initModule\n\t/opt/app-root/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:138\ngithub.com/grafana/dskit/modules.(*Manager).InitModuleServices\n\t/opt/app-root/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:108\ngithub.com/grafana/loki/v3/pkg/loki.(*Loki).Run\n\t/opt/app-root/src/loki/pkg/loki/loki.go:531\nmain.main\n\t/opt/app-root/src/loki/cmd/loki/main.go:129\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:283\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1700"

       

      level=error ts=2025-11-06T18:49:27.67935856Z caller=log.go:223 msg="error running loki" err="Timeout when reading or writing data\nerror creating object client\ngithub.com/grafana/loki/v3/pkg/storage.(*LokiStore).chunkClientForPeriod\n\t/opt/app-root/src/loki/pkg/storage/store.go:248\ngithub.com/grafana/loki/v3/pkg/storage.(*LokiStore).init\n\t/opt/app-root/src/loki/pkg/storage/store.go:197\ngithub.com/grafana/loki/v3/pkg/storage.NewStore\n\t/opt/app-root/src/loki/pkg/storage/store.go:189\ngithub.com/grafana/loki/v3/pkg/loki.(*Loki).initStore\n\t/opt/app-root/src/loki/pkg/loki/modules.go:929\ngithub.com/grafana/dskit/modules.(*Manager).initModule\n\t/opt/app-root/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:136\ngithub.com/grafana/dskit/modules.(*Manager).InitModuleServices\n\t/opt/app-root/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:108\ngithub.com/grafana/loki/v3/pkg/loki.(*Loki).Run\n\t/opt/app-root/src/loki/pkg/loki/loki.go:531\nmain.main\n\t/opt/app-root/src/loki/cmd/lok...

      Loki config:

      common:
        storage:
          swift:
            auth_url: https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13000/v3
            username: ${SWIFT_USERNAME}
            user_domain_name: redhat.com
            user_domain_id: <hidden>
            user_id: <hidden>
            password: ${SWIFT_PASSWORD}
            domain_id: <hidden>
            domain_name: redhat.com
            project_id: <hidden>
            project_name: openshift-qe-jenkins
            project_domain_id: <hidden>
            project_domain_name: redhat.com
            region_name: 
            container_name: logging-loki-74397-kbhartiosp1-nxhnk

      Bucket Egress network policy:

      $ oc get networkpolicy loki-74397-loki-allow-bucket-egress -o yaml | yq -e .spec
      egress:
        - ports:
            - port: 13000
              protocol: TCP
      podSelector:
        matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
              - lokistack
          - key: app.kubernetes.io/component
            operator: In
            values:
              - ingester
              - querier
              - index-gateway
              - compactor
              - ruler
      policyTypes:
        - Egress
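Note on the policy above: it opens only TCP port 13000, which matches the port of auth_url (the Keystone identity endpoint). Swift clients obtain the object-store URL from the Keystone service catalog after authenticating, and that URL may use a different host and port. The report does not state the object-store endpoint, so the following is only a sketch of how a list of endpoints could be checked against the ports the policy allows. The helper names and the Swift URL with port 13808 are hypothetical placeholders, not values taken from this environment.

```python
from urllib.parse import urlparse

# Egress section of the policy as shown in this report.
POLICY_SPEC = {
    "egress": [{"ports": [{"port": 13000, "protocol": "TCP"}]}],
    "policyTypes": ["Egress"],
}

# auth_url from the Loki config in this report.
AUTH_URL = "https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13000/v3"

# ASSUMPTION: placeholder object-store URL; the real one comes from the
# Keystone service catalog and is not given in the report.
HYPOTHETICAL_SWIFT_URL = "https://swift.example.com:13808/v1/AUTH_example"


def allowed_egress_ports(spec: dict) -> set:
    """Collect numeric TCP ports listed in the egress rules.

    Simplification: ignores `to` peers, named ports, `endPort` ranges,
    and rules without a `ports` list (which would allow every port).
    """
    ports = set()
    for rule in spec.get("egress", []):
        for entry in rule.get("ports", []):
            is_tcp = entry.get("protocol", "TCP") == "TCP"
            if is_tcp and isinstance(entry.get("port"), int):
                ports.add(entry["port"])
    return ports


def url_port(url: str) -> int:
    """Return the explicit port of a URL, or the scheme default."""
    parsed = urlparse(url)
    if parsed.port is not None:
        return parsed.port
    return 443 if parsed.scheme == "https" else 80


def blocked_endpoints(spec: dict, urls: list) -> list:
    """Return the URLs whose port is not covered by the egress rules."""
    allowed = allowed_egress_ports(spec)
    return [u for u in urls if url_port(u) not in allowed]


if __name__ == "__main__":
    for url in blocked_endpoints(POLICY_SPEC, [AUTH_URL, HYPOTHETICAL_SWIFT_URL]):
        print("not covered by policy:", url)
```

With the placeholder values, only the hypothetical Swift URL is reported, because the policy covers the auth_url port alone. Replacing the placeholder with the object-store endpoint from the actual service catalog would show whether this explains the timeouts.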

      LokiStack CR with networkPolicies enabled: 

      $ oc get lokistack loki-74397 -o yaml | yq -e .spec
      limits:
        ....
      managementState: Managed
      networkPolicies:
        disabled: false
      rules:
        enabled: true
        namespaceSelector:
          matchLabels:
            openshift.io/cluster-monitoring: "true"
        selector:
          matchLabels:
            openshift.io/cluster-monitoring: "true"
      size: 1x.demo
      storage:
        schemas:
          - effectiveDate: "2023-10-15"
            version: v13
        secret:
          name: storage-secret-74397
          type: swift
      storageClassName: standard-csi
      tenants:
        mode: openshift-logging
        openshift:
          adminGroups: []

      Steps to Reproduce:
      a) Deploy Loki Operator v6.4 on OpenStack with Swift as the storage backend.
      b) Deploy a LokiStack with network policies disabled. Logs are flushed to Swift successfully.
      c) Enable network policies.
      d) Observe timeout errors on the ingester. The compactor and ingester go into a crash loop with timeout errors.
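To tell a dropped connection (consistent with a NetworkPolicy block) from a refused or otherwise failing one in step d, a plain TCP connect probe can help. This is a minimal sketch, not part of the original reproduction: the `probe` helper is hypothetical, and it assumes the probe is run from a pod that the policy's podSelector matches and that has Python available, which the report does not confirm.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and classify the outcome.

    "timeout" means no answer arrived within `timeout` seconds, which is
    what silently dropped egress traffic typically looks like.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar errors.
        return f"error: {exc}"


if __name__ == "__main__":
    # Host and port taken from auth_url in this report; add the
    # object-store endpoint from the service catalog when it is known.
    print(probe("rhos-d.infra.prod.upshift.rdu2.redhat.com", 13000))
```

Running the probe once with network policies disabled and once with them enabled, against both the identity and the object-store endpoints, would show which connection the policy affects.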

      Version: loki-operator.v6.4.0

      How reproducible: Always

      Expected Result: With network policies enabled, LokiStack should be able to write logs to the OpenStack Swift storage backend.

      Actual Result: Requests to the storage backend time out.

      Additional Info: Logs are written to Swift successfully when networkPolicies is disabled.

              Assignee: Unassigned
              Reporter: Kabir Bharti (rhn-support-kbharti)