OpenShift Logging / LOG-6817

Cannot find some types of logs in LokiStack when using Otel and AtLeastOnce

      Description of problem:

      Some types of logs (infrastructure, openshiftAPI) cannot be found in LokiStack when logs are forwarded to a LokiStack of size 1x.extra-small or 1x.small using dataModel: Otel and deliveryMode: AtLeastOnce.

      After removing deliveryMode: AtLeastOnce from the CLF, the logs are delivered to LokiStack.

      The issue does not occur with size 1x.demo.

      Collector pod logs

      2025-03-06T07:27:30.663687Z  WARN sink{component_kind="sink" component_id=output_default_lokistack_infrastructure component_type=http}: vector::sinks::util::retries: Retrying after response. reason=too many requests internal_log_rate_limit=true
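To see which sinks are being throttled, the retry warnings can be tallied per sink. The following is a minimal sketch, not part of the original report: `count_retries_by_sink` is a hypothetical helper name, and it assumes the collector log lines have the same shape as the warning above. Because the warning carries `internal_log_rate_limit=true`, the collector suppresses repeats of this message, so the counts are likely a lower bound.

```shell
# Hypothetical helper: count "Retrying after response" warnings per sink.
# Reads collector (Vector) log lines on stdin.
count_retries_by_sink() {
  # Keep only the retry warnings, extract the component_id value,
  # then count occurrences of each distinct sink name.
  grep 'Retrying after response' \
    | sed -n 's/.*component_id=\([^ ]*\).*/\1/p' \
    | sort \
    | uniq -c
}
```

Usage, with `<collector-pod>` as a placeholder for a real pod name: `oc -n openshift-logging logs <collector-pod> | count_retries_by_sink`.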

      logging-loki-distributor logs

      level=error ts=2025-03-06T14:52:19.923415169Z caller=manager.go:50 component=distributor path=write msg="write operation failed" details="Ingestion rate limit exceeded for user audit (limit: 52428800 bytes/sec) while attempting to ingest '3250' lines totaling '7351984' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased" org_id=audit
      level=error ts=2025-03-06T14:52:25.124807476Z caller=manager.go:50 component=distributor path=write msg="write operation failed" details="Ingestion rate limit exceeded for user infrastructure (limit: 78643200 bytes/sec) while attempting to ingest '5500' lines totaling '7922434' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased" org_id=infrastructure
      level=error ts=2025-03-06T14:52:25.53828126Z caller=manager.go:50 component=distributor path=write msg="write operation failed" details="Ingestion rate limit exceeded for user infrastructure (limit: 78643200 bytes/sec) while attempting to ingest '5250' lines totaling '6967630' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased" org_id=infrastructure
      level=error ts=2025-03-06T14:52:26.190720287Z caller=manager.go:50 component=distributor path=write msg="write operation failed" details="Ingestion rate limit exceeded for user infrastructure (limit: 78643200 bytes/sec) while attempting to ingest '5500' lines totaling '8193159' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased" org_id=infrastructure
      level=error ts=2025-03-06T14:52:27.958212872Z caller=manager.go:50 component=distributor path=write msg="write operation failed" details="Ingestion rate limit exceeded for user infrastructure (limit: 78643200 bytes/sec) while attempting to ingest '5250' lines totaling '6967630' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased" org_id=infrastructure
      level=error ts=2025-03-06T14:52:28.119245037Z caller=manager.go:50 component=distributor path=write msg="write operation failed" details="Ingestion rate limit exceeded for user audit (limit: 52428800 bytes/sec) while attempting to ingest '8800' lines totaling '7692093' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased" org_id=audit
      level=error ts=2025-03-06T14:52:29.402175339Z caller=manager.go:50 component=distributor path=write msg="write operation failed" details="Ingestion rate limit exceeded for user infrastructure (limit: 78643200 bytes/sec) while attempting to ingest '5500' lines totaling '8193159' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased" org_id=infrastructure
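The limits in these errors are easier to read in MiB/sec. The sketch below only does arithmetic on numbers copied from the log lines above; the variable names are illustrative.

```shell
# Limits and batch sizes copied from the distributor errors above.
audit_limit=52428800      # bytes/sec, tenant "audit"
infra_limit=78643200      # bytes/sec, tenant "infrastructure"
audit_batch=7351984       # bytes in one rejected audit push
infra_batch=8193159       # bytes in one rejected infrastructure push

# Convert each limit to MiB/sec (1 MiB = 1048576 bytes).
audit_mib=$((audit_limit / 1048576))
infra_mib=$((infra_limit / 1048576))

# Express one rejected batch as a whole-number percentage of the limit
# (integer division, so the result is rounded down).
audit_pct=$((audit_batch * 100 / audit_limit))
infra_pct=$((infra_batch * 100 / infra_limit))

echo "audit: ${audit_mib} MiB/s, batch = ${audit_pct}% of limit"
echo "infrastructure: ${infra_mib} MiB/s, batch = ${infra_pct}% of limit"
```

This gives 50 MiB/s for audit and 75 MiB/s for infrastructure, with the sampled batches at roughly 14% and 10% of those limits. One possible reading, not confirmed in this report, is that the rejections come from the aggregate push rate across collectors (including retries) rather than from any single oversized batch.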

      How reproducible:

      Always

      Steps to Reproduce:

      1. Deploy lokistack using size 1x.extra-small
      apiVersion: loki.grafana.com/v1
      kind: LokiStack
      metadata:
        name: logging-loki
      spec:
        managementState: Managed
        size: 1x.extra-small
        storage:
          schemas:
            - effectiveDate: '2023-10-15'
              version: v13
          secret:
            name: s3-secret
            type: s3
        storageClassName: gp3-csi
        tenants:
          mode: openshift-logging
      2. Forward logs to lokistack with Otel and AtLeastOnce
      cat << EOF | oc apply -f -
      apiVersion: observability.openshift.io/v1
      kind: ClusterLogForwarder
      metadata:
        name: collector
        namespace: openshift-logging
        annotations:
          observability.openshift.io/tech-preview-otlp-output: "enabled"
      spec:
        collector:
          nodeSelector:
            kubernetes.io/os: "linux"
          resources: {}
        outputs:
        - name: default-lokistack
          lokiStack:
            dataModel: Otel
            authentication:
              token:
                from: serviceAccount
            target:
              name: logging-loki
              namespace: openshift-logging
            tuning:
              compression: none
              deliveryMode: AtLeastOnce
          tls:
            ca:
              key: service-ca.crt
              configMapName: openshift-service-ca.crt
          type: lokiStack
        pipelines:
        - name: default-before
          inputRefs:
          - infrastructure
          - application
          - audit
          outputRefs:
          - default-lokistack
        serviceAccount:
          name: logcollector
      EOF
      3. Query logs from lokistack
        query={log_type="infrastructure", log_source="node"}
        query={log_type="infrastructure", log_source="container"}
        query={log_type="application"}
        query={log_source="kubeAPI"}
        query={log_source="openshiftAPI"}

      Actual results:

      No infrastructure or openshiftAPI logs appear in LokiStack within 40 minutes.

      --- Query infra journal logs ---
      logcli -o raw --bearer-token="sha256~al_J2on1V6UJlg1sNF_ZBc6Q7tIXlRKsi3Y-Eq2RGdw" --tls-skip-verify --addr="https://logging-loki-openshift-logging.apps.anli416.qe.devcluster.openshift.com/api/logs/v1/infrastructure" query --limit=1
      {log_type="infrastructure", log_source="node"}
      2025/04/28 22:48:43 https://logging-loki-openshift-logging.apps.anli416.qe.devcluster.openshift.com/api/logs/v1/infrastructure/loki/api/v1/query_range?direction=BACKWARD&end=1745894923678700649&limit=1&query=%7Blog_type%3D%22infrastructure%22%2C+log_source%3D%22node%22%7D&start=1745891323678700649
      --- Query infra container logs ---
      logcli -o raw --bearer-token="sha256~al_J2on1V6UJlg1sNF_ZBc6Q7tIXlRKsi3Y-Eq2RGdw" --tls-skip-verify --addr="https://logging-loki-openshift-logging.apps.anli416.qe.devcluster.openshift.com/api/logs/v1/infrastructure" query --limit=1
      {log_type="infrastructure", log_source="container"}
      2025/04/28 22:48:43 https://logging-loki-openshift-logging.apps.anli416.qe.devcluster.openshift.com/api/logs/v1/infrastructure/loki/api/v1/query_range?direction=BACKWARD&end=1745894923956730620&limit=1&query=%7Blog_type%3D%22infrastructure%22%2C+log_source%3D%22container%22%7D&start=1745891323956730620
      
       

      Additional info:

      After removing deliveryMode: AtLeastOnce, the logs appear.
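The workaround can be applied in place without re-creating the forwarder. This is a sketch only: `remove_delivery_mode` is a hypothetical helper name, and the JSON patch path assumes the ClusterLogForwarder from the reproduction steps, where `default-lokistack` is the first entry under `spec.outputs`.

```shell
# Hypothetical helper: remove deliveryMode from the first output of the
# ClusterLogForwarder "collector" in the openshift-logging namespace.
# Assumes default-lokistack is at index 0 of spec.outputs.
remove_delivery_mode() {
  oc -n openshift-logging patch \
    clusterlogforwarder.observability.openshift.io collector \
    --type=json \
    -p '[{"op": "remove", "path": "/spec/outputs/0/lokiStack/tuning/deliveryMode"}]'
}
```

Run `remove_delivery_mode` while logged in to the cluster, then repeat the queries from the reproduction steps to check whether the missing logs appear.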

              Unassigned Unassigned
              rhn-support-anli Anping Li
              Jeevan Darapu