Type: Bug
Resolution: Done
Priority: Undefined
Fix Version/s: Logging 5.6.0
Affects Version/s: Logging 5.5.0
Component/s: Log Storage, Loki
Labels:
- devel_ack+
- no-rn

Blocked:
False
Blocked Reason:
None
Ready:
False
Epic Link:
Loki - Logs-based Alerts
Docs QE Status:
NEW
Feature Link:
OBSDA-115 - Create alerting rules based on logs
QE Status:
VERIFIED

Sprint:
Log Storage - Sprint 226

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

LokiRuler pods log "Evaluating rule failed" warnings after alerting and recording rules are created for the application and infrastructure tenants.

Error:

level=warn ts=2022-07-12T23:39:47.953761624Z caller=pool.go:184 msg="removing ingester failing healthcheck" addr=10.131.0.25:9095 reason="rpc error: code = Unavailable desc = connection closed before server preface received"
level=warn ts=2022-07-12T23:39:47.954127337Z caller=pool.go:184 msg="removing ingester failing healthcheck" addr=10.129.2.15:9095 reason="rpc error: code = Unavailable desc = connection closed before server preface received"
level=info ts=2022-07-12T23:41:02.380901027Z caller=metrics.go:122 component=ruler org_id=application latency=fast query="(count_over_time({kubernetes_namespace_name=\"my-user-workload\", kubernetes_pod_name=~\"centos-logtest.*\"}[2m]) > 10)" query_type=metric range_type=instant length=0s step=0s duration=673.313µs status=500 limit=0 returned_lines=0 throughput=0B total_bytes=0B queue_time=0s subqueries=1
level=warn ts=2022-07-12T23:41:02.380948952Z caller=manager.go:610 user=application group=HighAppLogsToLoki2m msg="Evaluating rule failed" rule="record: loki:operator:applogs:rate2m\nexpr: (count_over_time({kubernetes_namespace_name=\"my-user-workload\", kubernetes_pod_name=~\"centos-logtest.*\"}[2m])\n > 10)\n" err="rpc error: code = Unavailable desc = connection closed before server preface received"
level=warn ts=2022-07-12T23:41:02.952428483Z caller=pool.go:184 msg="removing ingester failing healthcheck" addr=10.129.2.15:9095 reason="rpc error: code = Unavailable desc = connection closed before server preface received"
level=warn ts=2022-07-12T23:41:02.952475863Z caller=pool.go:184 msg="removing ingester failing healthcheck" addr=10.131.0.25:9095 reason="rpc error: code = Unavailable desc = connection closed before server preface received"

Steps to reproduce:

1) Deploy the Loki Operator, then create the bucket secret and the LokiStack CR.

LokiStack CR:

spec:
  size: 1x.small
  storage:
    schemas:
    - version: v12
      effectiveDate: 2022-06-01
    secret:
      name: test
      type: s3
  storageClassName: gp2
  tenants:
    mode: openshift-logging
  rules:
    enabled: true
    selector:
      matchLabels:
        openshift.io/cluster-monitoring: "true"
    namespaceSelector:
      matchLabels:
        openshift.io/cluster-monitoring: "true"

2) Validate that LokiRuler pods are up and running.

[kbharti@cube hack]$ oc get pods  | grep ruler
lokistack-dev-ruler-0                           1/1     Running   0          3h47m
lokistack-dev-ruler-1                           1/1     Running   0          3h47m

3) Deploy an application in the my-user-workload namespace, with the label openshift.io/cluster-monitoring: 'true' set on the namespace.

4) Create alerting and recording rules.

Application alerting and recording rules: http://pastebin.test.redhat.com/1062953

Infra alerting and recording rules: http://pastebin.test.redhat.com/1062954
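Because the pastebin links above are internal, the following is a rough sketch of what the application recording rule may have looked like. It is reconstructed only from the group name, record name, and expression visible in the ruler log; the apiVersion, metadata.name, and interval values are assumptions and are not taken from the original rule files.

```shell
# Hypothetical reconstruction of the application recording rule, based only on
# the group, record name, and expression visible in the ruler log.
# apiVersion, metadata.name, and interval are assumptions.
cat > recording-rule.yaml <<'EOF'
apiVersion: loki.grafana.com/v1beta1
kind: RecordingRule
metadata:
  name: app-recording-rule          # hypothetical name
  namespace: my-user-workload
  labels:
    openshift.io/cluster-monitoring: "true"
spec:
  tenantID: application
  groups:
    - name: HighAppLogsToLoki2m
      interval: 1m                  # assumed; not shown in the report
      rules:
        - record: loki:operator:applogs:rate2m
          expr: |
            count_over_time({kubernetes_namespace_name="my-user-workload", kubernetes_pod_name=~"centos-logtest.*"}[2m]) > 10
EOF

# To apply it on a cluster (not run here):
#   oc apply -f recording-rule.yaml
```

The label on the rule object is chosen to match the rules.selector in the LokiStack CR from step 1.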

5) Verify that the Loki ruler config map contains the rule data.

6) Check logs on Loki Ruler pods.
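For step 6, one possible way to isolate the relevant warnings is to filter the ruler log for the failure message. This is a minimal sketch: the function name rule_failures is hypothetical, and the pod name in the usage comment is the one listed in step 2.

```shell
# Print only the ruler log lines that report a failed rule evaluation.
# Reads log text on stdin so it works with `oc logs` or a saved log file.
rule_failures() {
  grep -F 'msg="Evaluating rule failed"'
}

# Example on a cluster (pod name taken from step 2):
#   oc logs lokistack-dev-ruler-0 | rule_failures
```

Filtering on the exact message keeps the "removing ingester failing healthcheck" warnings out of the output, so the failing rule group and its err field are easier to spot.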

Expected Result: The rules are created successfully and the ruler pods restart without any errors.

Actual Result: The ruler pods log "Evaluating rule failed" warnings with the rpc error shown under Error.

Assignee: Mohamed-Amine Bouqsimi (Inactive)

Reporter: Kabir Bharti

QA Contact: Kabir Bharti

Votes: 0

Watchers: 3

Created: 2022/07/12 11:52 PM

Updated: 2023/01/23 9:35 AM

Resolved: 2022/10/19 12:34 PM
