- Bug
- Resolution: Cannot Reproduce
- Critical
- None
- Logging 5.7.0
- False
- None
- False
- NEW
- NEW
- Log Collection - Sprint 236, Log Collection - Sprint 237
Description of problem:
On DevSandbox clusters, the Cluster Logging Operator was upgraded to v5.7.0, and our collector pod entered a CrashLoopBackOff state with the following error:
error[E701]: call to undefined variable
  ┌─ :1:90
  │
1 │ (.kubernetes.namespace_name == "toolchain-host-operator") && (.kubernetes.labels.control-plane == "controller-manager")
  │                                                                                          ^^^^^
  │                                                                                          │
  │                                                                                          undefined variable
  │                                                                                          did you mean "false"?
  │
  = see language documentation at https://vrl.dev

error[E100]: unhandled error
  ┌─ :1:1
  │
1 │ (.kubernetes.namespace_name == "toolchain-host-operator") && (.kubernetes.labels.control-plane == "controller-manager")
  │ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  │ │
  │ expression can result in runtime error
  │ handle the error case to ensure runtime success
  │
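The carets in E701 point at the `plane` suffix of `control-plane`, which suggests the generated VRL parses the unquoted hyphenated label key as a subtraction (`.kubernetes.labels.control - plane`), leaving `plane` as an undefined variable. In VRL, hyphenated keys require a quoted path segment; the following is a sketch of the failing expression next to a variant that should compile (my assumption about the root cause, not a confirmed fix):

```
# As generated (fails to compile): the hyphen splits the path,
# so `plane` is read as an undefined variable.
(.kubernetes.namespace_name == "toolchain-host-operator") && (.kubernetes.labels.control-plane == "controller-manager")

# With the hyphenated label key quoted, the path parses as intended:
(.kubernetes.namespace_name == "toolchain-host-operator") && (.kubernetes.labels."control-plane" == "controller-manager")
```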
Version-Release number of selected component (if applicable):
v5.7.0
How reproducible:
Steps to Reproduce:
- ...
Actual results:
Expected results:
Additional info:
Current Cluster Logging resource:
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  creationTimestamp: '2021-10-28T12:58:39Z'
  generation: 1
  managedFields:
    - apiVersion: logging.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:spec':
          .: {}
          'f:collection':
            .: {}
            'f:logs':
              .: {}
              'f:fluentd':
                .: {}
                'f:resources':
                  .: {}
                  'f:limits':
                    .: {}
                    'f:memory': {}
                  'f:requests':
                    .: {}
                    'f:cpu': {}
                    'f:memory': {}
              'f:type': {}
          'f:managementState': {}
      manager: sandbox-cli
      operation: Update
      time: '2021-10-28T12:58:39Z'
    - apiVersion: logging.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:status':
          .: {}
          'f:clusterConditions': {}
          'f:collection':
            .: {}
            'f:logs':
              .: {}
              'f:fluentdStatus':
                .: {}
                'f:daemonSet': {}
                'f:nodes':
                  .: {}
                  'f:collector-l6p9l': {}
                  'f:collector-48jvj': {}
                  'f:collector-572k9': {}
                  'f:collector-2w2b6': {}
                  'f:collector-m9jsz': {}
                  'f:collector-d86l2': {}
                  'f:collector-9gv8x': {}
                  'f:collector-pdwcr': {}
                'f:pods':
                  .: {}
                  'f:failed': {}
                  'f:notReady': {}
                  'f:ready': {}
          'f:conditions': {}
          'f:curation': {}
          'f:logStore': {}
          'f:visualization': {}
      manager: cluster-logging-operator
      operation: Update
      subresource: status
      time: '2023-04-26T07:58:23Z'
  name: instance
  namespace: openshift-logging
  resourceVersion: '2517536636'
  uid: cddd1ccc-9374-4868-b3cc-956d21f49900
spec:
  collection:
    logs:
      fluentd:
        resources:
          limits:
            memory: 736Mi
          requests:
            cpu: 100m
            memory: 736Mi
      type: fluentd
    type: vector
  managementState: Managed
status:
  collection:
    logs:
      fluentdStatus:
        daemonSet: collector
        nodes:
          collector-2w2b6: ip-10-0-248-35.ec2.internal
          collector-48jvj: ip-10-0-188-155.ec2.internal
          collector-572k9: ip-10-0-199-46.ec2.internal
          collector-9gv8x: ip-10-0-204-219.ec2.internal
          collector-d86l2: ip-10-0-231-58.ec2.internal
          collector-l6p9l: ip-10-0-248-164.ec2.internal
          collector-m9jsz: ip-10-0-200-101.ec2.internal
          collector-pdwcr: ip-10-0-255-6.ec2.internal
        pods:
          failed: []
          notReady: []
          ready:
            - collector-2w2b6
            - collector-48jvj
            - collector-572k9
            - collector-9gv8x
            - collector-d86l2
            - collector-l6p9l
            - collector-m9jsz
            - collector-pdwcr
  conditions:
    - lastTransitionTime: '2022-08-18T16:21:42Z'
      status: 'False'
      type: CollectorDeadEnd
    - lastTransitionTime: '2022-08-18T16:21:47Z'
      message: curator is deprecated in favor of defining retention policy
      reason: ResourceDeprecated
      status: 'True'
      type: CuratorRemoved
  curation: {}
  logStore: {}
  visualization: {}
ClusterLogForwarder:
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  inputs:
    - application:
        namespaces:
          - toolchain-member-operator
        selector:
          matchLabels:
            control-plane: controller-manager
      name: toolchain-member-operator
    - application:
        namespaces:
          - toolchain-member-operator
        selector:
          matchLabels:
            app: member-operator-webhook
      name: toolchain-member-operator-webhook
    - application:
        namespaces:
          - codeready-workspaces-operator
        selector:
          matchLabels:
            app: codeready-operator
      name: codeready-workspaces-operator
    - application:
        namespaces:
          - codeready-workspaces-operator
        selector:
          matchLabels:
            app: codeready
            component: codeready
      name: codeready
  outputs:
    - name: loki
      type: loki
      url: 'http://loki.openshift-customer-monitoring.svc.cluster.local:3100'
  pipelines:
    - inputRefs:
        - toolchain-host-operator
      labels:
        namespace: toolchain-host-operator
      name: toolchain-host-operator-to-loki
      outputRefs:
        - loki
      parse: json
    - inputRefs:
        - registration-service
      labels:
        namespace: toolchain-host-operator
      name: registration-service-to-loki
      outputRefs:
        - loki
      parse: json
    - inputRefs:
        - toolchain-member-operator
      labels:
        namespace: toolchain-member-operator
      name: toolchain-member-operator-to-loki
      outputRefs:
        - loki
      parse: json
    - inputRefs:
        - toolchain-member-operator-webhook
      labels:
        namespace: toolchain-member-operator
      name: toolchain-member-operator-webhook-to-loki
      outputRefs:
        - loki
      parse: json
    - inputRefs:
        - codeready-workspaces-operator
      labels:
        namespace: codeready-workspaces-operator
      name: codeready-workspaces-operator-to-loki
      outputRefs:
        - loki
      parse: json
    - inputRefs:
        - codeready
      labels:
        namespace: codeready-workspaces-operator
      name: codeready-to-loki
      outputRefs:
        - loki
      parse: json
status:
  conditions:
    - lastTransitionTime: '2023-05-10T17:21:58Z'
      message: 'No valid inputs, outputs, or pipelines. Invalid CLF spec.'
      reason: Invalid
      status: 'False'
      type: Ready
  inputs: ...
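One thing that stands out in the spec as pasted: the first two pipelines reference inputs named toolchain-host-operator and registration-service, but spec.inputs only declares the four member/codeready inputs, which could explain the "Invalid CLF spec" condition (unless those inputs were simply truncated from the paste). A stdlib-only Python sketch of that consistency check (the helper name and the condensed spec are mine, not part of the operator):

```python
# Hypothetical helper: verify that every pipeline inputRef in a
# ClusterLogForwarder spec names either a reserved input type or an
# input declared in spec.inputs.
RESERVED_INPUTS = {"application", "infrastructure", "audit"}

def undefined_input_refs(spec):
    """Return the inputRefs that resolve to no declared or reserved input."""
    defined = {i["name"] for i in spec.get("inputs", [])} | RESERVED_INPUTS
    return sorted(
        ref
        for pipeline in spec.get("pipelines", [])
        for ref in pipeline.get("inputRefs", [])
        if ref not in defined
    )

# Condensed from the resource in this report: pipelines reference
# "toolchain-host-operator" and "registration-service", but spec.inputs
# only declares the four member/codeready inputs.
spec = {
    "inputs": [
        {"name": "toolchain-member-operator"},
        {"name": "toolchain-member-operator-webhook"},
        {"name": "codeready-workspaces-operator"},
        {"name": "codeready"},
    ],
    "pipelines": [
        {"inputRefs": ["toolchain-host-operator"]},
        {"inputRefs": ["registration-service"]},
        {"inputRefs": ["toolchain-member-operator"]},
        {"inputRefs": ["toolchain-member-operator-webhook"]},
        {"inputRefs": ["codeready-workspaces-operator"]},
        {"inputRefs": ["codeready"]},
    ],
}

print(undefined_input_refs(spec))
# prints ['registration-service', 'toolchain-host-operator']
```

If those two inputs really are missing rather than truncated, that would be a second problem independent of the VRL compile error above.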
Also, the list of collector pods reported in the status is not up to date with the actual cluster state.