OpenShift Logging / LOG-4275

[release-5.7] Vector pods going into a panic state


    • False
    • None
    • False
    • NEW
    • VERIFIED
    • Prior to this fix, the vector log collector occasionally panicked with the following error message in its log:
      thread 'vector-worker' panicked at 'all branches are disabled and there is no else branch', src/kubernetes/reflector.rs:26:9
      After this fix, the vector log collector no longer panics with this error.
    • Bug Fix
    • High
    • Log Collection - Sprint 238, Log Collection - Sprint 239
    • Important

      Description of problem:

      Intermittently, Vector pods running on the worker nodes panic and become unresponsive.
      After the pods are restarted manually, logs are processed successfully again.

      The Vector pod logs show the following:

      2023-06-26T03:14:23.503804380Z 2023-06-26T03:14:23.487362Z ERROR vector::topology::builder: msg="Healthcheck: Failed Reason." error=Failed to make HTTP(S) request: error trying to connect: dns error: failed to lookup address information: Name or service not known component_kind="sink" component_type="elasticsearch" component_id=external_elasticsearch_ecp component_name=external_elasticsearch_ecp
      2023-06-26T03:14:28.566100553Z 2023-06-26T03:14:28.566022Z ERROR kube_client::client::builder: failed with error error trying to connect: dns error: failed to lookup address information: Name or service not known
      2023-06-26T03:14:28.566100553Z thread 'vector-worker' panicked at 'all branches are disabled and there is no else branch', src/kubernetes/reflector.rs:26:9
      2023-06-26T03:14:28.566152028Z note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
      2023-06-26T03:14:33.615156912Z 2023-06-26T03:14:33.615108Z ERROR kube_client::client::builder: failed with error error trying to connect: dns error: failed to lookup address information: Name or service not known
      2023-06-26T03:14:33.615156912Z thread 'vector-worker' panicked at 'all branches are disabled and there is no else branch', src/kubernetes/reflector.rs:26:9
      2023-06-26T03:14:33.844423262Z 2023-06-26T03:14:33.844348Z ERROR sink{component_kind="sink" component_id=default component_type=elasticsearch component_name=default}: vector::internal_events::http_client: HTTP error. error=error trying to connect: dns error: failed to lookup address information: Name or service not known error_type="request_failed" stage="processing"
      2023-06-26T03:14:33.844477233Z 2023-06-26T03:14:33.844433Z  WARN sink{component_kind="sink" component_id=default component_type=elasticsearch component_name=default}: vector::sinks::util::retries: Retrying after error. error=Failed to make HTTP(S) request: error trying to connect: dns error: failed to lookup address information: Name or service not known
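
To check how often a collector pod hit this panic, the excerpt above can be scanned for the panic signature and for the DNS lookup failure that precedes it. The following is a minimal sketch, not part of the original report: the function name `summarize_collector_log` and the `SAMPLE` list are illustrative, and the two sample lines are copied from the excerpt.

```python
import re

# Signature of the panic quoted in this report.
PANIC_RE = re.compile(
    r"thread 'vector-worker' panicked at "
    r"'all branches are disabled and there is no else branch'"
)
# DNS failure that appears right before each panic in the excerpt.
DNS_RE = re.compile(r"dns error: failed to lookup address information")


def summarize_collector_log(lines):
    """Count panic lines and DNS lookup failures in an iterable of log lines."""
    panics = 0
    dns_errors = 0
    for line in lines:
        if PANIC_RE.search(line):
            panics += 1
        if DNS_RE.search(line):
            dns_errors += 1
    return {"panics": panics, "dns_errors": dns_errors}


# Two lines taken verbatim from the log excerpt in this report.
SAMPLE = [
    "2023-06-26T03:14:28.566100553Z 2023-06-26T03:14:28.566022Z ERROR "
    "kube_client::client::builder: failed with error error trying to connect: "
    "dns error: failed to lookup address information: Name or service not known",
    "2023-06-26T03:14:28.566100553Z thread 'vector-worker' panicked at "
    "'all branches are disabled and there is no else branch', "
    "src/kubernetes/reflector.rs:26:9",
]

if __name__ == "__main__":
    print(summarize_collector_log(SAMPLE))
```

In practice the lines would come from a saved pod log rather than from `SAMPLE`. A nonzero `panics` count alongside DNS errors matches the pattern described here, but it does not by itself prove the same root cause.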
      

      Version-Release number of selected component (if applicable):

      RHOL 5.6.7
      RHOL 5.7.2

      How reproducible:

      -

      Actual results:

      Vector pods panic and logs are lost.

      Expected results:

      Vector pods work properly.

      Additional info:

      After some investigation, I found this upstream bug --> https://github.com/vectordotdev/vector/issues/12245

      It was fixed in Vector release 0.21.0 --> https://vector.dev/releases/0.21.0/#known-issues

      If I am not mistaken, RHOL 5.6 and 5.7 use the v0.20.1 release --> https://github.com/ViaQ/vector/tree/release-5.6 and https://github.com/ViaQ/vector/tree/release-5.7

      For the upcoming RHOL 5.8, the Vector release appears to be v0.28.1 --> https://github.com/ViaQ/vector/tree/release-5.8
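
The version reasoning above can be expressed as a small comparison. This is a sketch only, assuming the reporter's reading is correct that the fix landed in Vector 0.21.0. The helper names `parse_version` and `predates_fix` are illustrative, and the version strings are the ones quoted in this report.

```python
def parse_version(text):
    """Turn 'v0.20.1' or '0.21.0' into a tuple of ints, e.g. (0, 20, 1)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


# Assumption taken from this report: the upstream fix shipped in 0.21.0.
FIXED_IN = parse_version("0.21.0")


def predates_fix(vector_version):
    """Return True if the given Vector version is older than the fixed release."""
    return parse_version(vector_version) < FIXED_IN


if __name__ == "__main__":
    # Versions as stated by the reporter for each RHOL release.
    for release, version in [("RHOL 5.6 / 5.7", "v0.20.1"), ("RHOL 5.8", "v0.28.1")]:
        status = "predates the fix" if predates_fix(version) else "includes the fix"
        print(f"{release}: Vector {version} {status}")
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "0.9.0" sorting after "0.21.0".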

            syedriko_sub@redhat.com Sergey Yedrikov
            acandelp Adrian Candel
            Qiaoling Tang Qiaoling Tang
            Votes: 2
            Watchers: 7