  1. OpenShift Logging
  2. LOG-2665

[Logging 5.5] Sometimes collector fails to push logs to Elasticsearch cluster


    • False
    • None
    • False
    • NEW
    • VERIFIED

      Description of problem:
      The collector intermittently fails to deliver logs to Elasticsearch and logs the following warning:

      2022-05-31 06:49:49 +0000 [warn]: [default] failed to flush the buffer. retry_times=0 next_retry_time=2022-05-31 06:49:51 +0000 chunk="5e0492072bc0d961b9065db156fba970" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"elasticsearch\", :port=>9200, :scheme=>\"https\"}): Couldn't connect to server"
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/out_elasticsearch.rb:1139:in `rescue in send_bulk'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/out_elasticsearch.rb:1101:in `send_bulk'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/out_elasticsearch.rb:879:in `block in write'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/out_elasticsearch.rb:878:in `each'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/out_elasticsearch.rb:878:in `write'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin/output.rb:1179:in `try_flush'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin/output.rb:1495:in `flush_thread_run'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin/output.rb:499:in `block (2 levels) in start'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
      2022-05-31 06:49:49 +0000 [warn]: [default] failed to flush the buffer. retry_times=0 next_retry_time=2022-05-31 06:49:50 +0000 chunk="5e049206c09f2d256117b00ab622815b" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"elasticsearch\", :port=>9200, :scheme=>\"https\"}): Couldn't connect to server"
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/out_elasticsearch.rb:1139:in `rescue in send_bulk'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/out_elasticsearch.rb:1101:in `send_bulk'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/out_elasticsearch.rb:879:in `block in write'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/out_elasticsearch.rb:878:in `each'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/out_elasticsearch.rb:878:in `write'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin/output.rb:1179:in `try_flush'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin/output.rb:1495:in `flush_thread_run'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin/output.rb:499:in `block (2 levels) in start'
        2022-05-31 06:49:49 +0000 [warn]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
      

      Steps to reproduce:

      1. Deploy Logging 5.5.
      2. Create a ClusterLogging (CL) instance using https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/logging/clusterlogging/storageclass_name.yaml
      3. Check the collector pod logs.
      oc get pods
      NAME                                            READY   STATUS    RESTARTS   AGE
      cluster-logging-operator-58566b6657-zr74h       1/1     Running   0          75m
      collector-2kcjm                                 2/2     Running   0          10m
      collector-gpxrc                                 2/2     Running   0          9m58s
      collector-lqj5t                                 2/2     Running   0          10m
      collector-rncmm                                 2/2     Running   0          9m44s
      collector-sl49p                                 2/2     Running   0          10m
      collector-tjfr4                                 2/2     Running   0          9m29s
      elasticsearch-cdm-epozn97p-1-558c9986cf-znd4z   2/2     Running   0          11m
      elasticsearch-cdm-epozn97p-2-bfcfb5774-t8zpp    2/2     Running   0          11m
      elasticsearch-cdm-epozn97p-3-7c47bb48f-rckrc    2/2     Running   0          10m
      kibana-5f7945666f-ks477                         2/2     Running   0          2m22s
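
      To make step 3 easier to repeat across the collector pods, the check can be scripted. The following is a minimal sketch, not part of the original report: the helper names count_flush_failures and failed_chunks are hypothetical, and the container name "collector" and namespace "openshift-logging" in the usage comment are assumptions about a default deployment. Both helpers only read log text from stdin.

```shell
#!/bin/sh
# Hypothetical triage helpers for collector (fluentd) log text read from stdin.

# Print how many "failed to flush the buffer" warnings appear in the input.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the helper
# safe to use under "set -e".
count_flush_failures() {
  grep -c 'failed to flush the buffer' || true
}

# Print the distinct chunk IDs that failed to flush, one per line, sorted.
failed_chunks() {
  grep 'failed to flush the buffer' \
    | sed -n 's/.*chunk="\([0-9a-f]*\)".*/\1/p' \
    | sort -u
}

# Example usage (assumed container and namespace names, adjust as needed):
#   oc logs collector-2kcjm -c collector -n openshift-logging | count_flush_failures
#   oc logs collector-2kcjm -c collector -n openshift-logging | failed_chunks
```

      A non-zero count on some pods but not others, or the same chunk ID failing repeatedly, helps distinguish a transient connection failure from a chunk that never gets delivered.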

      Expected:
      The collector should flush logs to Elasticsearch (ES) without errors.

      Actual:
      The collector stops sending logs to ES and shows the error message above.

      Attachments:
        1. CL instance (3 kB)
        2. collector logs (12 kB)

      Assignee: Unassigned
      Reporter: Giriyamma Karagere Ramaswamy (gkarager) (Inactive)