OpenShift Request For Enhancement / RFE-5683

[CEE.neXT]: Provide a more descriptive error when etcd static pod is unable to connect to bootstrap server.


    • Feature Request
    • Resolution: Done
    • Undefined
    • None
    • openshift-4.13.z
    • etcd
    • False
    • None
    • False
    • Not Selected

      1. Proposed title of this feature request.

      Enhance the etcd error logs for UPI installs in which the etcd static pods are unable to connect to the bootstrap node.

       

      2. What is the nature and description of the request?

      When customers deploy a UPI cluster, they normally have to create the access rules between nodes manually (the required rules are listed in our documentation). If the network requirements for etcd are not in place (the master nodes are unable to reach port tcp/2379 on the bootstrap node), the cluster installation fails and the etcd static pods stay in the CrashLoopBackOff state. The error in the logs does not give enough information about why the pod is crashing:

       

       

      {"level":"warn","ts":"2024-06-28T12:16:50.859825Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCDCTL_ENDPOINTS="}
      {"level":"warn","ts":"2024-06-28T12:16:55.88266Z","logger":"etcd-client","caller":"v3@v3.5.9/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0001ba000/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused\""}
      Error: context deadline exceeded
      failed to create etcd client: context deadline exceeded  
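
To make the gap concrete, here is a minimal Python sketch that pulls the dial target out of log lines shaped like the ones above. It assumes the JSON layout shown in this report (an `error` field containing a `dial tcp <host:port>: ...` message); the helper name `dial_targets` is hypothetical and not part of etcd or OpenShift.

```python
import json
import re

# Matches the "dial tcp <host:port>: " fragment seen in the error field above.
DIAL_TARGET = re.compile(r"dial tcp (\S+?): ")


def dial_targets(log_text):
    """Return the distinct host:port targets named in 'dial tcp' errors."""
    targets = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip plain-text lines such as "Error: context deadline exceeded"
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that only look like JSON
        match = DIAL_TARGET.search(str(entry.get("error", "")))
        if match and match.group(1) not in targets:
            targets.append(match.group(1))
    return targets
```

Run against the output above, the only address this recovers is the loopback target `127.0.0.1:2379`; nothing in the log identifies the bootstrap node that the firewall rules need to allow.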

      It would greatly help customers if the log listed the IP address that etcd cannot reach, so they could double-check that all firewall rules and security group settings allow the required traffic.
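
Until the logs carry that information, a customer could run a manual check along these lines from a master node. This is only a sketch of a plain TCP reachability probe using the Python standard library, not something the installer or etcd provides; the function name `unreachable_endpoints` is hypothetical.

```python
import socket


def unreachable_endpoints(endpoints, timeout=3.0):
    """Return (host, port, reason) for every endpoint that cannot be reached over TCP."""
    failures = []
    for host, port in endpoints:
        try:
            # Open and immediately close a TCP connection to the endpoint.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError as exc:
            # Covers refused connections, timeouts and name resolution errors.
            failures.append((host, port, str(exc)))
    return failures
```

For example, `unreachable_endpoints([("192.0.2.10", 2379)])` would report whether port tcp/2379 is blocked, where `192.0.2.10` is a placeholder to be replaced with the real bootstrap node address. The returned host, port and reason are the kind of detail this request asks the etcd error to include.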

       

      3. Why does the customer need this? (List the business requirements here)

      To correctly identify the source of the problem when they are unable to deploy a UPI cluster.

      4. List any affected packages or components.

      • etcd

              racedoro@redhat.com Ramon Acedo
              rhn-gps-alfredo Alfredo Pizarro
