OpenShift Request For Enhancement / RFE-4604

Alerting when Cluster Autoscaler cannot scale

    • x86_64
    • Red Hat OpenShift Container Platform, Red Hat OpenShift Service on Amazon

      1. Proposed title of this feature request

      Alerting when Cluster Autoscaler cannot scale

      2. What is the nature and description of the request?

      In Support Case 03596207 the customer observed an issue with the (AWS) API used to create new machines. As a result, the Machine API / Cluster Autoscaler was not able to provision new machines, leading to Pending Pods. Specifically, the provisioning failed with the following AWS-specific issue:

      I0824 20:32:55.664808       1 machine_scope.go:167] example-worker-eu-central-1c-xxxx: Updating status
      I0824 20:32:55.664814       1 machine_scope.go:193] example-worker-eu-central-1c-xxxx: finished calculating AWS status
      I0824 20:32:55.664823       1 machine_scope.go:90] example-worker-eu-central-1c-xxxx: patching machine
      E0824 20:32:55.675937       1 actuator.go:72] example-worker-eu-central-1c-xxxx error: example-worker-eu-central-1c-xxxx: reconciler failed to Create machine: failed to launch instance: error launching instance: In order to use this AWS Marketplace product you need to accept terms and subscribe. To do so please visit https://aws.amazon.com/marketplace/pp?sku=xxxxxxxxxx
      W0824 20:32:55.675979       1 controller.go:382] example-worker-eu-central-1c-xxxx: failed to create machine: example-worker-eu-central-1c-xxxx: reconciler failed to Create machine: failed to launch instance: error launching instance: In order to use this AWS Marketplace product you need to accept terms and subscribe. To do so please visit https://aws.amazon.com/marketplace/pp?sku=xxxxxxxxxx
      I0824 20:32:55.675988       1 controller.go:422] Actuator returned invalid configuration error: error launching instance: In order to use this AWS Marketplace product you need to accept terms and subscribe. To do so please visit https://aws.amazon.com/marketplace/pp?sku=xxxxxxxxxx
      I0824 20:32:55.676078       1 recorder.go:103] events "msg"="example-worker-eu-central-1c-xxxx: reconciler failed to Create machine: failed to launch instance: error launching instance: In order to use this AWS Marketplace product you need to accept terms and subscribe. To do so please visit https://aws.amazon.com/marketplace/pp?sku=xxxxxxxxxx" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"example-worker-eu-central-1c-xxxx","uid":"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"00000000"} "reason"="FailedCreate" "type"="Warning"

      The customer noticed the issue only when an end user reported the Pending Pods. This request asks for cluster alerts that fire when the Machine API or Cluster Autoscaler cannot provision new machines.
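
      To illustrate the signal such an alert would need to catch, here is a minimal sketch in Python that scans machine controller log lines in the format quoted in this request and reports the warning and error lines for failed machine creation. The function name `find_create_failures` and the marker phrase are assumptions made for this example; they are not part of the Machine API or the Cluster Autoscaler.

```python
import re

# klog-style header as seen in the excerpt: severity letter, MMDD, time,
# process ID, source file:line, then "] message".
KLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"\d+ (?P<source>[^\]]+)\] (?P<message>.*)$"
)

# Phrase taken from the log excerpt in this request (matched case-insensitively).
# Assumption: other failure modes may be worded differently.
FAILURE_MARKER = "failed to create machine"


def find_create_failures(lines):
    """Return (time, machine, message) for W/E lines reporting a failed create."""
    failures = []
    for line in lines:
        match = KLOG_LINE.match(line.strip())
        if match is None:
            continue  # not a klog-formatted line
        if match["severity"] not in ("W", "E"):
            continue  # informational lines repeat the error; skip them
        message = match["message"]
        if FAILURE_MARKER not in message.lower():
            continue
        # In the excerpt the machine name is the text before the first colon,
        # optionally followed by the word "error".
        machine = message.split(":", 1)[0].removesuffix(" error")
        failures.append((match["time"], machine, message))
    return failures
```

      This only shows which log lines carry the failure. An in-cluster alert would more likely be built on metrics or events, such as the `FailedCreate` warning event shown in the excerpt, than on log scraping.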

      3. Why does the customer need this? (List the business requirements here)

      Alerting would improve cluster availability and allow a quicker reaction when there is an issue with the API used by the Machine API.

      4. List any affected packages or components.

      • Machine API or Cluster Autoscaler

            rh-ee-smodeel Subin MM
            rhn-support-skrenger Simon Krenger