- Feature Request
- Resolution: Unresolved
- Normal
- x86_64
- Red Hat OpenShift Container Platform, Red Hat OpenShift Service on Amazon
1. Proposed title of this feature request
Alerting when Cluster Autoscaler cannot scale
2. What is the nature and description of the request?
In Support Case 03596207, the customer observed an issue with the (AWS) API used to create new machines. As a result, the Machine API / Cluster Autoscaler could not provision new machines, which left Pods in the Pending state. Specifically, provisioning failed with the following AWS-specific error:
I0824 20:32:55.664808 1 machine_scope.go:167] example-worker-eu-central-1c-xxxx: Updating status
I0824 20:32:55.664814 1 machine_scope.go:193] example-worker-eu-central-1c-xxxx: finished calculating AWS status
I0824 20:32:55.664823 1 machine_scope.go:90] example-worker-eu-central-1c-xxxx: patching machine
E0824 20:32:55.675937 1 actuator.go:72] example-worker-eu-central-1c-xxxx error: example-worker-eu-central-1c-xxxx: reconciler failed to Create machine: failed to launch instance: error launching instance: In order to use this AWS Marketplace product you need to accept terms and subscribe. To do so please visit https://aws.amazon.com/marketplace/pp?sku=xxxxxxxxxx
W0824 20:32:55.675979 1 controller.go:382] example-worker-eu-central-1c-xxxx: failed to create machine: example-worker-eu-central-1c-xxxx: reconciler failed to Create machine: failed to launch instance: error launching instance: In order to use this AWS Marketplace product you need to accept terms and subscribe. To do so please visit https://aws.amazon.com/marketplace/pp?sku=xxxxxxxxxx
I0824 20:32:55.675988 1 controller.go:422] Actuator returned invalid configuration error: error launching instance: In order to use this AWS Marketplace product you need to accept terms and subscribe. To do so please visit https://aws.amazon.com/marketplace/pp?sku=xxxxxxxxxx
I0824 20:32:55.676078 1 recorder.go:103] events "msg"="example-worker-eu-central-1c-xxxx: reconciler failed to Create machine: failed to launch instance: error launching instance: In order to use this AWS Marketplace product you need to accept terms and subscribe. To do so please visit https://aws.amazon.com/marketplace/pp?sku=xxxxxxxxxx" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"example-worker-eu-central-1c-xxxx","uid":"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"00000000"} "reason"="FailedCreate" "type"="Warning"
The customer noticed the issue only after an end user reported the Pending Pods. This request asks for cluster alerts that fire when machine provisioning fails in this way.
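To make the requested alert condition concrete, here is a minimal sketch in Python of the kind of check such an alert could encode. It is an illustration only, not how the Machine API or Cluster Autoscaler implement anything. It assumes Machine objects are available as plain dictionaries shaped like machine.openshift.io/v1beta1 resources, and that status.phase carries values such as "Provisioning", "Running" or "Failed". The function name machines_needing_alert and the 15-minute threshold are hypothetical choices made for this example. The case does not state which phase the affected machines ended up in, so the sketch covers both a Failed machine and one that never leaves Provisioning.

```python
from datetime import datetime, timedelta, timezone

# Assumption: phases in which a Machine has not yet become a usable node.
# An empty phase is treated the same way as "Provisioning".
STUCK_PHASES = {"", "Provisioning"}


def machines_needing_alert(machines, now, max_age=timedelta(minutes=15)):
    """Return the names of Machines that look like failed scale-ups.

    A Machine is flagged when it is in phase "Failed", or when it has
    stayed in a pre-node phase for longer than max_age (hypothetical
    threshold, measured from metadata.creationTimestamp).
    """
    flagged = []
    for machine in machines:
        meta = machine["metadata"]
        phase = machine.get("status", {}).get("phase", "")
        # Kubernetes timestamps end in "Z"; make them parseable as UTC.
        created = datetime.fromisoformat(
            meta["creationTimestamp"].replace("Z", "+00:00")
        )
        if phase == "Failed" or (phase in STUCK_PHASES and now - created > max_age):
            flagged.append(meta["name"])
    return flagged


# Usage with made-up sample data (names and times are illustrative only).
sample = [
    {"metadata": {"name": "worker-a", "creationTimestamp": "2023-08-24T20:32:55Z"},
     "status": {"phase": "Failed"}},
    {"metadata": {"name": "worker-b", "creationTimestamp": "2023-08-24T19:00:00Z"},
     "status": {"phase": "Running"}},
]
check_time = datetime(2023, 8, 24, 21, 0, tzinfo=timezone.utc)
print(machines_needing_alert(sample, check_time))
```

In a real cluster the same condition would more likely be expressed as an alerting rule over metrics than as a script. The sketch only shows which signals (phase and age of the Machine) the alert would need.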
3. Why does the customer need this? (List the business requirements here)
Alerting would improve cluster availability and allow a quicker reaction when there is an issue with the API used by the Machine API.
4. List any affected packages or components.
- Machine API or Cluster Autoscaler