Type: Epic
Resolution: Duplicate
Priority: Major
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
None

Epic Name:
Cluster API Upstream bubble up conditions and propagate labels
Blocked:
False
Blocked Reason:
None
Ready:
False
Color Status:
Not Selected
Epic Status:
To Do
Target Version:

openshift-4.13

Cost of Delay:
0
WSJF:
0
Risk Score:
0


Some work is needed in upstream Cluster API that is critical for supporting the desired NodePool behaviour and value.

1 - Bubble up conditions from infraMachine resources.

This is critical to signal failure scenarios to consumers.

kubernetes-sigs/cluster-api#6218
kubernetes-sigs/cluster-api#6025

We are currently working around this with https://github.com/openshift/hypershift/pull/1907.

This is a suboptimal implementation for multiple reasons. It does not scale well for huge NodePools, and it misses meaningful messages from the infraMachine.

For example, the infraMachine shows:

```
- lastTransitionTime: "2022-11-30T13:05:21Z"
  message: "failed to create AWSMachine instance: failed to run instance: InvalidParameterValue:
    Value (20756lq9aha19chgodjd8krdtr49q8o6-worker-profile) for parameter iamInstanceProfile.name
    is invalid. Invalid IAM Instance Profile name\n\tstatus code: 400, request id:
    cda1113b-2f2d-4c33-a46e-961401ec03c7"
  reason: InstanceProvisionFailed
```

But the workaround can only consume conditions from Machines, which show:

```
- lastTransitionTime: "2022-11-30T12:46:36Z"
  message: 0 of 2 completed
  reason: InstanceProvisionFailed
  severity: Error
  status: "False"
  type: InfrastructureReady
```
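To illustrate what bubbling up could look like, here is a minimal sketch in Python that folds the failing conditions of many infraMachine-like objects into one pool-level condition while keeping the original messages. It is not the HyperShift workaround or the upstream Cluster API design. The function name, the `InfraMachinesReady` condition type, the `AsExpected` reason, and the per-reason message cap are all hypothetical, and the sketch assumes each condition carries `status`, `reason`, and `message` fields.

```python
from collections import defaultdict

# Hypothetical cap so the summary stays small for very large pools.
MAX_MESSAGES_PER_REASON = 3


def bubble_up_failures(infra_machines):
    """Fold failing infraMachine conditions into one summary condition.

    Each item is assumed to be a dict shaped like a Kubernetes object with
    metadata.name and status.conditions. Any condition whose status is
    "False" is treated as a failure (an assumption made for this sketch).
    """
    messages_by_reason = defaultdict(list)
    for machine in infra_machines:
        name = machine.get("metadata", {}).get("name", "<unknown>")
        for cond in machine.get("status", {}).get("conditions", []):
            if cond.get("status") != "False":
                continue
            reason = cond.get("reason", "Unknown")
            messages_by_reason[reason].append(f"{name}: {cond.get('message', '')}")

    if not messages_by_reason:
        return {"type": "InfraMachinesReady", "status": "True",
                "reason": "AsExpected", "message": ""}

    parts = []
    for reason in sorted(messages_by_reason):
        msgs = messages_by_reason[reason]
        shown = msgs[:MAX_MESSAGES_PER_REASON]
        text = f"{reason} x{len(msgs)}: " + "; ".join(shown)
        omitted = len(msgs) - len(shown)
        if omitted:
            text += f"; and {omitted} more"
        parts.append(text)

    return {
        "type": "InfraMachinesReady",  # hypothetical condition type
        "status": "False",
        "reason": ",".join(sorted(messages_by_reason)),
        "message": " | ".join(parts),
    }
```

Grouping by reason and capping the number of messages per reason is one possible way to address the scaling concern above: the summary grows with the number of distinct failure reasons rather than with the size of the pool.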

2 - Propagate labels

This is critical for UX: it lets users tag a pool of Nodes and keep the tag after replace upgrades. A common use case is to tag a pool as "infra", which directly affects cost.

https://github.com/kubernetes-sigs/cluster-api/pull/7173
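As a rough illustration of the desired behaviour, the sketch below (Python, with plain dicts standing in for Node labels) applies a pool's labels to a Node and tracks which keys the pool manages, so that a replacement Node ends up with the same labels and labels removed from the pool are cleaned up. This is not the logic of the linked pull request. The function name and the idea of tracking managed keys are assumptions made for illustration, and `node-role.kubernetes.io/infra` is used only as an example label.

```python
def sync_node_labels(desired, node_labels, previously_managed=frozenset()):
    """Return (new_labels, managed_keys) for one Node.

    desired: labels declared on the pool (hypothetical input).
    node_labels: labels currently on the Node.
    previously_managed: keys this function set on an earlier sync.
    """
    # Keep labels owned by others; drop managed keys the pool no longer wants.
    result = {
        key: value
        for key, value in node_labels.items()
        if key not in previously_managed or key in desired
    }
    # Pool labels win over whatever value is currently on the Node.
    result.update(desired)
    return result, frozenset(desired)


pool_labels = {"node-role.kubernetes.io/infra": ""}

# First Node in the pool.
old_node, managed = sync_node_labels(pool_labels, {"kubernetes.io/hostname": "a"})

# A replacement Node created during an upgrade gets the same pool labels.
new_node, managed = sync_node_labels(pool_labels, {"kubernetes.io/hostname": "b"})
```

Because the pool is the source of truth, nothing needs to be copied from the old Node to the new one; re-running the same sync against the replacement is enough.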

duplicates

HOSTEDCP-977 Enable CAPI to bubble up clear machine failure conditions

Closed

Assignee: Unassigned

Reporter: Alberto Garcia Lamela

QA Contact: Jie Zhao

Votes: 0

Watchers: 1

Created: 2022/11/30 1:16 PM

Updated: 2023/05/04 9:21 AM

Resolved: 2023/05/04 9:21 AM
