Type: Bug
Resolution: Duplicate
Priority: Critical
Fix Version/s: None
Affects Version/s: 4.14
Component/s: Cloud Compute / Unknown
Labels:

Severity:
Moderate
Regression:
No
Sprint:
CLOUD Sprint 249, CLOUD Sprint 250, CLOUD Sprint 251, CLOUD Sprint 252, CLOUD Sprint 253, CLOUD Sprint 254, CLOUD Sprint 255, CLOUD Sprint 256, CLOUD Sprint 257, CLOUD Sprint 258, CLOUD Sprint 259, CLOUD Sprint 260, CLOUD Sprint 261, CLOUD Sprint 263, CLOUD Sprint 264, CLOUD Sprint 262, CLOUD Sprint 265, CLOUD Sprint 266
sprint_count:
18
Release Blocker:
Rejected
Blocked:
False
Blocked Reason:

None
RH Private Keywords:
Target Backport Versions:

4.14.z, 4.15.z

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:
PX Priority Data:

Description of problem:

The Machine API does not finish reconciling a new machine after a timeout occurs.

Version-Release number of selected component (if applicable):

4.14

How reproducible:

The timing needed to trigger the issue is difficult to hit, but it may be possible to force it with fault injection in a unit test.

Steps to Reproduce:

1. Scale machines up beyond the number currently available.
2. Have kube-apiserver / etcd time out while MAPI attempts to update the machine info during the transition to Provisioned.

Actual results:

More than 40 minutes after the issue occurs, the machine still has not moved to Provisioned, even though the machine's VM has been created.

Expected results:

The machine moves to the Provisioned state after cloning completes.

Additional info:

In most cases I would agree that the infrastructure should be improved to prevent this scenario; however, load on the CI infrastructure is going to be high at times, and if we cannot recover from timeouts when attempting to progress to Provisioned, we'll have many unnecessary CI failures.

This issue is not marked as high severity, but it would be great if we could improve the vSphere machine provisioning process so that it recovers from this scenario and eventually marks the machine as Provisioned, allowing the CI tests to complete.

depends on

OCPBUGS-48105 [vsphere] Machine stuck in Provisioning status when machine is power off

Closed

OCPBUGS-48245 [vsphere] Machine stuck in Provisioning status when machine is power off

Closed

impacts account

OCPBUGS-48245 [vsphere] Machine stuck in Provisioning status when machine is power off

Closed

Assignee: Nolan Brubaker

Reporter: Neil Girard

QA Contact: Huali Liu

Votes: 2

Watchers: 15

Created: 2023/07/12 3:56 PM

Updated: 2025/02/08 6:20 AM

Resolved: 2025/02/08 6:20 AM
