-
Bug
-
Resolution: Unresolved
-
Blocker
-
None
-
None
-
False
-
None
-
False
-
-
-
Critical
-
No
We have seen two cases so far where a hosted cluster's cluster-api-cert certificate gets stuck in a Ready: False state with this error in the cert-manager logs:
E0518 12:55:37.906768 1 sync.go:545] cert-manager/orders "msg"="failed to finalize Order resource due to bad request, marking Order as failed" "error"="404 urn:ietf:params:acme:error:malformed: Certificate not found" "resource_kind"="Order" "resource_name"="cluster-api-cert-fbjsx-23424524" "resource_namespace"="ocm-staging-23pt21ffpvgu3hrl4q9fr0vhme8jdsoh" "resource_version"="v1"
and then a re-issue is triggered one hour later:
I0518 13:55:37.000634 1 trigger_controller.go:200] cert-manager/certificates-trigger "msg"="Certificate must be re-issued" "key"="ocm-staging-23pt21ffpvgu3hrl4q9fr0vhme8jdsoh/cluster-api-cert" "message"="Issuing certificate as Secret does not exist" "reason"="DoesNotExist"
Ideally it never fails, but if it does fail we should look into the possibility of retrying sooner as our SLO for install-time is 10 minutes total.
Ref: