Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.21
Component/s: Two Node Fencing
Labels:
None

Activity Type:
None
Blocked:
False
Blocked Reason:
None
Story Points:
0
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:

4.22
Release Blocker:
None
Sprint:
OCPEDGE Sprint 284, OCPEDGE Sprint 285
sprint_count:
2

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description:
The cluster-etcd-operator fails to create the tnf-after-setup-job when the cluster nodes have long hostnames (FQDNs). The operator uses the full node name as a label value in the Job's pod template, which violates the Kubernetes 63-character limit on label values.

During the installation of Two-Node Fencing (TNF) on OpenShift (observed in version 4.21, deployed via ZTP), the etcd ClusterOperator remains in the Progressing: True state indefinitely.

The output of oc describe co etcd shows: Message: tnf-after-setup-job-<hostname> Progressing: Job is running

However, the Job and its associated Pods never appear in the openshift-etcd namespace. Investigation of the cluster-etcd-operator logs reveals a JobCreateFailed warning because the generated label exceeds the maximum allowed length of 63 characters.

The operator logs show the following validation error:

I0205 13:20:14.344362 1 event.go:377] Event(...): type: 'Warning' reason: 'JobCreateFailed' 
Failed to create Job.batch/tnf-after-setup-job-worker-01.cnf77.se-lab.eng.rdu2.dc.redhat.com -n openshift-etcd: 
Job.batch "tnf-after-setup-job-worker-01.cnf77.se-lab.eng.rdu2.dc.redhat.com" is invalid: 
spec.template.labels: Invalid value: "tnf-after-setup-job-worker-01.cnf77.se-lab.eng.rdu2.dc.redhat.com": 
must be no more than 63 characters
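
The length check behind this error can be reproduced offline. The following is a minimal sketch that approximates the Kubernetes label-value rule using only the Go standard library; checkLabelValue and maxLabelValueLen are hypothetical names chosen for illustration, and this is not the operator's or the API server's actual validation code. The prefix tnf-after-setup-job- is 20 characters and the hostname from the log is 45 characters, so the combined value is 65 characters.

```go
package main

import (
	"fmt"
	"regexp"
)

// maxLabelValueLen is the Kubernetes limit on the length of a label value.
const maxLabelValueLen = 63

// labelValueRe approximates the label-value syntax: either empty, or
// alphanumeric at both ends with '-', '_' and '.' allowed in between.
var labelValueRe = regexp.MustCompile(`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`)

// checkLabelValue is a hypothetical helper that reports why v cannot be
// used as a label value, or nil if it can.
func checkLabelValue(v string) error {
	if len(v) > maxLabelValueLen {
		return fmt.Errorf("must be no more than %d characters (got %d)", maxLabelValueLen, len(v))
	}
	if !labelValueRe.MatchString(v) {
		return fmt.Errorf("invalid label value %q", v)
	}
	return nil
}

func main() {
	const prefix = "tnf-after-setup-job-"                   // 20 characters
	host := "worker-01.cnf77.se-lab.eng.rdu2.dc.redhat.com" // 45 characters
	value := prefix + host
	fmt.Println(len(value), checkLabelValue(value))
}
```

With this prefix, any hostname longer than 43 characters pushes the value past the limit.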

Expected Results:
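The operator should handle long hostnames in one of the following ways:

Truncating the hostname used in labels.
Using a hash of the hostname for the label value.
Ensuring the label value conforms to the label-value syntax defined in the official Kubernetes documentation (at most 63 characters, beginning and ending with an alphanumeric character, with only alphanumerics, '-', '_', and '.' in between).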

Steps to Reproduce:

1. Deploy an OpenShift cluster (version 4.21) with Two-Node Fencing enabled.
2. Use hostnames/FQDNs longer than 43 characters (the prefix tnf-after-setup-job- is 20 characters, so prefix + hostname then exceeds 63 characters).
3. Monitor the openshift-etcd-operator logs and the etcd ClusterOperator status.

Suggested Fix:
The logic in the cluster-etcd-operator that generates the Job manifest for TNF should use a helper function that sanitizes and truncates the values placed in spec.template.labels.

links to

openshift/cluster-etcd-operator#1554: OCPBUGS-76331: fix: truncate job names to respect k8s 63-char limit

Assignee:: Francesco Cappa

Reporter:: Alberto Losada

QA Contact:: Douglas Hensel

Need Info From:: None

Votes:: 0

Watchers:: 3

Created:: 2026/02/06 12:59 PM

Updated:: 2026/03/02 2:37 PM
