Bug
Resolution: Not a Bug
rhel-10.0.beta
Important
rhel-sst-high-availability
ssg_filesystems_storage_and_HA
I'm not sure that corosync is at fault here. However, the issue manifests itself in corosync, so I'm filing the bug against it.
What were you trying to do that didn't work?
When a cluster is created with short node names (i.e. not FQDNs), corosync fails to start with the default settings on RHEL 10. The same procedure produces a running cluster on RHEL 9.
Please provide the package NVR for which bug is seen:
corosync-3.1.8-4.el10.x86_64
How reproducible:
Consistently, with this specific configuration:
- fresh install of RHEL 10 beta
- each node has IPv4 addresses and link-local IPv6 addresses, no IPv6 global addresses
Steps to reproduce
- pcs cluster setup mycluster rh10-node1 rh10-node2 # use short node names
- pcs cluster start --all
Expected results
cluster starts
Actual results
Cluster doesn't start. Corosync logs an error:
parse error in config: Nodes for link 0 have different IP families (compared 192.168.122.14 with fe80::5054:ff:fe90:10a)
This is on node1: the IPv4 address belongs to node2, while the IPv6 address belongs to node1. On node2, it's the other way around: node1 with IPv4 and node2 with IPv6.
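To make the error condition concrete, here is a minimal shell sketch of the kind of check the message describes: every address on one link must belong to the same IP family. This is not corosync's actual code. The helper names `addr_family` and `check_link_families` are hypothetical, and the family test (a colon means IPv6) is a deliberate simplification.

```shell
# Hypothetical helper: classify an address string by IP family.
# Simplification: anything containing ':' is treated as IPv6.
addr_family() {
    case "$1" in
        *:*) echo ipv6 ;;
        *)   echo ipv4 ;;
    esac
}

# Hypothetical helper: compare every address with the first one and
# report the first pair whose families differ.
check_link_families() {
    first=$(addr_family "$1")
    for addr in "$@"; do
        fam=$(addr_family "$addr")
        if [ "$fam" != "$first" ]; then
            echo "mixed: $1 is $first, $addr is $fam"
            return 1
        fi
    done
    echo "ok: all $first"
}
```

Called with the two addresses from the log line above, `check_link_families 192.168.122.14 fe80::5054:ff:fe90:10a` reports a mixed link. On a real node, the arguments could come from something like `getent ahosts rh10-node1`, which shows what the system resolver returns for a short name.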
Additional info
- When corosync is configured with 'ip_version: ipv4', then the cluster starts. This is not needed on RHEL 9, though.
- When FQDNs are used, corosync starts just fine.
- Further digging revealed a change in /etc/nsswitch.conf:
  - RHEL 9 has this: 'hosts: files dns myhostname'
  - RHEL 10 has this: 'hosts: files myhostname dns'
  - I suspect this is the root cause of the issue. When nsswitch.conf is changed to 'hosts: files dns myhostname' on RHEL 10, the issue no longer occurs and the cluster starts just fine.
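To check quickly which ordering a node uses, a small shell function can report whether 'myhostname' comes before 'dns' on the 'hosts:' line. This is only a sketch: the function name `hosts_order` and its output strings are made up for illustration, and it reads only the first 'hosts:' line of the file it is given.

```shell
# Hypothetical helper: report the relative order of 'myhostname' and 'dns'
# on the first 'hosts:' line of an nsswitch.conf-style file.
hosts_order() {
    conf=${1:-/etc/nsswitch.conf}
    awk '$1 == "hosts:" {
        m = 0; d = 0
        for (i = 2; i <= NF; i++) {
            if ($i == "myhostname" && !m) m = i   # position of myhostname
            if ($i == "dns" && !d) d = i          # position of dns
        }
        if (m && d && m < d) print "myhostname-before-dns"
        else if (m && d)     print "dns-before-myhostname"
        else                 print "incomplete"
        exit
    }' "$conf"
}
```

Going by the lines quoted above, a default RHEL 10 node would print `myhostname-before-dns` and a RHEL 9 node `dns-before-myhostname`. If the suspicion is right, the first ordering lets the local short name resolve through 'myhostname' before DNS is consulted.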