-
Bug
-
Resolution: Not a Bug
-
Normal
-
None
-
4.16, 4.17, 4.18
-
Moderate
-
None
-
False
-
-
-
Description of problem:
A [HOLDOVER] state transition event is generated in cloud-event-proxy after Boundary Clock master interface goes down. This error occurs often during testing 4.16, 4.17 and 4.18, but it does not occur every time.
Test log:
06:52:14 should have no effect when Boundary Clock master interface goes down and up [49734, test_id:49734] 06:52:14 /var/lib/jenkins/workspace/ocp-far-edge-vran-tests/cnf-gotests/test/ran/ptp/tests/ptp_interfaces.go:115 06:52:14 > Enter [BeforeEach] PTP Events and Metrics - interface down - /var/lib/jenkins/workspace/ocp-far-edge-vran-tests/cnf-gotests/test/util/execute/ginkgo.go:10 @ 10/25/24 06:52:11.816 06:52:14 < Exit [BeforeEach] PTP Events and Metrics - interface down - /var/lib/jenkins/workspace/ocp-far-edge-vran-tests/cnf-gotests/test/util/execute/ginkgo.go:10 @ 10/25/24 06:52:11.816 (0s) 06:52:14 > Enter [BeforeEach] PTP Events and Metrics - interface down - /var/lib/jenkins/workspace/ocp-far-edge-vran-tests/cnf-gotests/test/ran/ptp/tests/ptp_interfaces.go:47 @ 10/25/24 06:52:11.816 06:52:17 2024/10/25 06:52:16 Reached PTP clock state 1 for all interfaces 06:52:17 < Exit [BeforeEach] PTP Events and Metrics - interface down - /var/lib/jenkins/workspace/ocp-far-edge-vran-tests/cnf-gotests/test/ran/ptp/tests/ptp_interfaces.go:47 @ 10/25/24 06:52:16.997 (5.181s) 06:52:17 > Enter [It] should have no effect when Boundary Clock master interface goes down and up - /var/lib/jenkins/workspace/ocp-far-edge-vran-tests/cnf-gotests/test/ran/ptp/tests/ptp_interfaces.go:115 @ 10/25/24 06:52:16.998 06:52:17 STEP: Bring down ptp Boundary Clock master interfaces [ens1f2 ens1f0 ens1f1] on node helix66.lab.eng.rdu2.redhat.com 06:52:17 - /var/lib/jenkins/workspace/ocp-far-edge-vran-tests/cnf-gotests/test/ran/ptp/tests/ptp_interfaces.go:141 @ 10/25/24 06:52:17.402 06:52:20 STEP: Validate PTP metric stays in locked after bringing down BC master interface - /var/lib/jenkins/workspace/ocp-far-edge-vran-tests/cnf-gotests/test/ran/ptp/tests/ptp_interfaces.go:149 @ 10/25/24 06:52:19.923 06:52:58 STEP: Validate NO [HOLDOVER] state transition event is generated after master interface ens1f2 goes down on node helix66.lab.eng.rdu2.redhat.com - /var/lib/jenkins/workspace/ocp-far-edge-vran-tests/cnf-gotests/test/ran/ptp/tests/ptp_interfaces.go:155 @ 10/25/24 06:52:55.2862024/10/25 06:52:55 Reached PTP clock state 1 for ens1fx for 30s seconds 06:52:58 06:52:58 2024/10/25 06:52:55 Info: event.sync.ptp-status.ptp-state-change HOLDOVER is found for resource(s): ens1fx
cloud-event-proxy log shows [HOLDOVER] state transitions for master:
06:52:58 2024/10/25 06:52:55 Logs from last 1m0s for pod linuxptp-daemon-5nlq8 container cloud-event-proxy: 06:52:58 time="2024-10-25T10:51:57Z" level=info msg="got status check call,send events for subscriber e1d3fbeb-16a8-4de3-9369-587b3e65f208 => /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/sync-status/os-clock-sync-state" 06:52:58 time="2024-10-25T10:51:57Z" level=info msg="got status check call,send events for subscriber c22f2c88-627d-4d0c-8908-4e2260e401ea => /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/ptp-status/clock-class" 06:52:58 time="2024-10-25T10:51:57Z" level=info msg="got status check call,send events for subscriber 36b5cab3-4c9a-49d0-b0aa-34003e8282b9 => /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/ptp-status/lock-state" 06:52:58 time="2024-10-25T10:51:57Z" level=info msg="got status check call,send events for subscriber 9adafd22-3518-44bc-8dff-b2a56d3135e1 => /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/gnss-status/gnss-sync-status" 06:52:58 time="2024-10-25T10:51:57Z" level=error msg="could not find any events for requested resource type /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/gnss-status/gnss-sync-status" 06:52:58 time="2024-10-25T10:51:57Z" level=info msg="got status check call,send events for subscriber 65b90f9e-9b4c-4a48-bc33-51a96c2cea4d => /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/sync-status/sync-state" 06:52:58 time="2024-10-25T10:51:57Z" level=warning msg="last sync state is unknown: " 06:52:58 time="2024-10-25T10:51:57Z" level=warning msg="last sync state is unknown: " 06:52:58 time="2024-10-25T10:51:57Z" level=warning msg="last sync state is unknown: " 06:52:58 time="2024-10-25T10:52:18Z" level=info msg="update interface ens1f0 with portid 4 from role MASTER to role FAULTY" 06:52:58 time="2024-10-25T10:52:18Z" level=info msg="update interface eno12409np1 with portid 1 from role SLAVE to role LISTENING" 06:52:58 time="2024-10-25T10:52:18Z" level=info msg="master process name ptp4l and masteroffsetsource ptp4l" 06:52:58 time="2024-10-25T10:52:18Z" level=debug msg="event source /sync/ptp-status/lock-state sent to queue to process" 06:52:58 time="2024-10-25T10:52:18Z" level=debug msg="event sent {\n \"id\": \"1cf9b8a9-ee2d-42fe-9f77-7f93499e1919\",\n \"type\": \"event.sync.ptp-status.ptp-state-change\",\n \"source\": \"/sync/ptp-status/lock-state\",\n \"dataContentType\": \"application/json\",\n \"time\": \"2024-10-25T10:52:18.588559179Z\",\n \"data\": {\n \"version\": \"1.0\",\n \"values\": [\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/eno12409npx/master\",\n \"data_type\": \"notification\",\n \"value_type\": \"enumeration\",\n \"value\": \"HOLDOVER\"\n },\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/eno12409npx/master\",\n \"data_type\": \"metric\",\n \"value_type\": \"decimal64.3\",\n \"value\": \"-3\"\n }\n ]\n }\n }" 06:52:58 time="2024-10-25T10:52:18Z" level=debug msg="event source /sync/sync-status/sync-state sent to queue to process" 06:52:58 time="2024-10-25T10:52:18Z" level=debug msg="event sent {\n \"id\": \"8552e27e-597a-46c0-a83b-4d284209dbd8\",\n \"type\": \"event.sync.sync-status.synchronization-state-change\",\n \"source\": \"/sync/sync-status/sync-state\",\n \"dataContentType\": \"application/json\",\n \"time\": \"2024-10-25T10:52:18.588746719Z\",\n \"data\": {\n \"version\": \"1.0\",\n \"values\": [\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/eno12409npx/master\",\n \"data_type\": \"notification\",\n \"value_type\": \"enumeration\",\n \"value\": \"HOLDOVER\"\n }\n ]\n }\n }" 06:52:58 time="2024-10-25T10:52:18Z" level=info msg="post events /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/ptp-status/lock-state to subscriber http://consumer-events-subscription-service.cloud-events.svc.cluster.local:9043/event" 06:52:58 time="2024-10-25T10:52:18Z" level=info msg="post events /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/sync-status/sync-state to subscriber http://consumer-events-subscription-service.cloud-events.svc.cluster.local:9043/event" 06:52:58 time="2024-10-25T10:52:18Z" level=error msg="error parsing port id strconv.Atoi: parsing \"507c6f.fffe.56b37f-4\": invalid syntax for output port 507c6f.fffe.56b37f-4 changed state" 06:52:58 time="2024-10-25T10:52:19Z" level=error msg="error parsing port id strconv.Atoi: parsing \"507c6f.fffe.56b37f-1\": invalid syntax for output port 507c6f.fffe.56b37f-1 changed state" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg=" publishing event for (profile bc1) ens1fx/master with last state LOCKED and current clock state FREERUN and offset -4 for ( Max/Min Threshold 100/-100 )" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event source /sync/ptp-status/lock-state sent to queue to process" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event sent {\n \"id\": \"1cf9b8a9-ee2d-42fe-9f77-7f93499e1919\",\n \"type\": \"event.sync.ptp-status.ptp-state-change\",\n \"source\": \"/sync/ptp-status/lock-state\",\n \"dataContentType\": \"application/json\",\n \"time\": \"2024-10-25T10:52:19.01447821Z\",\n \"data\": {\n \"version\": \"1.0\",\n \"values\": [\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/ens1fx/master\",\n \"data_type\": \"notification\",\n \"value_type\": \"enumeration\",\n \"value\": \"FREERUN\"\n },\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/ens1fx/master\",\n \"data_type\": \"metric\",\n \"value_type\": \"decimal64.3\",\n \"value\": \"-4\"\n }\n ]\n }\n }" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event source /sync/sync-status/sync-state sent to queue to process" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event sent {\n \"id\": \"8552e27e-597a-46c0-a83b-4d284209dbd8\",\n \"type\": \"event.sync.sync-status.synchronization-state-change\",\n \"source\": \"/sync/sync-status/sync-state\",\n \"dataContentType\": \"application/json\",\n \"time\": \"2024-10-25T10:52:19.014716329Z\",\n \"data\": {\n \"version\": \"1.0\",\n \"values\": [\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/ens1fx/master\",\n \"data_type\": \"notification\",\n \"value_type\": \"enumeration\",\n \"value\": \"FREERUN\"\n }\n ]\n }\n }" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="update interface ens1f3 with portid 1 from role SLAVE to role FAULTY" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="master process name ptp4l and masteroffsetsource ptp4l" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event source /sync/ptp-status/lock-state sent to queue to process" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event sent {\n \"id\": \"1cf9b8a9-ee2d-42fe-9f77-7f93499e1919\",\n \"type\": \"event.sync.ptp-status.ptp-state-change\",\n \"source\": \"/sync/ptp-status/lock-state\",\n \"dataContentType\": \"application/json\",\n \"time\": \"2024-10-25T10:52:19.014833825Z\",\n \"data\": {\n \"version\": \"1.0\",\n \"values\": [\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/ens1fx/master\",\n \"data_type\": \"notification\",\n \"value_type\": \"enumeration\",\n \"value\": \"HOLDOVER\"\n },\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/ens1fx/master\",\n \"data_type\": \"metric\",\n \"value_type\": \"decimal64.3\",\n \"value\": \"-4\"\n }\n ]\n }\n }" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event source /sync/sync-status/sync-state sent to queue to process" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event sent {\n \"id\": \"8552e27e-597a-46c0-a83b-4d284209dbd8\",\n \"type\": \"event.sync.sync-status.synchronization-state-change\",\n \"source\": \"/sync/sync-status/sync-state\",\n \"dataContentType\": \"application/json\",\n \"time\": \"2024-10-25T10:52:19.014907061Z\",\n \"data\": {\n \"version\": \"1.0\",\n \"values\": [\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/ens1fx/master\",\n \"data_type\": \"notification\",\n \"value_type\": \"enumeration\",\n \"value\": \"HOLDOVER\"\n }\n ]\n }\n }" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="update interface ens1f3 with portid 1 from role FAULTY to role MASTER" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="post events /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/ptp-status/lock-state to subscriber http://consumer-events-subscription-service.cloud-events.svc.cluster.local:9043/event" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="update interface ens1f3 with portid 1 from role MASTER to role FAULTY" 06:52:58 time="2024-10-25T10:52:19Z" level=error msg="error parsing port id strconv.Atoi: parsing \"507c6f.fffe.56b37f-1\": invalid syntax for output port 507c6f.fffe.56b37f-1 changed state" 06:52:58 time="2024-10-25T10:52:19Z" level=error msg="error parsing port id strconv.Atoi: parsing \"state\": invalid syntax for output port state change" 06:52:58 time="2024-10-25T10:52:19Z" level=error msg="error parsing port id strconv.Atoi: parsing \"507c6f.fffe.56b37f-1\": invalid syntax for output port 507c6f.fffe.56b37f-1 changed state" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="post events /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/sync-status/sync-state to subscriber http://consumer-events-subscription-service.cloud-events.svc.cluster.local:9043/event" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="post events /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/ptp-status/lock-state to subscriber http://consumer-events-subscription-service.cloud-events.svc.cluster.local:9043/event" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="post events /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/sync-status/sync-state to subscriber http://consumer-events-subscription-service.cloud-events.svc.cluster.local:9043/event" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="Interface ens1f3 is no longer faulty. The holdover state will remain active until the PTP sync state is detected as LOCKED or the holdover times out." 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="update interface ens1f3 with portid 1 from role FAULTY to role SLAVE" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="interface ens1fx/master is in LOCKED state, cancel any holdover states" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg=" publishing event for ( profile bc1) ens1fx/master with last state HOLDOVER and current clock state LOCKED and offset -5 for ( Max/Min Threshold 100/-100 )" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="call received to close holderover timeout" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="exiting holdover for interface master" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event source /sync/ptp-status/lock-state sent to queue to process" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="post events /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/ptp-status/lock-state to subscriber http://consumer-events-subscription-service.cloud-events.svc.cluster.local:9043/event" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event sent {\n \"id\": \"1cf9b8a9-ee2d-42fe-9f77-7f93499e1919\",\n \"type\": \"event.sync.ptp-status.ptp-state-change\",\n \"source\": \"/sync/ptp-status/lock-state\",\n \"dataContentType\": \"application/json\",\n \"time\": \"2024-10-25T10:52:19.183618661Z\",\n \"data\": {\n \"version\": \"1.0\",\n \"values\": [\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/ens1fx/master\",\n \"data_type\": \"notification\",\n \"value_type\": \"enumeration\",\n \"value\": \"LOCKED\"\n },\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/ens1fx/master\",\n \"data_type\": \"metric\",\n \"value_type\": \"decimal64.3\",\n \"value\": \"-5\"\n }\n ]\n }\n }" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event source /sync/sync-status/sync-state sent to queue to process" 06:52:58 time="2024-10-25T10:52:19Z" level=debug msg="event sent {\n \"id\": \"8552e27e-597a-46c0-a83b-4d284209dbd8\",\n \"type\": \"event.sync.sync-status.synchronization-state-change\",\n \"source\": \"/sync/sync-status/sync-state\",\n \"dataContentType\": \"application/json\",\n \"time\": \"2024-10-25T10:52:19.183812275Z\",\n \"data\": {\n \"version\": \"1.0\",\n \"values\": [\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/ens1fx/master\",\n \"data_type\": \"notification\",\n \"value_type\": \"enumeration\",\n \"value\": \"LOCKED\"\n }\n ]\n }\n }" 06:52:58 time="2024-10-25T10:52:19Z" level=info msg="post events /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/sync-status/sync-state to subscriber http://consumer-events-subscription-service.cloud-events.svc.cluster.local:9043/event" 06:52:58 time="2024-10-25T10:52:20Z" level=error msg="error parsing port id strconv.Atoi: parsing \"507c6f.fffe.56b37f-1\": invalid syntax for output port 507c6f.fffe.56b37f-1 changed state" 06:52:58 time="2024-10-25T10:52:20Z" level=error msg="error parsing port id strconv.Atoi: parsing \"state\": invalid syntax for output port state change" 06:52:58 time="2024-10-25T10:52:23Z" level=info msg="time expired for interface master" 06:52:58 time="2024-10-25T10:52:23Z" level=info msg="HOLDOVER timeout after 5 secs,setting clock state to FREERUN from HOLDOVER state for master" 06:52:58 time="2024-10-25T10:52:23Z" level=debug msg="event source /sync/ptp-status/lock-state sent to queue to process" 06:52:58 time="2024-10-25T10:52:23Z" level=debug msg="event sent {\n \"id\": \"1cf9b8a9-ee2d-42fe-9f77-7f93499e1919\",\n \"type\": \"event.sync.ptp-status.ptp-state-change\",\n \"source\": \"/sync/ptp-status/lock-state\",\n \"dataContentType\": \"application/json\",\n \"time\": \"2024-10-25T10:52:23.590151904Z\",\n \"data\": {\n \"version\": \"1.0\",\n \"values\": [\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/eno12409npx/master\",\n \"data_type\": \"notification\",\n \"value_type\": \"enumeration\",\n \"value\": \"FREERUN\"\n },\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/eno12409npx/master\",\n \"data_type\": \"metric\",\n \"value_type\": \"decimal64.3\",\n \"value\": \"-3\"\n }\n ]\n }\n }" 06:52:58 time="2024-10-25T10:52:23Z" level=debug msg="event source /sync/sync-status/sync-state sent to queue to process" 06:52:58 time="2024-10-25T10:52:23Z" level=debug msg="event sent {\n \"id\": \"8552e27e-597a-46c0-a83b-4d284209dbd8\",\n \"type\": \"event.sync.sync-status.synchronization-state-change\",\n \"source\": \"/sync/sync-status/sync-state\",\n \"dataContentType\": \"application/json\",\n \"time\": \"2024-10-25T10:52:23.590306126Z\",\n \"data\": {\n \"version\": \"1.0\",\n \"values\": [\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/eno12409npx/master\",\n \"data_type\": \"notification\",\n \"value_type\": \"enumeration\",\n \"value\": \"FREERUN\"\n }\n ]\n }\n }" 06:52:58 time="2024-10-25T10:52:23Z" level=info msg="phc2sys is not enabled for profile slave-bc2, skiping os clock syn state " 06:52:58 time="2024-10-25T10:52:23Z" level=info msg="exiting holdover for interface master" 06:52:58 time="2024-10-25T10:52:23Z" level=info msg="post events /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/ptp-status/lock-state to subscriber http://consumer-events-subscription-service.cloud-events.svc.cluster.local:9043/event" 06:52:58 time="2024-10-25T10:52:23Z" level=info msg="post events /cluster/node/helix66.lab.eng.rdu2.redhat.com/sync/sync-status/sync-state to subscriber http://consumer-events-subscription-service.cloud-events.svc.cluster.local:9043/event"
Version-Release number of selected component (if applicable):
4.16, 4.17, 4.18
How reproducible:
frequent, but not 100%
Steps to Reproduce:
1. Deploy SNO with OC/BC configuration 2. Bring down ptp Boundary Clock master interfaces 3. Observe linuxptp-daemon cloud-event-proxy for event.sync.sync-status.synchronization-state-change to HOLDOVER for master
Actual results:
level=debug msg="event sent {\n \"id\": \"8552e27e-597a-46c0-a83b-4d284209dbd8\",\n \"type\": \"event.sync.sync-status.synchronization-state-change\",\n \"source\": \"/sync/sync-status/sync-state\",\n \"dataContentType\": \"application/json\",\n \"time\": \"2024-10-25T10:52:19.014907061Z\",\n \"data\": {\n \"version\": \"1.0\",\n \"values\": [\n {\n \"ResourceAddress\": \"/cluster/node/helix66.lab.eng.rdu2.redhat.com/ens1fx/master\",\n \"data_type\": \"notification\",\n \"value_type\": \"enumeration\",\n \"value\": \"HOLDOVER\"\n }\n ]\n }\n }"
Expected results:
No holdover state change on master
Additional info:
- is duplicated by
-
OCPBUGS-45674 HOLDOVER and FREERUN events are received when BC master interface is taken down
-
- Closed
-