Type: Sub-task
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:

Blocked:
False
Blocked Reason:

None
Ready:
False
Acceptance Criteria:
None
BZ requires_doc_text:
Unset
BZ Keywords:
- Unset

When the AWS-hosted database goes down, status-board handles the outage gracefully, but the error messages it logs do not make the cause clear.

Here is a snippet of the log output we see during a DB outage:

I0616 19:09:18.017207       1 logger.go:100] [opid=2ybQjdY9i5WUw4FT1VXu09R6LZ9] {"response_status":500,"elapsed":"29.972646005s"}
I0616 19:09:18.018625       1 logger.go:100] [opid=2ybQnKhFT9ZLj0NPcfyZMxGXwuo] {"request_method":"POST","request_url":"/api/status-board/v1/alertmanager-receiver","request_remote_ip":"127.0.0.1:49688"}
E0616 19:09:18.042650       1 logger.go:121] [opid=2ybQjbNaqBUy3sOZ1JjtNMu2D2P] OCM-SB-9: Unable to find Service with fullname='OSDv4/rosa-hcp-fleet-wide/ROSAHCPNodepoolUpgradeSuccess': context canceled

One improvement would be to catch Postgres-related errors explicitly, so that the logged message tells a developer that the database connection failed rather than that a Service lookup failed.

This could be something like "DB connection error: context canceled".

Assignee: Matt Holder

Reporter: Matt Holder

Votes: 0

Watchers: 1

Created: 2025/06/17 2:09 PM

Updated: 2025/06/30 5:29 PM
