Type: Bug
Resolution: Unresolved
Priority: Critical
Fix Version/s: None
Affects Version/s: None
Component/s: Content
Labels:
None

AssignedTeam:
insights-content

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Blocker:
None

PX Impact Score:

Currently only two of the four worker pods are processing tasks. I'm not sure whether this is related to the change of Postgres down to 20 GB in production, but it seems to have started around the same time.

We saw this error:

46PM ERR Connect database=postgres err="failed to connect to `user=postgres database=postgres`: 10.0.217.176:5432 (content-sources-prod.ckxxru2ayexw.us-east-1.rds.amazonaws.com): dial error: timeout: context deadline exceeded" host=content-sources-prod.ckxxru2ayexw.us-east-1.rds.amazonaws.com module=pgx port=5432
1:46PM ERR Acquire err="failed to connect to `user=postgres database=postgres`: 10.0.217.176:5432 (content-sources-prod.ckxxru2ayexw.us-east-1.rds.amazonaws.com): dial error: timeout: context deadline exceeded" module=pgx
panic: error connecting to database: failed to connect to `user=postgres database=postgres`: 10.0.217.176:5432 (content-sources-prod.ckxxru2ayexw.us-east-1.rds.amazonaws.com): dial error: timeout: context deadline exceeded

goroutine 117 [running]:
github.com/content-services/content-sources-backend/pkg/tasks/queue.(*PgQueue).waitAndNotify(0xc0000acc80, {0x1e36848, 0xc00012e910})
	/go/src/app/pkg/tasks/queue/pgqueue.go:303 +0x2d3
github.com/content-services/content-sources-backend/pkg/tasks/queue.(*PgQueue).listen(0xc0000acc80, {0x1e36848, 0xc00012e910}, 0x0?)
	/go/src/app/pkg/tasks/queue/pgqueue.go:279 +0x45
created by github.com/content-services/content-sources-backend/pkg/tasks/queue.NewPgQueue in goroutine 1
	/go/src/app/pkg/tasks/queue/pgqueue.go:267 +0x225

I think this happened around the time the DB was changed.

The two pods are just showing:

7:41PM INF Query args=[] commandTag=BEGIN module=pgx pid=26646 sql=begin
7:41PM INF Query args=[] commandTag=LISTEN module=pgx pid=26636 sql="LISTEN tasks"
7:41PM INF Query args=[] commandTag=BEGIN module=pgx pid=26667 sql=begin
7:41PM INF Query args=[] commandTag=UNLISTEN module=pgx pid=26637 sql="UNLISTEN tasks"
7:41PM INF Query args=[] commandTag=BEGIN module=pgx pid=26637 sql=begin
7:41PM INF Query args=[] commandTag=UNLISTEN module=pgx pid=26636 sql="UNLISTEN tasks"
7:41PM INF Query args=[] commandTag=LISTEN module=pgx pid=26636 sql="LISTEN tasks"

I had app-sre restart both pods. They initially picked up work, but stopped again pretty quickly.

Assignee: Unassigned

Reporter: Justin Sherrill

Contributors: None

Votes: 0

Watchers: 1

Created: 2025/08/08 8:52 PM

Updated: 2025/08/08 8:52 PM
