Details
- Type: Bug
- Resolution: Unresolved
- Priority: Major
Description
Hi,
Since September 12 we have been detecting that our nightly runs, which use master components, sometimes fail to deploy 3scale correctly.
This has happened on:
September 12
September 14
September 15
What we observe is that some of the expected routes are not being created.
When listing the routes, these are the only ones we can see:
circleci@default-7963dcc9-865b-471c-8e8f-7d7bda4d8c69:~$ oc get routes
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
backend backend-3scale.lvh.me backend-listener http edge/Allow None
zync-3scale-api-kcmt4 api-3scale-apicast-production.lvh.me apicast-production gateway edge/Redirect None
zync-3scale-api-xvscq api-3scale-apicast-staging.lvh.me apicast-staging gateway edge/Redirect None
Notice that not all of them are there: the master route is missing, as are the admin and developer routes for the default tenant.
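To make the gap explicit, here is a minimal shell sketch that compares the route hosts present in the namespace against the ones we expect zync to create. The expected host prefixes (master, 3scale-admin, 3scale for the default tenant) are assumptions based on a typical 3scale deployment, and the actual list is hard-coded from the `oc get routes` output above; in a live cluster it would come from `oc get routes -o jsonpath='{.items[*].spec.host}'`:

```shell
#!/bin/sh
# Route host prefixes we expect for a default 3scale install (assumed names).
expected="backend api-3scale-apicast-production api-3scale-apicast-staging master 3scale-admin 3scale"

# Hosts actually present, copied from the listing above.
actual="backend-3scale.lvh.me api-3scale-apicast-production.lvh.me api-3scale-apicast-staging.lvh.me"

for prefix in $expected; do
  case " $actual " in
    *" $prefix"*) ;;  # a host starting with this prefix exists
    *) echo "missing route host: $prefix.<wildcard-domain>" ;;
  esac
done
```

Run against the output above, this reports the master, 3scale-admin, and 3scale hosts as missing, which matches what we observe.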
I set the components of this issue to zync and system because I understand those are the components that manage the routes. Feel free to update the components in Jira.
I can see that all the 3scale pods are present and running:
circleci@default-7963dcc9-865b-471c-8e8f-7d7bda4d8c69:~$ oc get pods
NAME READY STATUS RESTARTS AGE
apicast-production-1-7cv5n 1/1 Running 0 34m
apicast-staging-1-74fgm 1/1 Running 0 34m
backend-cron-1-665lq 1/1 Running 0 35m
backend-listener-1-pznls 1/1 Running 0 35m
backend-redis-1-wl5bx 1/1 Running 0 35m
backend-worker-1-7tls2 1/1 Running 0 35m
system-app-1-z59xx 3/3 Running 0 32m
system-memcache-1-hm9lv 1/1 Running 0 35m
system-mysql-1-rbpwz 1/1 Running 0 35m
system-redis-1-sq6d7 1/1 Running 0 35m
system-sidekiq-1-jnbj5 1/1 Running 0 34m
system-sphinx-1-bd77r 1/1 Running 0 35m
zync-1-q667l 1/1 Running 0 34m
zync-database-1-x2cd7 1/1 Running 0 34m
zync-que-1-rsntv 1/1 Running 1 34m
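Since zync is the component that pushes routes to OpenShift, a first place to look is the zync-que worker (note its 1 restart above) and its job queue. A sketch of the commands we would use, with the pod names taken from the listing above; the psql user, the que_jobs columns, and the resync rake task name are assumptions based on the Que gem and 3scale's operational tooling, not something verified in this run:

```shell
# Look for route-creation failures in the zync-que worker.
oc logs zync-que-1-rsntv | grep -iE 'error|fail'

# Inspect queued/failed jobs in the zync database
# (que_jobs columns per the Que gem's schema; adjust user/schema if they differ).
oc rsh zync-database-1-x2cd7 \
  psql -U zync -c "SELECT job_class, error_count FROM que_jobs;"

# If jobs were lost, force zync to recreate the routes
# (rake task name as documented for 3scale; container name assumed).
oc rsh -c system-master system-app-1-z59xx \
  bundle exec rake zync:resync:domains
```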
I also attach the logs of all the deployed pods, as well as the K8s events from the namespace where they were deployed, so the issue can be investigated.
The most recent CircleCI job where this has been detected is: https://app.circleci.com/pipelines/github/3scale/3scale-operator/3825/workflows/7c489eb8-fe01-4ef5-847d-b71e40ae2fbd/jobs/28211