Issues
With a large database, Sphinx offline indexing has the following costs:
- Resource consumption: at least 4 CPUs and 16 GB RAM
- Time: the larger the dataset, the longer a reindex takes
- Cron: a job must be scheduled to run the reindex
On OCP, the Sphinx indexer is launched inside a thread:
https://github.com/3scale/porta/blob/master/lib/tasks/openshift.rake#L38-L86
This means that a full reindex runs whenever a pod is deployed.
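To illustrate the cost described above, here is a minimal, gem-free sketch of an offline indexer running in a background thread. It is not the code from openshift.rake: the class `OfflineIndexer`, its methods, and the `interval`/`cycles` parameters are hypothetical names chosen for illustration, and the in-memory array only stands in for the Sphinx index.

```ruby
# Hypothetical sketch of offline indexing in a background thread.
# None of these names come from porta; the snapshot array stands in for the Sphinx index.
class OfflineIndexer
  attr_reader :full_reindex_count

  def initialize(records)
    @records = records        # stand-in for database rows
    @snapshot = []            # stand-in for the on-disk index
    @full_reindex_count = 0
  end

  # Offline indexing rebuilds from every row, so the cost grows with the table size.
  def full_reindex
    @snapshot = @records.map(&:downcase)
    @full_reindex_count += 1
  end

  # Searches only see what the last full reindex captured.
  def search(term)
    @snapshot.select { |text| text.include?(term.downcase) }
  end

  # One full reindex at start (as on every pod deploy), then one per interval.
  def start(interval:, cycles:)
    Thread.new do
      cycles.times do
        full_reindex
        sleep interval
      end
    end
  end
end

records = ["Acme Corp"]
indexer = OfflineIndexer.new(records)
indexer.start(interval: 0.01, cycles: 1).join

records << "Globex"
indexer.search("globex") # empty: the new record waits for the next full reindex
```

In this sketch a new record stays invisible until the next full pass, which is the delay that real-time indexing is meant to remove.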
Dev notes
https://freelancing-gods.com/thinking-sphinx/v3/real_time.html
- Realtime indexing will remove the dependency on the cron job or the background thread
- Realtime indexing will allow asynchronous indexing through Sidekiq, taking the load off the web Unicorn workers
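The flow described in these notes can be sketched without any gems. This is a simulation under stated assumptions, not the Thinking Sphinx or Sidekiq API: `FakeRealtimeIndex`, `FakeQueue`, and the `Account` struct are hypothetical stand-ins, and only the worker name `SphinxIndexationWorker` comes from this ticket. The real worker's arguments and behaviour may differ.

```ruby
# Gem-free simulation of real-time indexing through a background job.
# Everything except the name SphinxIndexationWorker is hypothetical.

# Stand-in for a Sphinx real-time index: document id => indexed text.
class FakeRealtimeIndex
  def initialize
    @documents = {}
  end

  # Insert or replace a single document, without rebuilding the whole index.
  def upsert(id, text)
    @documents[id] = text.downcase
  end

  def search(term)
    needle = term.downcase
    @documents.select { |_id, text| text.include?(needle) }.keys
  end
end

# Stand-in for the Sidekiq queue: jobs are stored now and executed later.
class FakeQueue
  def initialize
    @jobs = []
  end

  def push(job)
    @jobs << job
  end

  def size
    @jobs.size
  end

  # Run every pending job, as a separate Sidekiq process would.
  def drain
    @jobs.shift.call until @jobs.empty?
  end
end

INDEX = FakeRealtimeIndex.new
QUEUE = FakeQueue.new

class SphinxIndexationWorker
  # Named after Sidekiq's perform_async: the caller only enqueues and returns.
  def self.perform_async(id, name)
    QUEUE.push(-> { new.perform(id, name) })
  end

  def perform(id, name)
    INDEX.upsert(id, name)
  end
end

# Hypothetical model: saving enqueues an indexing job instead of waiting for cron.
Account = Struct.new(:id, :name) do
  def save
    SphinxIndexationWorker.perform_async(id, name)
    self
  end
end

Account.new(1, "Acme Corp").save
QUEUE.drain              # the background process picks up the job
INDEX.search("acme")     # the account is searchable as soon as the job has run
```

The point of the design is that the web request only pays for enqueuing a job; the index update itself happens in the background process, one record at a time.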
Testing procedure
- system-sidekiq should be running
- system-app should be running
- system-sphinx should be running
Create or modify an Account or CMS page
Before: the change can be searched after 30 minutes
After: the change can be searched immediately after the Sidekiq job SphinxIndexationWorker has run
Upgrade procedure
From 2.8 to 2.9
Delete the contents of the system-sphinx-database volume
Redeploy the system-sphinx pod
Reindex by running the rake task sphinx:enqueue
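As a rough idea of what an enqueue-style reindex can look like, the sketch below splits the record ids into batches and schedules one job per batch. It is an assumption for illustration only: the helper `enqueue_reindex` and its parameters are hypothetical and do not describe how the actual sphinx:enqueue task is implemented.

```ruby
# Hypothetical sketch: queue a full reindex as many small jobs instead of one long run.
# `enqueue` is any callable that schedules one job for a batch of record ids.
def enqueue_reindex(record_ids, batch_size:, enqueue:)
  jobs = 0
  record_ids.each_slice(batch_size) do |batch|
    enqueue.call(batch)
    jobs += 1
  end
  jobs # number of jobs scheduled
end

pending = []
enqueue_reindex((1..5).to_a, batch_size: 2, enqueue: ->(batch) { pending << batch })
# pending is now [[1, 2], [3, 4], [5]]
```

Scheduling small jobs lets the background workers rebuild the emptied index gradually rather than blocking on a single full pass.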
Is related to
- THREESCALE-5264 Investigate long-term replacement of Sphinx (Closed)
- THREESCALE-3686 System SaaS migration to OCP (To Develop)