  1. Red Hat Offline Knowledge Portal
  2. RHOKP-179

Spike: further optimize solr index size


    • Spike
    • Resolution: Unresolved
    • Normal
    • None
    • None
    • ETL, Performance, Search
    • None
    • True
    • Solr 9.10 upgrade
    • False
    • Impediment
    • Offline Sprint 50, Offline Sprint 51, RHOKP Sprint 9, RHOKP Sprint 10, RHOKP Sprint 11, RHOKP Sprint 12, RHOKP Sprint 13, RHOKP Sprint 14, RHOKP Sprint 15, RHOKP Sprint 17

      There may be more opportunities to reduce the size of the Solr index. The index size is nondeterministic and fluctuates by ±150 MB each time we generate it. Adding an optimization step after generation should bring the size down, and other tweaks to the data captured in the index could reduce the size further without compromising search quality. Here are some resources:

      https://dzone.com/articles/solr-index-size-analysis
      https://stackoverflow.com/questions/10080881/solr-index-size-reduction
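      To put numbers on the fluctuation described above, it may help to measure the on-disk size of the index directory after each generation and after each attempted optimization. The following is a minimal sketch using only the Rust standard library. The function name dir_size and the default path are hypothetical and not taken from the project; point it at wherever the portal core actually stores its index.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::Path;

/// Recursively sums the sizes (in bytes) of all regular files under `dir`.
/// Symlinks are skipped because `DirEntry::metadata` does not follow them.
fn dir_size(dir: &Path) -> io::Result<u64> {
    let mut total = 0u64;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if meta.is_dir() {
            total += dir_size(&entry.path())?;
        } else if meta.is_file() {
            total += meta.len();
        }
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // Hypothetical default location; pass the real index path as the first argument.
    let path = env::args()
        .nth(1)
        .unwrap_or_else(|| "./solr/portal/data/index".to_string());
    let bytes = dir_size(Path::new(&path))?;
    println!("{}: {} bytes ({:.1} MiB)", path, bytes, bytes as f64 / (1024.0 * 1024.0));
    Ok(())
}
```

      Running this before and after an optimize or commit request would show whether the request changed anything on disk, which is more reliable than comparing sizes across separate index generations.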

      I tried the basic optimization command, curl "http://localhost:8983/solr/portal/update?optimize=true", but it had no effect. Maybe the optimization is already in place, though I couldn't find anywhere it was being triggered.

      Jared suggested sending a commit request after the optimize request:

              // send GET request to solr to commit the changes
              self.client
                  .get("http://localhost:8983/solr/portal/update?commit=true")
                  .send()
                  .await
                  .unwrap();
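      The optimize-then-commit sequence could be sketched as a small helper that builds both request URLs in order. This is only an illustration: UpdateOp and update_url are hypothetical names, and the sketch does not send anything itself. The maxSegments, waitSearcher and expungeDeletes parameters are options of Solr's update handler, but whether any of them actually shrinks this index has not been tested here.

```rust
/// Maintenance operations passed to Solr's `/update` handler as URL parameters.
/// (Hypothetical type, introduced only for this sketch.)
enum UpdateOp {
    /// Force-merge the index down to at most `max_segments` segments.
    Optimize { max_segments: u32 },
    /// Hard commit; with `expunge_deletes`, also merge away deleted documents.
    Commit { expunge_deletes: bool },
}

/// Builds the update URL for `op` against the given Solr base URL and core.
fn update_url(base: &str, core: &str, op: &UpdateOp) -> String {
    let query = match op {
        UpdateOp::Optimize { max_segments } => {
            format!("optimize=true&maxSegments={}&waitSearcher=true", max_segments)
        }
        UpdateOp::Commit { expunge_deletes } => {
            format!("commit=true&expungeDeletes={}", expunge_deletes)
        }
    };
    format!("{}/solr/{}/update?{}", base.trim_end_matches('/'), core, query)
}

fn main() {
    // Optimize first, then commit, in the order suggested above.
    let steps = [
        UpdateOp::Optimize { max_segments: 1 },
        UpdateOp::Commit { expunge_deletes: false },
    ];
    for op in &steps {
        println!("{}", update_url("http://localhost:8983", "portal", op));
    }
}
```

      Each URL could then be passed to the existing client's .get(...) call in sequence. Checking the response status as well would make a failed optimize or commit visible rather than silent.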
      

              rhn-support-vmhaskar Vijay Mhaskar
              mclayton@redhat.com Michael Clayton