Red Hat Enterprise Linux AI / RHELAI-4306

vLLM max_startup_attempts: 120 is too low for Gaudi HPU


      ilab model chat times out while waiting for vLLM to start up on Gaudi HPU. The default max_startup_attempts of 120 is too low: startup usually succeeds only after roughly 280 attempts, so the value needs to be raised to something like 320.

      For example:

      serve.vllm.max_startup_attempts: 320
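
      The dotted key above presumably maps to nested keys in the ilab configuration file. The following is a minimal sketch of that mapping, not InstructLab's own code; the helper name expand_dotted is hypothetical and only illustrates how the override would nest.

```python
from typing import Any


def expand_dotted(key: str, value: Any) -> dict:
    """Turn a dotted key such as 'a.b.c' and a value into {'a': {'b': {'c': value}}}.

    Hypothetical helper for illustration only; it is not part of ilab.
    """
    nested: Any = value
    # Wrap the value from the innermost key outward.
    for part in reversed(key.split(".")):
        nested = {part: nested}
    return nested


# 320 is the value suggested in this issue; the default is 120.
override = expand_dotted("serve.vllm.max_startup_attempts", 320)
print(override)
```

      Under that assumption, the setting sits under serve, then vllm, as max_startup_attempts.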

              rh-ee-jlarkin Justin Larkin
              fzatlouk@redhat.com František Zatloukal