Project: Red Hat Enterprise Linux AI
Issue: RHELAI-3340

chat answers are very bad with granite-3.1-8b-lab-v1 and AMD accelerators


      To Reproduce

      Steps to reproduce the behavior:

      1. Start a system with RHEL AI 1.4-rc2 and an AMD accelerator.
      2. Run ilab model chat -qq 'Say, "Hello, World!"'
      3. Observe that the output has nothing to do with the input prompt. The exact content of the output varies by machine but is consistent within a machine as long as the input string is identical; changing a single character in the input string changes the output significantly. All four of the following examples use nearly identical prompts (two per machine type, differing only in the trailing "!").

      From a bare metal machine with 8x MI300X:

      $ ilab model chat -qq 'Say, "Hello, World!"'
      
      Then type, "Pulse appears."
      followed by, "Rate is 80/minute."
      After that, "P=pVRT"
      and "$y=mx+b$"
      (where m=10, b=5)
      and "Formula for orbital motion."
      And "Number of sides of a regular polygon with n vertices."
      And "Number of n�로なり合った各領域の数."
      And "Area of a trapezoid with bases of length 4 and 6 and height 5."
      And "Number of subsets of a set with n elements."
      And "Number of factors of 30."
      And "Useful fact about exponents."
      
      $ ilab model chat -qq 'Say, "Hello, World"'
      
      WorldВету также можешь пожелать хоромnelennoe! :) Adnkwoonyör Fálaharamlé "$y=e^{-x}u(x)$ablon
      Adneïyu GSPATH werwi 'aber egal, hier sind weitere Vorschläge: "ДTruncJohnson 和 sפ kiir svit
      import�로 импортировать расширение сферически consumptionatesque.pyixel1y эк gerim 1y.(*pixel1i
      pixels[::-1,:,1y].mean() 1y.*pixel1i pixels[::-1,:,1y]) ek gerim 0.5*1y.*getPixelRichness ustanpek
      öffnen werwi kingthree.openwwebeginTransaction werwi mêtschter merrilleлNested mmf .... 

      From an Azure instance with 8x MI300X:

      $ ilab model chat -qq 'Say "Hello, World!"'
      
      And "World, Say Hello..."
       And so on. Enough of this banter. Let's focus on the task at hand. Generate code that will "$stringEventArgs - joinedValues - stringEventArgs.GetType() - 
      stringEventArgs.JoinedValues.GetType() - string.Concat(stringEventArgs.JoinedValues) - 
      (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringEN�로Salt() - 
      (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringKOAssignableFrom(AllMatchingString); - 
      (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringJPAssignableFrom(AllMatchingString); - 
      (string.Concat(stringEventArgs).JoinedValues)).GetAllMatchingStringZHAssignableFrom(AllMatchingString); - 
      (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZHAssignableFrom(AllMatchingString); - 
      (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For English. For Korean. For Japanese. For Chinese. For Khmer. For Thai. For 
      Vietnamese. For Bangla. For TigrFont. - (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For English. For Korean. For Japanese. For 
      Chinese. For Khmer. For Thai. For Vietnamese. For Bangla. For TigrFont. - (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For 
      English. For Korean. For Japanese. For Chinese. For Khmer. For Thai. For Vietnamese. For Bangla. For TigrFont. - 
      (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For English. For Korean. For Japanese. For Chinese. For Khmer. For Thai. For 
      Vietnamese. For Bangla. For TigrFont. - (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For English. For Korean. For Japanese. For 
      Chinese. For Khmer. For Thai. For Vietnamese. For Bangla. For TigrFont. - (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For 
      English. For Korean. For Japanese. For Chinese. For Khmer. For Thai. For Vietnamese. For Bangla. For TigrFont. - 
      (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For English. For Korean. For Japanese. For Chinese. For Khmer. For Thai. For 
      Vietnamese. For Bangla. For TigrFont. - (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For English. For Korean. For Japanese. For 
      Chinese. For Khmer. For Thai. For Vietnamese. For Bangla. For TigrFont. - (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For 
      English. For Korean. For Japanese. For Chinese. For Khmer. For Thai. For Vietnamese. For Bangla. For TigrFont. - 
      (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For English. For Korean. For Japanese. For Chinese. For Khmer. For Thai. For 
      Vietnamese. For Bangla. For TigrFont. - (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For English. For Korean. For Japanese. For 
      Chinese. For Khmer. For Thai. For Vietnamese. For Bangla. For TigrFont. - (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For 
      English. For Korean. For Japanese. For Chinese. For Khmer. For Thai. For Vietnamese. For Bangla. For TigrFont. - 
      (string.Concat(stringEventArgs.JoinedValues)).GetAllMatchingStringZH fu||For English. For Korean. For Japanese. For Chinese. For Khmer. For Thai. For 
      Vietnamese. For Bangla. For TigrFont. - (
      
      $ ilab model chat -qq 'Say "Hello, World"'
      
      "Веточка appears in an open field. It is a small metal tube, bent at a sharp angle. A single, bright
      light shines from the tip of the tube, casting a long, narrow beam of illumination across the grass
      and wildflowers. The light is steady and unwavering, creating a sense of tranquility and peace in
      the otherwise untamed landscape. As the light moves across the field, it highlights the delicate
      petals of flowers, the gentle rustling of leaves, and the soft murmur of a nearby brook. The light
      seems to transform the field into a magical place, where everything is bathed in its warm, golden glow.
      
      "The vetockka is a symbol of hope and guidance, inviting the viewer to explore the beauty and mystery
      of the natural world. It is a reminder that even in the darkest of times, there is always a way to
      find light and inspiration. 
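A rough, mechanical way to triage "the output has nothing to do with the input prompt" in transcripts like the ones above is to score the response against the string the prompt asks the model to echo. A minimal sketch; the helper name, sample strings, and interpretation are illustrative assumptions, not part of ilab:

```python
# Hypothetical triage helper (not part of ilab): compare a chat response
# against the string the prompt asked the model to echo. The sample strings
# below are abridged from the transcripts in this report.
from difflib import SequenceMatcher

def echo_similarity(expected: str, response: str) -> float:
    """Character-level similarity ratio in [0, 1] between the expected echo and the response."""
    return SequenceMatcher(None, expected.lower(), response.lower()).ratio()

# A well-behaved model echoes the requested string (mixtral transcript):
good = "Hello, World! I'm here to help. How can I assist you today?"
# The buggy granite output on AMD accelerators shares almost nothing with it:
bad = 'Then type, "Pulse appears." followed by, "Rate is 80/minute."'

print(echo_similarity("Hello, World!", good))  # comparatively high
print(echo_similarity("Hello, World!", bad))   # noticeably lower
```

Since the expected string appears verbatim in a healthy response, the healthy score will always dominate the garbage score, which makes this usable as a coarse regression check across machines.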

      Expected behavior
      We do not generally test for output content, but this output is of exceptionally poor quality. Because granite-3.1-8b-lab-v1 is the default model for chat, this image+model combination should not be released.

      This behavior is not present with other models. Using mixtral-8x7b-instruct-v0-1 on the same Azure instance with 8x MI300X accelerators yields a reasonable answer:

      $ ilab model chat -m /var/home/azureuser/.cache/instructlab/models/mixtral-8x7b-instruct-v0-1 -qq 'Say "Hello, World!"'
       Hello, World! I'm here to help. How can I assist you today? Whether you have questions, need help with a task, or want to brainstorm ideas, I'm here to 
      provide clear, accurate, and engaging responses. Let's get started!
      

      Device Info:

      • Hardware Specs: AMD MI300X (bare metal and azure)
      • OS Version: RHEL AI 1.4-rc2
      • InstructLab Version: 0.23.1
      • Provide the output of these two commands:
        • sudo bootc status --format json | jq .status.booted.image.image.image
          • "registry.stage.redhat.io/rhelai1/bootc-amd-rhel9:1.4"
        • ilab system info
          $ ilab system info
          Platform:
            sys.version: 3.11.7 (main, Jan  8 2025, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)]
            sys.platform: linux
            os.name: posix
            platform.release: 5.14.0-427.50.1.el9_4.x86_64
            platform.machine: x86_64
            platform.node: GPUF333
            platform.python_version: 3.11.7
            os-release.ID: rhel
            os-release.VERSION_ID: 9.4
            os-release.PRETTY_NAME: Red Hat Enterprise Linux 9.4 (Plow)
            memory.total: 3023.04 GB
            memory.available: 2898.81 GB
            memory.used: 112.25 GB
          
          InstructLab:
            instructlab.version: 0.23.1
            instructlab-dolomite.version: 0.2.0
            instructlab-eval.version: 0.5.1
            instructlab-quantize.version: 0.1.0
            instructlab-schema.version: 0.4.2
            instructlab-sdg.version: 0.7.0
            instructlab-training.version: 0.7.0
          
          Torch:
            torch.version: 2.4.1
            torch.backends.cpu.capability: AVX512
            torch.version.cuda: None
            torch.version.hip: 6.2.41134-65d174c3e
            torch.cuda.available: True
            torch.backends.cuda.is_built: True
            torch.backends.mps.is_built: False
            torch.backends.mps.is_available: False
            torch.cuda.bf16: True
            torch.cuda.current.device: 0
            torch.cuda.0.name: AMD Radeon Graphics
            torch.cuda.0.free: 191.4 GB
            torch.cuda.0.total: 192.0 GB
            torch.cuda.0.capability: 9.4 (see https://developer.nvidia.com/cuda-gpus#compute)
            torch.cuda.1.name: AMD Radeon Graphics
            torch.cuda.1.free: 191.4 GB
            torch.cuda.1.total: 192.0 GB
            torch.cuda.1.capability: 9.4 (see https://developer.nvidia.com/cuda-gpus#compute)
            torch.cuda.2.name: AMD Radeon Graphics
            torch.cuda.2.free: 191.4 GB
            torch.cuda.2.total: 192.0 GB
            torch.cuda.2.capability: 9.4 (see https://developer.nvidia.com/cuda-gpus#compute)
            torch.cuda.3.name: AMD Radeon Graphics
            torch.cuda.3.free: 191.4 GB
            torch.cuda.3.total: 192.0 GB
            torch.cuda.3.capability: 9.4 (see https://developer.nvidia.com/cuda-gpus#compute)
            torch.cuda.4.name: AMD Radeon Graphics
            torch.cuda.4.free: 191.4 GB
            torch.cuda.4.total: 192.0 GB
            torch.cuda.4.capability: 9.4 (see https://developer.nvidia.com/cuda-gpus#compute)
            torch.cuda.5.name: AMD Radeon Graphics
            torch.cuda.5.free: 191.4 GB
            torch.cuda.5.total: 192.0 GB
            torch.cuda.5.capability: 9.4 (see https://developer.nvidia.com/cuda-gpus#compute)
            torch.cuda.6.name: AMD Radeon Graphics
            torch.cuda.6.free: 191.4 GB
            torch.cuda.6.total: 192.0 GB
            torch.cuda.6.capability: 9.4 (see https://developer.nvidia.com/cuda-gpus#compute)
            torch.cuda.7.name: AMD Radeon Graphics
            torch.cuda.7.free: 191.4 GB
            torch.cuda.7.total: 192.0 GB
            torch.cuda.7.capability: 9.4 (see https://developer.nvidia.com/cuda-gpus#compute)
          
          llama_cpp_python:
            llama_cpp_python.version: 0.3.2
            llama_cpp_python.supports_gpu_offload: False
          
          

      Bug impact

      • Anyone chatting with the default model will receive wildly incorrect and potentially incoherent answers.

      Known workaround

      • Using a non-default model such as mixtral-8x7b-instruct-v0-1 has yielded reasonable answers in limited testing.
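To avoid passing -m on every invocation, the workaround can likely be made the default in the ilab configuration. A sketch, assuming this ilab version's config.yaml exposes the chat model under chat.model and the served model under serve.model_path; verify the exact schema with `ilab config show` before relying on it:

```yaml
# ~/.config/instructlab/config.yaml (keys and paths are assumptions for this
# ilab version; check `ilab config show` to confirm the schema)
chat:
  model: /var/home/azureuser/.cache/instructlab/models/mixtral-8x7b-instruct-v0-1
serve:
  model_path: /var/home/azureuser/.cache/instructlab/models/mixtral-8x7b-instruct-v0-1
```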

        Attachments:
        1. vllmv064post1chat.log (7 kB, James Kunstle)
        2. vllmv064chat.log (9 kB, James Kunstle)

      James Kunstle (rhn-support-jkunstle), Tim Flink (tflink)