  1. Red Hat Internal Developer Platform
  2. RHIDP-11399

Spike: Investigate Future of Safety Shield Implementation

    • Task
    • Resolution: Unresolved
    • Major
    • 1.9.0
    • None
    • Lightspeed
    • None
    • RHDH AI Sprint 3286

      Task

      In current versions of our Llama Stack image (based on v0.2.x), we use the safety shield provided by Lightspeed Core for question validation. Newer versions of Llama Stack appear to restrict what is possible through shields.

      As a result, the entire conversation chain is validated character by character as it streams, which produces inconsistent results. After speaking with other teams (including Lightspeed Core), the question has come up of how restrictive we should be.

      It has been noted that Ask Red Hat relies on some guardrails for harm-related topics and jailbreaking but does not implement hard topic restrictions. Lightspeed Core has also mentioned that they do not hear much concern from customers about how strict topic restrictions should be.

      This raises the question of whether the safety/restriction policy is our responsibility to determine, or whether each individual customer should implement the restrictions they are comfortable with.

       

      This issue is intended to investigate what we should do about the safety implementation going forward with Llama Stack 0.3.x.

      Background

      Dependencies and Blockers

      QE impacted work
      Documentation impacted work 

      Acceptance Criteria

       

       

              rh-ee-jdubrick Jordan Dubrick
              rh-ee-jdubrick Jordan Dubrick
              RHDH AI