Use Case 1: Sensitive Information Disclosure Interdiction
Below is a high-level sequence diagram illustrating how PropScreen handles its first use case, sensitive information disclosure interdiction. The numbers on the sequence diagram correspond to the numbered steps described in more detail on this page.
1. User sends prompt - The user enters a prompt into their organizational application and sends the request.
2. Prompt forwarded - Once the user has sent the prompt, the application forwards it to PropScreen.
3. Prompt sent to LLM - PropScreen immediately forwards the prompt to the organization's LLM.
4. LLM response returned - The LLM returns its response to PropScreen, which prepares the response for scanning.
5. Response scanned for general SI - The response is scanned by the LLM Guard Sensitive Info Check. Regardless of the outcome, the result is saved and the LLM response is passed to the next check.
6. Proceed to next check - PropScreen proceeds with its next check, a search for any "context strings" inside the LLM response.
7. Response scanned for context strings - The Context String Check accesses the Context String DB and determines whether any tokens in the response match items in the database.
8. Response sent to client - If neither check found a match, the response is deemed safe and is sent back to the LLM Application.
9. User receives reply - The LLM Application displays the response to the user.
10. Proceed to final check - If either check finds at least one match, the response proceeds to the final scan.
11. Response scanned for matching hashes - The final check, the Hashed Organizational Sensitive Information Check, hashes each token in the LLM response and compares it against the collection of hashes in the Hashed Organizational Sensitive Information database.
12. Response returned - If no matches were found in step 11, the earlier match is treated as a false positive: the response is deemed safe and is sent back to the LLM Application.
13. User receives reply - The LLM Application displays the response to the user.
14. Response returned - If at least one match was found in step 11, the response is deemed dangerous (a true positive) and is replaced with an error message. This error message is sent back to the application.
15. User receives error message - The LLM Application displays the error message to the user.
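The decision flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PropScreen's actual implementation: every function name here is hypothetical, `general_si_check` is a simple email regex standing in for the LLM Guard Sensitive Info Check, the two databases are modeled as in-memory sets, and the tokenization (lowercased word tokens) and hash algorithm (SHA-256) are assumed because the page does not specify them.

```python
import hashlib
import re

# Hypothetical error text; the real message is not given in the source.
ERROR_MESSAGE = "The response was blocked because it may contain sensitive information."


def tokenize(text: str) -> list[str]:
    """Split text into lowercased word tokens (assumed tokenization)."""
    return re.findall(r"\w+", text.lower())


def sha256_hex(token: str) -> str:
    """Hash one token with SHA-256 (assumed algorithm)."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def general_si_check(response: str) -> bool:
    """Stand-in for the LLM Guard Sensitive Info Check: flags email-like strings only."""
    return re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", response) is not None


def context_string_check(response: str, context_strings: set[str]) -> bool:
    """Return True if any response token matches an entry in the context string set."""
    tokens = set(tokenize(response))
    return any(s.lower() in tokens for s in context_strings)


def hashed_si_check(response: str, sensitive_hashes: set[str]) -> bool:
    """Hash each response token and look it up in the set of sensitive hashes."""
    return any(sha256_hex(t) in sensitive_hashes for t in tokenize(response))


def screen_response(response: str, context_strings: set[str], sensitive_hashes: set[str]) -> str:
    """Return the LLM response if it is deemed safe, otherwise an error message."""
    # First stage: both checks always run and their results are saved.
    si_hit = general_si_check(response)
    context_hit = context_string_check(response, context_strings)

    # No match in either check: the response is deemed safe.
    if not (si_hit or context_hit):
        return response

    # Final check: a hash match is a true positive, so the response is replaced.
    if hashed_si_check(response, sensitive_hashes):
        return ERROR_MESSAGE

    # A first-stage match without a hash match is treated as a false positive.
    return response


# Example data (made up for illustration).
context_strings = {"falcon"}
sensitive_hashes = {sha256_hex("x9k2")}

print(screen_response("Project falcon code is X9K2", context_strings, sensitive_hashes))
```

In this sketch the hash database stores only digests, so the final check can confirm a leak without the screening layer holding the sensitive values in plain text. The cheaper first-stage checks decide whether the per-token hashing is run at all.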