PropScreen's LLM Implementation

Last updated 9 months ago

Overview

An alternative architecture has been proposed in which, instead of running a series of checks, PropScreen would itself be a language model. This model would be given the organization's sensitive information as context and tasked with reading the responses of the organization's LLM to determine whether they contain organizational sensitive information. PropScreen's LLM would not interact directly with the LLM user; it would only make a determination on whether organizational sensitive information is present in the company LLM's response. That determination would then dictate the action that PropScreen takes.
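The flow described above could be sketched roughly as follows. This is a minimal illustration and not PropScreen's implementation: the `screen_response` function, the `Verdict` type, the prompt wording, the "block"/"forward" action names, and the fail-closed behavior are all assumptions made for the example. The model is passed in as a plain callable so that no particular LLM API is implied.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns the model's reply.
ModelFn = Callable[[str], str]


@dataclass
class Verdict:
    contains_sensitive: bool
    action: str  # "block" or "forward" (names assumed for this sketch)


def screen_response(company_llm_response: str,
                    sensitive_context: list[str],
                    model: ModelFn) -> Verdict:
    """Ask the screening model whether a response discloses sensitive information."""
    prompt = (
        "You are screening an LLM response for organizational sensitive information.\n"
        "Sensitive items:\n"
        + "\n".join(f"- {item}" for item in sensitive_context)
        + "\n\nResponse to screen:\n"
        + company_llm_response
        + "\n\nAnswer with exactly YES or NO."
    )
    answer = model(prompt).strip().upper()
    # Assumed policy: fail closed, so anything other than a clear NO is a detection.
    flagged = answer != "NO"
    return Verdict(flagged, "block" if flagged else "forward")


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call, used only to make the sketch runnable."""
    screened = prompt.split("Response to screen:\n", 1)[1]
    return "YES" if "ZB-743223" in screened else "NO"


leaky = screen_response("Your client ID is ZB-743223.", ["ZB-743223"], fake_model)
clean = screen_response("Our office opens at 9 am.", ["ZB-743223"], fake_model)
```

The screening model only ever sees the company LLM's response, never the user's conversation, which mirrors the point that PropScreen's LLM would not interact with the LLM user directly.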

Benefits

Speed

This approach would subject the organization's LLM responses to a single check instead of a series of three checks. While this is still speculative, the single check performed by the PropScreen model would likely be faster than the combined NER model, regex, and two database checks that PropScreen currently uses.

Wider Breadth

A language model scanning responses for disclosed organizational sensitive information could be more dynamic in its detection. The primary reason is that a model would be able to detect organizational data that has been misformatted or obfuscated in some way. For example, if client IDs are considered sensitive information and one such ID is "ZB-743223", a hash-based check would fail to match a response that contains "ZB -743 223". A model has the possibility of correctly identifying this as the client ID "ZB-743223" and blocking it.
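The hash mismatch in this example can be reproduced in a few lines. The sketch below assumes SHA-256 as the hash function and a simple in-memory set as the store. Both are illustrative choices and not necessarily what PropScreen uses.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Return the SHA-256 digest of a string as hex (hash choice is an assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical store holding only the digest of the sensitive client ID.
hashed_store = {sha256_hex("ZB-743223")}


def hashed_check(candidate: str) -> bool:
    """True if the candidate's digest matches a stored sensitive value."""
    return sha256_hex(candidate) in hashed_store


exact_match = hashed_check("ZB-743223")     # same bytes, same digest
spaced_match = hashed_check("ZB -743 223")  # extra spaces change the digest
```

Because a cryptographic hash changes completely when any character of the input changes, the spaced variant does not match the stored digest even though a human reader would recognize it as the same ID.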

Considerations

Access to Sensitive Data

This approach would require PropScreen to have access to the organization's sensitive information in clear text. While this is not prohibitive for the solution, the fact that the PropScreen model would know organizational sensitive information should be considered when determining the tolerable level of risk for this type of implementation.
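To make the difference in exposure concrete, the sketch below contrasts what each design would need to hold. The SHA-256 digests, the `build_context` helper, and the context wording are assumptions for illustration only.

```python
import hashlib

# Illustrative value taken from the client ID example on this page.
sensitive_values = ["ZB-743223"]

# Hash-based design: only digests need to be stored (hash choice is an assumption).
hashed_store = {
    hashlib.sha256(v.encode("utf-8")).hexdigest() for v in sensitive_values
}


def build_context(values: list[str]) -> str:
    """Hypothetical helper: the context a screening model would be given."""
    return "Organizational sensitive information:\n" + "\n".join(
        f"- {v}" for v in values
    )


# LLM-based design: the values must appear in the model's context as clear text.
model_context = build_context(sensitive_values)

clear_text_in_hashed_store = any(
    v in digest for v in sensitive_values for digest in hashed_store
)
clear_text_in_model_context = all(v in model_context for v in sensitive_values)
```

In the hash-based store the original value is not recoverable by inspection, whereas anything with access to the model's context (logs, prompt caches, the model provider) could read the sensitive values directly.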

Figure: High Level Architecture of the LLM Based Approach to PropScreen (first use case)