The Good, The Bad, and the Learning
The Good - What Went Well
New Domain
Since Generative AI is a relatively new field in the commercial sector, it brings with it a correspondingly new set of problems, including the primary problem that PropScreen addresses: Unauthorized Sensitive Information Disclosure from an LLM. The novelty of the problem set creates the opportunity to develop equally novel approaches.
Existing Tools
A surprising discovery was the level of maturity in the field of open-source output sanitization and sensitive information detection for models. Some examples of tools discovered during development were:
Named Entity Recognition (NER) models to aid in sensitive information detection, including:
These tools served as an excellent baseline and as examples of how sensitive information detection is performed in practice. Their open-source nature allowed the team to quickly deploy and experiment with them to determine which approach was most appropriate for addressing unauthorized sensitive information disclosure.
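As a rough illustration of how quickly these tools can be stood up, the sketch below follows the usage pattern from LLM Guard's documentation for its Sensitive output scanner. The prompt and response strings are invented, and constructor options may differ between library versions, so treat this as a sketch rather than PropScreen's actual integration.

```python
# Sketch of LLM Guard's Sensitive output scanner, following its documented scan() pattern.
from llm_guard.output_scanners import Sensitive

scanner = Sensitive()  # default settings; the scanner accepts further tuning options

prompt = "Who is the point of contact for the Northwind account?"
model_output = "The contact is Jane Doe, reachable at jane.doe@example.com."

# scan() returns the (possibly redacted) output, a pass/fail flag, and a risk score.
sanitized_output, is_valid, risk_score = scanner.scan(prompt, model_output)

print(is_valid, risk_score)
print(sanitized_output)
```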
The Bad - Main Challenges
Performance Times
PropScreen natively utilizes an NER model in its first check when scanning for general sensitive information. This process is computationally costly and takes longer to complete than the team would like.
As of now, the planned solution is to benchmark alternative models and scanning utilities, for example with a small timing harness like the one sketched below.
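A minimal sketch of such a benchmark, using Hugging Face's transformers pipeline. The candidate model names and the sample text are illustrative assumptions, not the models PropScreen actually uses.

```python
import time
from transformers import pipeline

# Illustrative candidates only; substitute whichever NER checkpoints are under evaluation.
CANDIDATES = ["dslim/bert-base-NER", "dslim/distilbert-NER"]

SAMPLE = "Please email George Washington at gwash@example.com about invoice 4521."

for model_name in CANDIDATES:
    ner = pipeline("ner", model=model_name, aggregation_strategy="simple")
    ner(SAMPLE)  # warm-up call so model loading is not counted in the timing

    runs = 20
    start = time.perf_counter()
    for _ in range(runs):
        entities = ner(SAMPLE)
    per_scan_ms = (time.perf_counter() - start) / runs * 1000

    print(f"{model_name}: {per_scan_ms:.1f} ms per scan, {len(entities)} entities detected")
```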
Size of the Application
The LLM Guard library requires a tremendous number of imports from libraries that are not particularly space efficient. For example, in order for LLM Guard and its functions to run, they require NVIDIA's libraries for Python 3.
The glimmer of hope here is that LLM Guard is an open-source solution, and it is by its nature very large because it seeks to address many LLM vulnerabilities beyond sensitive information disclosure. One option (if the license allows it) would be to fork the Sensitive Scan library and refactor the code so that this set of tools can work independently of the rest of LLM Guard, reducing the overall size of the solution.
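To judge where that effort would pay off, it helps to know which installed dependencies dominate the footprint. The sketch below measures the on-disk size of a few installed distributions using the standard library's importlib.metadata; the package names listed are assumptions about what an LLM Guard install pulls in, so substitute the actual dependency list.

```python
import importlib.metadata
from pathlib import Path

def installed_size_mb(dist_name: str) -> float:
    """Sum the on-disk size of every file recorded for an installed distribution."""
    dist = importlib.metadata.distribution(dist_name)
    total = 0
    for record in dist.files or []:
        path = Path(record.locate())
        if path.exists():
            total += path.stat().st_size
    return total / 1_048_576

# Assumed dependency names; replace with whatever the LLM Guard install actually brings in.
for name in ["llm-guard", "transformers", "torch"]:
    try:
        print(f"{name}: {installed_size_mb(name):.0f} MB")
    except importlib.metadata.PackageNotFoundError:
        print(f"{name}: not installed")
```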
LLM Hallucinations and Formatting
A unique challenge of unauthorized sensitive information disclosure in generative AI is that the model has the ability to behave in two distinct ways that differ from disclosure from a traditional database:
Hallucination - the model may provide factually false responses or responses that are nonsense to humans. While PropScreen's methodology still works on hallucinated responses, they can frustrate the parties involved; during development, hallucinations complicated the testing process.
Formatting - a generative model may respond with sensitive information that carries enough entropy that it is no longer exactly the sensitive information, yet can still be interpreted as such. Examples include misspelled words and badly formatted responses. For example, the response "Jorge Wash en tin" is not the same string as "George Washington", yet most readers can discern the intended meaning of the first string through intuition.
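One way to catch such near-miss strings is fuzzy matching against a list of protected terms. The sketch below uses the standard library's difflib; the protected term list and the 0.6 threshold are illustrative assumptions that would need tuning against real outputs.

```python
from difflib import SequenceMatcher

def similarity(candidate: str, protected_term: str) -> float:
    """Compare on lowercase letters only, so spacing and casing noise is ignored."""
    a = candidate.lower().replace(" ", "")
    b = protected_term.lower().replace(" ", "")
    return SequenceMatcher(None, a, b).ratio()

PROTECTED = ["George Washington"]   # illustrative protected term
THRESHOLD = 0.6                     # illustrative cut-off; tune against real data

for text in ["Jorge Wash en tin", "John Adams"]:
    flagged = any(similarity(text, term) >= THRESHOLD for term in PROTECTED)
    print(f"{text!r}: flagged={flagged}")
```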
The most comprehensive and dynamic solution to these problems can be found in the wider-breadth section.
The Learning - Lessons Learned
Use Cases as a Fulcrum
Generative AI security is a huge problem space, and preventing unauthorized sensitive information disclosure is itself a large problem with many possible implementations. Getting lost in analysis paralysis or feature creep was a problem the team endured during the early weeks of development. By clearly defining the use cases of the solution, all effort could be focused on addressing those use cases. This principle provided a center of gravity for the team and prevented time lost to speculation, constant reassessment, or feature creep. Determining a primary use case also allowed the team to rank which features were most urgent and which could be shelved in the interest of time if necessary.
The Need for Agility
The adoption of an Agile mindset was critical to the success of the project. This was especially true because the project sits in a domain seeing rapid development of tools, methods, and solutions for Generative AI security. By working in an Agile way, the team was able to quickly determine what was and was not working, pivot, and continue making progress with minimal time and effort lost on work that was no longer needed.