Hashed Organizational Sensitive Information

The Hashed Organizational Sensitive Information database stores the hashes of the sensitive information that the organization wants checked for in LLM responses. As the name suggests, the data in Hashed Organizational Sensitive Information should only ever be hashes of the strings that the company wants to check for, never the data itself in clear text (rationale for this can be found here).

The schema for Hashed Organizational Sensitive Information is a single table with a single column of hashes. Below is an example of how the schema looks:

si_hash

410943463d1786da4b258d5113a29d3dd7119ea86002729c27482c5ad9d4150d

b1b2fc3f32e4b1d48adb45270e3265a0a7a429d3d94ab1f96a576463b03759a2

317c96b8eada2d689086708d341bb4dce4ee833177a2ffa76a5a0e781fa7f03e
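
The single-column table above could be created and queried roughly as follows. This is a minimal sketch, not the PropScreen implementation: it assumes SQLite as the storage engine and plain SHA-256 hex digests, and the table name `hosi` and all function names are hypothetical. Only the column name `si_hash` comes from the schema shown above.

```python
import hashlib
import sqlite3


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hex digest of a string (UTF-8 encoded)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def create_store(conn: sqlite3.Connection) -> None:
    # One table, one column of hashes. "hosi" is a hypothetical table name.
    conn.execute("CREATE TABLE IF NOT EXISTS hosi (si_hash TEXT PRIMARY KEY)")


def add_sensitive_string(conn: sqlite3.Connection, text: str) -> None:
    # Only the hash is written; the clear text never reaches the database.
    conn.execute(
        "INSERT OR IGNORE INTO hosi (si_hash) VALUES (?)",
        (sha256_hex(text),),
    )


def is_sensitive(conn: sqlite3.Connection, candidate: str) -> bool:
    # Hash the candidate the same way and look for an exact match.
    row = conn.execute(
        "SELECT 1 FROM hosi WHERE si_hash = ?",
        (sha256_hex(candidate),),
    ).fetchone()
    return row is not None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_store(conn)
    add_sensitive_string(conn, "Project Falcon")  # made-up example value
    print(is_sensitive(conn, "Project Falcon"))
    print(is_sensitive(conn, "some harmless text"))
```

Because the lookup is an exact hash match, a candidate string only matches if it is byte-for-byte identical to the string that was originally hashed; any normalization (case, whitespace) would have to be applied identically on both sides.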

Note that in this example and in the PropScreen proof of concept, SHA256 was used to hash the data; however, this is not a strict requirement. Additionally, in a live setting, any salting or peppering would need to be applied consistently, so that the stored hashes and the hashes computed from the text being checked are produced in the same way and can still match. This secure-by-design feature would be necessary to harden Hashed Organizational Sensitive Information in case it becomes subject to a hash cracking, dictionary, or rainbow table attack. For the purposes of the proof of concept, neither salting nor peppering was deployed, but the team would like to acknowledge the importance of such practices. A graceful way to deploy these techniques would be one of the next steps in the development of the project.
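
As an illustration only, one possible way to add a pepper while keeping exact-match lookups workable is a keyed hash such as HMAC-SHA256, with the same secret used when storing hashes and when hashing text to check. This is an assumption about how the hardening might be done, not something the proof of concept implements; the function name `peppered_hash` and the way the pepper is supplied are hypothetical. A random per-entry salt is harder to combine with this kind of lookup, because the checker would not know which salt to apply to a candidate string.

```python
import hashlib
import hmac


def peppered_hash(text: str, pepper: bytes) -> str:
    """Return an HMAC-SHA256 hex digest of text, keyed with a secret pepper.

    The pepper is assumed to be kept outside the database (for example in a
    secrets manager), so a leaked table alone is not enough to mount a
    dictionary or rainbow table attack.
    """
    return hmac.new(pepper, text.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    pepper = b"example-pepper-do-not-use"  # placeholder value for illustration
    stored = peppered_hash("Project Falcon", pepper)
    candidate = peppered_hash("Project Falcon", pepper)
    # compare_digest performs a constant-time comparison of the two digests.
    print(hmac.compare_digest(stored, candidate))
```

The digest has the same length and format as a plain SHA256 hex digest, so under these assumptions the `si_hash` column would not need to change.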
