Privacy Architecture

GPT Lab has a robust privacy architecture that ensures sensitive and personal information is not sent to third-party LLM providers. The key points are:

📊 Data Isolation

GPT Lab is designed to be self-hosted: all data processing and storage happen within your own infrastructure. This keeps your data completely isolated and prevents any leakage to external parties.

🛠️ Local LLM Support

GPT Lab supports running open-source LLMs such as Llama, Mixtral, and Falcon locally on your own hardware (GPU or CPU). This enables a truly air-gapped setup in which no data ever leaves your premises. It is made possible through support for Ollama, GPT4All, and FastChat servers.
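As an illustration of why this keeps data local: a server such as Ollama exposes an HTTP API on localhost, so prompts travel only within the machine. A minimal sketch, assuming Ollama's default endpoint and a locally pulled model (the helper names here are illustrative, not part of GPT Lab's API):

```python
import json
import urllib.request

# Ollama's default local endpoint; the prompt never leaves the host.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "llama3") -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local_llm(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the local Ollama server and return the completion."""
    payload = json.dumps(build_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because the endpoint resolves to localhost, this path works even on a fully air-gapped host, provided the model weights were installed beforehand.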

☁️ Cloud LLM Integration

When using a cloud LLM provider such as Replicate, TogetherAI, or HuggingFace, GPT Lab can integrate with them directly. According to their data policies, these providers do not retain your data, so your information remains private.

🕵️‍♂️ Anonymization of Personally Identifiable Information (PII)

In the self-hosted community version, personal data can be anonymized before it is ingested into the vector DB. In addition, an "Anonymize Scanner" (such as LLM Guard and/or Microsoft Presidio) acts as a digital guardian, scrubbing sensitive data from user prompts before they are sent to a third-party cloud LLM provider. This step is not necessary when the LLM itself is self-hosted.
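The scrubbing step can be sketched as a pass that replaces detected PII with placeholder tokens before a prompt leaves the perimeter. A deliberately minimal illustration follows; a real deployment would use LLM Guard or Microsoft Presidio, which cover far more entity types, and the regex patterns and placeholder names below are assumptions for the sketch:

```python
import re

# Illustrative PII patterns only; LLM Guard / Presidio handle many more
# entity types, formats, and languages. Order matters: the more specific
# SSN pattern runs before the broader phone-number pattern.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def anonymize(prompt: str) -> str:
    """Replace detected PII with placeholder tokens so the prompt can be
    forwarded to a third-party cloud LLM provider without exposing it."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"<{label}>", prompt)
    return prompt
```

For example, `anonymize("Reach Jane at jane@example.com")` yields `"Reach Jane at <EMAIL>"`. The same idea applies at ingestion time, so the vector DB never stores raw identifiers.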

📝 Summary

In summary, GPT Lab's self-hosted deployment, local LLM support, no-retention cloud integration options, and PII anonymization together form a robust privacy architecture that prevents sensitive data exposure.