Home
🛡️ Safeguards - Safety toolkit to make LLMs reliable and safe
Safeguards Shield is a developer toolkit to use LLMs safely and securely. Our Shield safeguards prompts and LLM interactions from costly risks to bring your AI app from prototype to production faster with confidence.
Shield takes a three-prong approach:
- Safety Detection
- Groundedness
- Repair
We can work with you to integrate Shield into your enterprise LLM applications, build custom detectors for your workflow, and integrate with your favorite monitoring / SIEM tools.
What does Shield do?
Our Shield wraps your GenAI apps with a protective layer, safeguarding malicious inputs and filtering model outputs. Our comprehensive toolkit has 20+ out-of-the-box detectors for robust protection of your GenAI apps in workflow.
From prompt injections, PII leakage, DoS to ungrounded additions (hallucinations) and harmful language detection, our shield protects LLMs through a multi-layered defense. Lastly, we can work with you to create custom detectors.
Installation
$ SAFEGUARDS_API_KEY="******"
$ pip install safeguards-shield
Contact our team to get an API key and partner together to unlock value faster.
Usage
We have both an API that can be deployed as a Docker container in your VPC or our customizable Python SDK.
Book a demo with the Safeguards team to learn more.
Features
Safeguards is a Compound AI System where it orchestrates different components to work as a cohesive unit, where each component has their specializations and can collaborate towards a common safety objective.
Our default Shield covers most of the Detectors, except for the Auto-correctors for repairing LLM outputs which is for teams who are scaling. We will work with you to fine-tune our detectors and build custom detectors for your unique workflow.
- Input Safety Detectors
- Anonymize
- Bias
- DoS Tokens
- Malware URL
- Prompt Injections
- Secrets
- Stop Input Substrings
- Toxicity
- Harmful Moderation
- Coding Language *
coding language
required - Regex *
patterns
required
- Output Safety Detectors
- Bias
- Deanonymize
- Sensitive PII
- Stop Output Substrings
- Toxicity *
perspective_api_key
required - Harmful Moderation
- Regex *
patterns
required - Coding Language *
coding language
required - Relevance
- Groundedness
- Factual Consistency
- Contradictions
- Query-Context Relevance
- Output-Context Relevance
- Query-Output Relevance
- Text Quality
Multi-layered defense
- Deep Learning Detectors: detect and filter potentially harmful input before reaching LLMs
- LLM-based detectors: use our fine-tuned safety LLM to analyze inputs and outputs to identify potential safety and security risks
- Vector Database: store previous risks in a vector database to recognize and prevent similar risks
Customize
There is no one-size-fits-all guardrail detection to prevent all risks. This is why we encourage users to reach out and partner with us to fine-tune and customize our safeguards to cover safety areas relevant to your unique use cases.