Open source · MIT · Python ≥ 3.10
AIGuard
Evaluation, Testing, and Observability for Large Language Models.
AIGuard helps developers evaluate LLM applications before deployment and continuously monitor them in production through hallucination detection, adversarial testing, human review workflows, and trace analysis.
Install
pip install aiguard-safetyTest before deployment. Monitor after deployment.
Core capabilities
A focused set of building blocks for safer, more reliable LLM applications.
Hallucination Detection
Evaluate model outputs using structured classification: ground-truth verification, contextual grounding, and self-consistency analysis.
Read moreAdversarial Testing
Test models against prompt injection, jailbreak prompts, and instruction override attacks before deployment.
Read moreHuman Review
Escalate uncertain evaluations to a reviewer queue for manual validation and calibration feedback.
Read moreTrace Explorer
Inspect prompts, responses, latency, metadata, and evaluation results from every LLM call.
Read moreTwo stages, one toolkit
Run adversarial attack suites and hallucination checks against your model. Enforce thresholds in CI to block unsafe releases.
CI/CD integrationCapture traces from every call with the SDK. Continuously score for hallucinations and escalate uncertain cases to human review.
Monitoring dashboard