
Core Concepts

AIGuard treats LLM safety as a two-stage problem. Tests run before deployment to catch regressions and unsafe behaviour, and traces are captured after deployment to monitor real-world behaviour. The same evaluation primitives power both stages.

Test before deployment. Monitor after deployment.

Pre-deployment evaluation

Before an LLM application reaches production, AIGuard runs evaluation modules against the configured target model:

  • Adversarial attack suites (prompt injection, jailbreaks, instruction overrides)
  • Hallucination checks (ground-truth, contextual grounding, self-consistency)
  • Threshold enforcement that returns a non-zero exit code when violated

Integrate this into CI so that unsafe builds fail automatically. AIGuard ships templates for GitHub Actions and GitLab CI.
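The threshold step can be pictured as a small gate that turns evaluation results into a process exit code, which is what a CI runner uses to decide whether the job failed. The sketch below is illustrative only: the names `THRESHOLDS`, `find_violations`, and `exit_code`, and the metric names and limits, are assumptions made for this example and are not AIGuard's actual API or defaults.

```python
from typing import Dict, List

# Hypothetical limits: maximum tolerated failure rate per check.
THRESHOLDS: Dict[str, float] = {
    "prompt_injection": 0.05,
    "hallucination": 0.10,
}


def find_violations(failure_rates: Dict[str, float],
                    thresholds: Dict[str, float]) -> List[str]:
    """Return the names of checks whose failure rate exceeds its limit."""
    return sorted(
        name
        for name, limit in thresholds.items()
        if failure_rates.get(name, 0.0) > limit
    )


def exit_code(failure_rates: Dict[str, float],
              thresholds: Dict[str, float]) -> int:
    """Map evaluation results to an exit code: 0 = pass, 1 = violated."""
    return 1 if find_violations(failure_rates, thresholds) else 0


# Illustrative numbers only, not real evaluation output.
observed = {"prompt_injection": 0.02, "hallucination": 0.15}
print(find_violations(observed, THRESHOLDS))  # ['hallucination']
print(exit_code(observed, THRESHOLDS))        # 1
```

A wrapper script would pass the result to `sys.exit(...)`, so that any CI system treating a non-zero exit status as failure blocks the build.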

Production monitoring

After deployment, you wrap your LLM calls with the SDK:

python
import aiguard

# Wrap the chat call with the SDK; each call also emits a TraceEvent.
response = aiguard.chat(model="gpt-4o", messages=[...])

Every call emits a TraceEvent to a background queue. A pipeline worker accumulates traces and evaluates them in batches, persisting both raw traces and evaluation results to the storage backend. The monitoring dashboard reads from there.
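The queue-and-worker flow described above can be sketched with the standard library alone. This is a minimal illustration, not AIGuard's implementation: the `TraceEvent` fields, the `run_worker` function, the `None` shutdown sentinel, and the batch size of 3 are all assumptions, and the `batch_sizes` list stands in for the storage backend.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TraceEvent:
    # Hypothetical fields; the real TraceEvent schema may differ.
    model: str
    prompt: str
    response: str


def run_worker(events: queue.Queue,
               evaluate_batch: Callable[[List[TraceEvent]], None],
               batch_size: int = 3) -> None:
    """Accumulate traces and evaluate them in batches until a None sentinel."""
    batch: List[TraceEvent] = []
    while True:
        item = events.get()
        if item is None:          # sentinel: stop consuming
            break
        batch.append(item)
        if len(batch) >= batch_size:
            evaluate_batch(batch)
            batch = []
    if batch:                     # flush the final partial batch
        evaluate_batch(batch)


events: queue.Queue = queue.Queue()
batch_sizes: List[int] = []       # stand-in for persisted evaluation results

worker = threading.Thread(
    target=run_worker,
    args=(events, lambda b: batch_sizes.append(len(b))),
    daemon=True,
)
worker.start()

# The caller only enqueues; evaluation happens off the request path.
for i in range(7):
    events.put(TraceEvent("gpt-4o", f"prompt {i}", f"response {i}"))
events.put(None)
worker.join()

print(batch_sizes)  # [3, 3, 1]
```

Putting evaluation behind a queue keeps the wrapped call's latency independent of how long evaluation takes, at the cost of results arriving slightly after the response.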

The pieces

Component        Role
SDK              aiguard.chat(), configuration, trace emission
Evaluator        Pluggable test execution framework with a universal result schema
Pipeline         Event-driven background batch evaluation of live traces
Storage          SQLite (default) or Postgres, per-project
Monitoring API   FastAPI server exposing traces, metrics, and the review queue
Monitoring UI    React + Recharts dashboard for traces and metrics
Review           Calibration-aware queue with secure token-based reviewer completion
CLI              Typer-based command surface for every workflow
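
To make "pluggable" and "universal result schema" concrete, here is one way such an evaluator layer could be shaped: every check registers under a name and returns the same result type, so pre-deployment tests and the live pipeline can consume results uniformly. Everything in this sketch (`EvalResult`, its fields, `register`, `run_all`, and the two toy checks) is hypothetical and chosen for illustration; it does not describe AIGuard's real classes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalResult:
    # Hypothetical shared schema: every evaluator returns this shape.
    evaluator: str
    passed: bool
    score: float
    details: str = ""


# Registry mapping evaluator names to callables.
EVALUATORS: Dict[str, Callable[[str], EvalResult]] = {}


def register(name: str):
    """Decorator that adds an evaluator function to the registry."""
    def decorator(fn: Callable[[str], EvalResult]):
        EVALUATORS[name] = fn
        return fn
    return decorator


@register("non_empty")
def non_empty(response: str) -> EvalResult:
    ok = bool(response.strip())
    return EvalResult("non_empty", ok, 1.0 if ok else 0.0)


@register("max_length")
def max_length(response: str) -> EvalResult:
    ok = len(response) <= 200
    return EvalResult("max_length", ok, 1.0 if ok else 0.0,
                      f"{len(response)} chars")


def run_all(response: str) -> List[EvalResult]:
    """Run every registered evaluator, in registration order."""
    return [fn(response) for fn in EVALUATORS.values()]


results = run_all("Paris is the capital of France.")
print([(r.evaluator, r.passed) for r in results])
# [('non_empty', True), ('max_length', True)]
```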