Tools for trustworthy AI

Evaluations, provenance, and reproducibility — starting with a small, auditable RAG benchmark.

Open the TrustEval-Mini Leaderboard · How to submit

What is TrustEval-Mini?

A compact RAG evaluation that scores faithfulness and attribution against a fixed manifest of sources and questions.

Leaderboard · GitHub · Submission guide

Provenance modes

  • DERIVED — Quick start. Uses the published manifest plus its hash.
  • STRICT — Cryptographically verified. We sign the manifest; the evaluator verifies it with our public key. Reports include signature_verified: true only when the signature is authentic.
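The two modes differ only in how the manifest is authenticated. A minimal sketch of the DERIVED check, assuming SHA-256 for the published hash (the function names here are illustrative, not the project's actual API):

```python
import hashlib

def manifest_sha256(path: str) -> str:
    """Hash the manifest bytes exactly as published."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def verify_derived(path: str, published_hash: str) -> bool:
    """DERIVED mode: trust the run only if the local manifest
    matches the published hash."""
    return manifest_sha256(path) == published_hash
```

STRICT mode would additionally verify a signature over the same manifest bytes with the project's public key (e.g. Ed25519 via the `cryptography` library), and set signature_verified: true in the report only when that check passes.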

How scores get on the board

Run the evaluation locally → produce report.json/report.html → add one line to scores.csv → open a pull request. CI verifies that the numbers in scores.csv match your report before merging.
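The CI gate above can be sketched as a small comparison between the submitted row and the report. Everything here is an assumption for illustration: the `run_id`, `faithfulness`, and `attribution` column/field names are hypothetical, not the benchmark's actual schema.

```python
import csv
import json

def scores_match(report_path: str, csv_path: str,
                 run_id: str, tol: float = 1e-6) -> bool:
    """CI-style check: the row added to scores.csv must match the
    numbers in the submitted report.json (field names assumed)."""
    with open(report_path) as f:
        report = json.load(f)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["run_id"] == run_id:
                # Compare within a small tolerance to allow for
                # float formatting differences in the CSV.
                return (abs(float(row["faithfulness"]) - report["faithfulness"]) <= tol
                        and abs(float(row["attribution"]) - report["attribution"]) <= tol)
    return False  # no row for this run_id at all
```

A check like this is cheap to run on every pull request, so the board only ever shows numbers backed by a matching report.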

What’s next

More utilities for data lineage, red-teaming, and reproducible runs. Collaborations welcome.

research@theoyez.org