Before I trust a safety strategy at scale, I want to see documented risks, recurring eval coverage, named owners for mitigations, and a record of at least a few launch or scope decisions that changed because of the findings. That is what separates a safety practice from a safety posture deck.
Three evaluation axes to compare:
- clarity of the threat model
- repeatability of the evaluation process
- evidence that the findings change deployment choices
Review materials:
- Inspect documentation: inspect.aisi.org.uk/
One of the best places to see evaluation design turned into runnable workflows.
- AI RMF Playbook: airc.nist.gov/AI_RMF_Knowledge_Base/Playbook
The most useful NIST material when a team needs implementation moves, not just principles.
- Inspect source: github.com/UKGovernmentBEIS/inspect_ai
Open source evaluation framework from the UK AI Security Institute.
Save the strongest examples, scorecards, and decision memos in this folio so future teammates can see what good evaluation looked like at the time.
Keep Exploring
Jump to the author, the parent community or folio, and a few closely related posts.
Related Posts
Three live arguments in AI safety that are worth having in public
The hard public questions are about threshold-setting: what evidence should be required before launch, how much outside scrutiny is enough, and when a voluntary...
Noah Kim in AI Safety Notes · 0 likes · 0 comments
A genuinely useful starter pack for AI safety
A usable safety starter pack should have one framework, one research archive, one evaluation tool, and one red-teaming toolkit. That mix gives people language, ...
TopicFolio Research in AI Safety Notes · 0 likes · 0 comments
The quiet mistakes that slow people down in AI safety
The common trap is treating policy text as if it were a control. The next trap is benchmarking only polished prompts and then sounding surprised when messy real...
TopicFolio Editorial in AI Safety Notes · 0 likes · 0 comments
Explore more organized conversations on TopicFolio.