by Maya Brooks
Playbooks, reference links, and working notes for teams building with AI agents.
Before scaling an agent system, I want to see evidence that the team can replay failures, constrain tools, and prove that the automated path beats a careful human baseline on at least one meaningful workflow. If that evidence is still fuzzy, more surface area usually makes the system worse, not better.
Three evaluation axes to compare:
- reliability under messy real-world inputs
- cost per completed task and retry pattern
- clarity of escalation when confidence drops
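To make the three axes comparable across runs, it can help to reduce a batch of task logs to a small scorecard. The sketch below is one possible shape, not a standard: the `TaskRun` fields and the `scorecard` function are hypothetical names, and the escalation rate is only a rough proxy for how clear escalation actually is. Running the same function over the human baseline's logs gives you numbers to put side by side.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """One attempted task. Every field is an assumed, hypothetical log schema."""
    completed: bool   # did the task finish correctly?
    attempts: int     # 1 means no retries were needed
    cost_usd: float   # total spend across all attempts
    escalated: bool   # was the task handed to a human?


def scorecard(runs: list[TaskRun]) -> dict[str, float]:
    """Summarize a batch of runs along reliability, cost, and escalation."""
    if not runs:
        raise ValueError("need at least one run")
    completed = sum(r.completed for r in runs)
    total_cost = sum(r.cost_usd for r in runs)
    return {
        "completion_rate": completed / len(runs),
        # Failed runs still cost money, so divide total spend by successes only.
        "cost_per_completed_task": total_cost / completed if completed else float("inf"),
        "mean_attempts": sum(r.attempts for r in runs) / len(runs),
        "escalation_rate": sum(r.escalated for r in runs) / len(runs),
    }


# Made-up example data, only to show the arithmetic.
runs = [
    TaskRun(completed=True, attempts=1, cost_usd=0.10, escalated=False),
    TaskRun(completed=True, attempts=3, cost_usd=0.30, escalated=False),
    TaskRun(completed=False, attempts=2, cost_usd=0.20, escalated=True),
    TaskRun(completed=True, attempts=2, cost_usd=0.20, escalated=False),
]
card = scorecard(runs)
```

Dividing total spend by completed tasks rather than by attempted tasks is deliberate: a system that retries often and fails quietly looks cheap per attempt and expensive per result.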
Review materials:
- Model Context Protocol introduction: modelcontextprotocol.io/introduction
Worth reading so tool access and context plumbing stop feeling hand-wavy.
- OpenAI agent guide: platform.openai.com/docs/guides/agents
A practical guide to agents, tools, handoffs, and traces from the product side.
- OpenAI Agents JS source: github.com/openai/openai-agents-js
Readable source for tool calling, handoffs, tracing, and guardrails.
Save the strongest examples, scorecards, and decision memos in this folio so future teammates can see what good evaluation looked like at the time.
The real arguments in this space are no longer about whether agents exist. The live questions are where autonomy actually pays off, which actions always deserve approval, and whether multi-agent systems solve a real problem or just spread the same ambiguity across more components.
Three questions worth debating:
- where assistants end and agents begin
- how much human approval is enough in customer-facing flows
- whether multi-agent systems are worth the added complexity
Background reading before you take a strong stance:
- OpenAI Agents SDK for JavaScript: openai.github.io/openai-agents-js/
A clean look at agents, handoffs, guardrails, and tracing in one place.
- OpenAI Agents SDK for Python: openai.github.io/openai-agents-python/
Useful when your team wants the same concepts with more backend-heavy examples.
- OpenAI video archive: youtube.com/@OpenAI/videos
Talks and demos are a fast way to compare patterns before you commit to one runtime.
When you respond, include the environment you are optimizing for. Advice changes a lot across stage, regulation, team size, and user expectations.
If I were onboarding a new team to agents, I would hand them one runtime, one protocol doc, one graph-based orchestrator, and a short list of repos they can actually read over a weekend. The point is not to collect frameworks; it is to compare how each tool makes state, tools, and failure visible.
The kinds of materials worth saving in this space:
- framework docs that explain how orchestration actually works
- eval sets that resemble your real support or operations queue
- team writeups that include constraints, not just launch screenshots
Read:
- OpenAI Agents SDK for JavaScript: openai.github.io/openai-agents-js/
A clean look at agents, handoffs, guardrails, and tracing in one place.
- OpenAI Agents SDK for Python: openai.github.io/openai-agents-python/
Useful when your team wants the same concepts with more backend-heavy examples.
- Model Context Protocol introduction: modelcontextprotocol.io/introduction
Worth reading so tool access and context plumbing stop feeling hand-wavy.
Documents and downloadable guides:
- OpenAI agent guide: platform.openai.com/docs/guides/agents
A practical guide to agents, tools, handoffs, and traces from the product side.
- Model Context Protocol specification: modelcontextprotocol.io/specification/2025-06-18
Useful when readers need the actual protocol details instead of summaries.
Watch:
- OpenAI video archive: youtube.com/@OpenAI/videos
Talks and demos are a fast way to compare patterns before you commit to one runtime.
Build or inspect:
- OpenAI Agents JS source: github.com/openai/openai-agents-js
Readable source for tool calling, handoffs, tracing, and guardrails.
- LangGraph source: github.com/langchain-ai/langgraph
Helpful when you want explicit graph state, checkpoints, and resumable flows.
Image references:
- Model Context Protocol examples: modelcontextprotocol.io/examples
Reference implementations and diagrams that make the tool boundary more concrete.
The loudest failure mode is calling any multi-step prompt an agent and then discovering too late that nobody scoped the tool contract. The quieter one is letting memory, retrieval, and escalation defaults accrete into the system without someone owning them explicitly.
Common traps to watch:
- calling a single prompt chain an agent without defining real responsibilities
- letting the model discover tools that were never scoped or permissioned
- measuring demo fluency instead of production reliability
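The second trap is easier to see in code. Below is a minimal sketch of an explicit tool allowlist in plain Python, assuming a support workflow with two made-up tools. It is not the API of the Model Context Protocol or of any agent SDK; `ToolSpec`, `ToolRegistry`, `lookup_order`, and `issue_refund` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolSpec:
    """A scoped tool: its handler and whether each call needs human approval."""
    name: str
    handler: Callable[..., str]
    requires_approval: bool = False


class ToolRegistry:
    """Only tools registered here can be called; anything else is rejected."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        self._tools[spec.name] = spec

    def call(self, name: str, approved: bool = False, **kwargs) -> str:
        spec = self._tools.get(name)
        if spec is None:
            # The model asked for something nobody scoped.
            raise PermissionError(f"tool {name!r} was never scoped")
        if spec.requires_approval and not approved:
            raise PermissionError(f"tool {name!r} needs human approval")
        return spec.handler(**kwargs)


# Hypothetical tools for a support workflow.
def lookup_order(order_id: str) -> str:
    return f"order {order_id}: shipped"


def issue_refund(order_id: str) -> str:
    return f"refund issued for {order_id}"


registry = ToolRegistry()
registry.register(ToolSpec("lookup_order", lookup_order))
registry.register(ToolSpec("issue_refund", issue_refund, requires_approval=True))
```

With this shape, `registry.call("lookup_order", order_id="A1")` goes through, an unregistered name raises `PermissionError`, and `issue_refund` raises until a caller passes `approved=True`. The point of the design is that the tool contract lives in one reviewable place instead of being implied by whatever the model happens to find.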
References that help correct the drift:
- OpenAI Agents SDK for Python: openai.github.io/openai-agents-python/
Useful when your team wants the same concepts with more backend-heavy examples.
- Model Context Protocol examples: modelcontextprotocol.io/examples
Reference implementations and diagrams that make the tool boundary more concrete.
This folio post is meant to be saved and revised. Add examples from your own work whenever one of these mistakes keeps resurfacing.