8 modules · 2 days

What you'll learn.

Day 1 covers agent architecture fundamentals and core production patterns. Day 2 goes deep on multi-agent systems, output reliability, evals, and observability.

Day 1 Agent foundations & production patterns

LangGraphPydantic AIsmolagents

Agent Architecture Patterns

ReAct, Plan-Execute, LATS, and reflection loops — when each pattern applies, what breaks them in production, and how to choose the right architecture before you write a line of code.

Anthropic tool useOpenAI function callingMCP

Tool Use & Function Calling at Depth

Beyond hello-world tool calls: parallel tool invocation, tool call failure recovery, tool output validation, streaming tool results, and defensive patterns for tools that return garbage.

QdrantpgvectorLangMem

Memory Architecture

Short-term (in-context), long-term (vector + structured store), episodic, and semantic memory — how to design memory that scales across long-running agent sessions without bloating context.

Prompt cachingContext compactionToken budgets

Context Window Management

The most underrated production problem. Compaction strategies, selective summarisation, KV cache reuse, structured truncation — how agents stay coherent over thousands of turns.

Day 2 Multi-agent systems, evals & production readiness

CrewAIAutoGen / AG2LangGraph multi-agent

Multi-Agent Orchestration

Supervisor-worker, peer-to-peer, and pipeline topologies for multi-agent systems. State passing between agents, shared memory design, and avoiding the classic failure: agents talking in circles.

InstructorPydanticGuardrails AI

Structured Outputs & Guardrails

Forcing reliable JSON from any LLM, input/output guardrails that don't kill performance, domain-specific validation, and building self-correcting agents that catch their own schema errors.

RAGASLangSmithcustom eval frameworks

Eval Harness Design

How to write evals that actually catch failures before users do — LLM-as-judge patterns, trajectory evaluation, regression test suites for agents, and CI/CD integration for AI systems.

LangSmithArize PhoenixOpenTelemetry

Production Debugging & Observability

Tracing agent execution, identifying where reasoning goes wrong, latency profiling, cost attribution across agent chains, and building dashboards that make silent failures visible.

What you walk away with

Three deliverables,
not just takeaways.

Every participant leaves with working code and reusable frameworks — not just slides.

Working agent codebase

Each team ships a working multi-agent system during the workshop — with memory, tool use, and basic evals wired in. Code is yours to keep and extend.

Architecture decision guide

A documented framework for choosing agent architectures, orchestration patterns, and memory strategies for your specific use cases — usable by your team long after the workshop.

Eval starter kit

A ready-to-run eval harness with LLM-as-judge patterns, trajectory tests, and CI integration templates — so your team can measure agent quality from day one of production.

Pricing & format

In Person, Hands-On.

In-Person

On enquiry

pricing depends on location, travel & customisation

Duration 2 full days (9am–5pm) or can be split across 4 half-days

Group size Up to 30 — small enough for live Q&A and lab help

Format 70% hands-on labs · 30% concept and architecture review

Pre-work Brief call + setup guide sent 1 week before. Labs adapted to your stack.

Post-workshop 30 days async Q&A with facilitators included in every booking

Book this workshop

Who attends

Built for engineers who need to ship agents, not just understand them.

This is a technical workshop. Participants should be comfortable with Python and have used at least one LLM API. We skip the basics and go straight to production-grade patterns.

Backend engineers

Moving from API integrations to full agentic systems

AI/ML engineers

Taking agent prototypes to production-grade implementations

Full-stack developers

Building AI-native features into existing products

Platform engineers

Designing infrastructure for AI workloads at scale

Engineering managers

Understanding the real constraints and tradeoffs of agentic systems

Real-world context

Use cases we build
during the workshop.

Labs are structured around real business use cases — not toy demos. We adapt these to your industry in the pre-workshop brief.

Sales

Before you reach out.

Do we need prior AI experience to attend?

Participants should be comfortable with Python and have used at least one LLM API (OpenAI, Anthropic, or similar). We don't teach prompt engineering basics — we go straight to production patterns. Senior engineers new to AI but experienced in Python will do fine.

Which LLM provider do you use in the labs?

Labs are provider-agnostic by design — we show the same patterns on Anthropic Claude and OpenAI GPT-4o. If your team uses a specific model or hosts your own (Llama, Mistral, etc.), let us know in the pre-workshop brief and we'll adapt the examples.

What do participants need to set up before the workshop?

We send a pre-workshop setup guide 1 week in advance: Python environment, API keys for the LLM provider we'll use, and a few libraries installed. Setup takes about 30 minutes. If your team uses corporate machines with restrictions, we'll plan around that in the brief.

Is this relevant for our specific industry?

Yes — we adapt the use cases and lab exercises to your domain in the pre-workshop brief. BFSI, healthcare, SaaS, logistics, and manufacturing all have specific agent patterns worth teaching to. Generic demos waste everyone's time.

How current is the curriculum?

It's updated every quarter. The current version covers MCP (Model Context Protocol), LangGraph's latest state management APIs, multi-agent patterns from AutoGen v0.4+, and context compaction strategies — not what was best practice in 2023.

What happens if our team has follow-up questions after the workshop?

30 days of async Q&A with the facilitators is included in every booking. You can ask specific implementation questions, get architecture reviews, or request code walkthroughs as your team moves into production.

Agentic AI Programming.
For teams who need to ship it.

What you'll learn.

Agent Architecture Patterns

Tool Use & Function Calling at Depth

Memory Architecture

Context Window Management

Multi-Agent Orchestration

Structured Outputs & Guardrails

Eval Harness Design

Production Debugging & Observability

Three deliverables,
not just takeaways.

Working agent codebase

Architecture decision guide

Eval starter kit

In Person, Hands-On.

Built for engineers who need to ship agents, not just understand them.

Use cases we build
during the workshop.

Lead Qualification Agent

Bulk Invoice Parsing

Legal Knowledge Research

Before you reach out.

Do we need prior AI experience to attend?

Which LLM provider do you use in the labs?

What do participants need to set up before the workshop?

Is this relevant for our specific industry?

How current is the curriculum?

What happens if our team has follow-up questions after the workshop?

Ready to build agents
that ship to production?

Agentic AI Programming. For teams who need to ship it.

What you'll learn.

Agent Architecture Patterns

Tool Use & Function Calling at Depth

Memory Architecture

Context Window Management

Multi-Agent Orchestration

Structured Outputs & Guardrails

Eval Harness Design

Production Debugging & Observability

Three deliverables,not just takeaways.

Working agent codebase

Architecture decision guide

Eval starter kit

In Person, Hands-On.

Built for engineers who need to ship agents, not just understand them.

Use cases we buildduring the workshop.

Lead Qualification Agent

Bulk Invoice Parsing

Legal Knowledge Research

Before you reach out.

Do we need prior AI experience to attend?

Which LLM provider do you use in the labs?

What do participants need to set up before the workshop?

Is this relevant for our specific industry?

How current is the curriculum?

What happens if our team has follow-up questions after the workshop?

Ready to build agentsthat ship to production?

Agentic AI Programming.
For teams who need to ship it.

Three deliverables,
not just takeaways.

Use cases we build
during the workshop.

Ready to build agents
that ship to production?