AI & Machine Learning · Technical Course · LearnAspire Certified

Production RAG with LangGraph: State Machines, Routing, and Observability for AI Engineers

“Your RAG Chain Works in Dev. Now Make It Survive Production.”

Replace a brittle LangChain RAG chain with a LangGraph workflow that routes queries by intent, reranks with Cohere, enforces token budgets, and emits auditable LangSmith traces.

Intermediate · 12h · 6 modules · 51 slides · 18 exercises · 24 quiz Qs
Launch Price: 63% off. Limited time.
₹2,999 (regular price ₹7,999)

One-time · Lifetime access · Certificate included

7-day money-back guarantee
  • ✓ 6 modules of content
  • ✓ 51 concept slides
  • ✓ 18 practical exercises
  • ✓ 24 quiz questions
  • ✓ Capstone project
  • ✓ LearnAspire certificate

Learning Outcomes

What you'll learn

→ You will be able to define a typed LangGraph StateGraph with a ClarifindQueryState TypedDict, wire named nodes for classification, retrieval, reranking, and generation, and execute the graph end-to-end against a live Pinecone serverless index using LangGraph 0.2.28.
→ You will be able to implement a three-branch query intent router in LangGraph that classifies incoming queries as semantic, keyword, or direct-lookup and dispatches each to the correct retriever node, eliminating the single-retriever bottleneck behind the 30% irrelevant-result rate in Clarifin's current prototype.
→ You will be able to instrument every node in a LangGraph workflow with LangSmith SDK 0.1.x run contexts so that any failed retrieval can be traced to the exact graph node, branch decision, and document chunk, producing a trace URL that a compliance officer can open and audit.
→ You will be able to attach a per-query token budget guard to a LangGraph workflow using tiktoken 0.7 that short-circuits expensive reranking and generation paths when accumulated token cost exceeds $0.004, and write a pytest 8.1 test that asserts the guard fires on a known over-budget fixture.
→ You will be able to evaluate a LangGraph RAG workflow against Clarifin's three production SLAs (p95 latency under 3 seconds, cost under $0.004 per query, and 100% chunk traceability), produce a benchmark report naming which nodes breach each SLA, and justify a specific architectural change to fix each breach.
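The typed state and three-branch routing named in the outcomes above can be sketched in plain Python before any LangGraph wiring. This is a minimal illustration rather than the course's implementation: the state fields, the keyword heuristic (a production classifier might call an LLM instead), the document-ID pattern, and the direct_lookup node name are all assumptions. Only classify_intent, dense_retrieve, and sparse_retrieve are names taken from the capstone description.

```python
import re
from typing import Literal, TypedDict

Intent = Literal["semantic", "keyword", "direct_lookup"]


class ClarifindQueryState(TypedDict, total=False):
    """Shared graph state. Field names here are illustrative assumptions."""
    query: str
    intent: Intent
    documents: list[str]
    answer: str


# Hypothetical identifier format such as "REG-2023-014"; adjust to real data.
DOC_ID_PATTERN = re.compile(r"\b[A-Z]{2,5}-\d{4}-\d{1,4}\b")

# Intent label -> retriever node name. "direct_lookup" is an assumed node name.
ROUTES: dict[str, str] = {
    "semantic": "dense_retrieve",
    "keyword": "sparse_retrieve",
    "direct_lookup": "direct_lookup",
}


def classify_intent(state: ClarifindQueryState) -> ClarifindQueryState:
    """Toy heuristic classifier; returns only the state key it updates."""
    query = state["query"]
    if DOC_ID_PATTERN.search(query):
        intent: Intent = "direct_lookup"
    elif '"' in query or len(query.split()) <= 3:
        intent = "keyword"
    else:
        intent = "semantic"
    return {"intent": intent}


def route_by_intent(state: ClarifindQueryState) -> str:
    """Return the name of the retriever node that should run next."""
    return ROUTES[state["intent"]]


if __name__ == "__main__":
    for q in (
        "What does REG-2023-014 require?",
        "capital adequacy ratio",
        "How should a lender disclose rate changes to retail borrowers?",
    ):
        state: ClarifindQueryState = {"query": q}
        state.update(classify_intent(state))
        print(f"{state['intent']:>13} -> {route_by_intent(state)}")
```

In LangGraph, a function shaped like route_by_intent is the kind of callable passed to StateGraph.add_conditional_edges, which uses its return value to pick the next node.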

The day after you finish

The day after completing this course, you will open Clarifin's existing RetrievalQA chain and replace it with a LangGraph StateGraph that has a typed ClarifindQueryState, a classify_intent node, three retriever branches (Pinecone dense, BM25 sparse, direct metadata lookup), a Cohere Rerank v3 node, a token budget guard, and LangSmith run logging on every node. You will then push it to a feature branch and share the LangSmith project URL with your tech lead as proof that the workflow executed and every branch decision is visible.
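The token budget guard mentioned above comes down to simple arithmetic: convert accumulated token counts to dollars and compare against the $0.004 ceiling. The sketch below assumes token counts arrive as plain integers (in the course they would come from tiktoken), uses placeholder per-1K-token prices that are not real provider rates, and invents the short_circuit label for the early-exit path.

```python
from dataclasses import dataclass

BUDGET_USD = 0.004  # per-query ceiling stated in the course outline

# Placeholder prices for illustration only; substitute your provider's rates.
INPUT_USD_PER_1K = 0.0005
OUTPUT_USD_PER_1K = 0.0015


@dataclass
class TokenUsage:
    """Running token totals for a single query."""
    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, input_tokens: int = 0, output_tokens: int = 0) -> None:
        """Accumulate tokens spent by one node."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def cost_usd(self) -> float:
        """Dollar cost of all tokens accumulated so far."""
        return (
            self.input_tokens * INPUT_USD_PER_1K
            + self.output_tokens * OUTPUT_USD_PER_1K
        ) / 1000


def budget_guard(usage: TokenUsage, budget_usd: float = BUDGET_USD) -> str:
    """Pick the next step: skip the expensive path once the ceiling is exceeded."""
    # "short_circuit" is a hypothetical label; "cohere_rerank" is from the capstone.
    return "short_circuit" if usage.cost_usd > budget_usd else "cohere_rerank"


if __name__ == "__main__":
    usage = TokenUsage()
    usage.add(input_tokens=1000, output_tokens=200)
    print(budget_guard(usage))  # under the ceiling at the placeholder prices
    usage.add(input_tokens=5000, output_tokens=800)
    print(budget_guard(usage))  # over the ceiling at the placeholder prices
```

A pytest case for the guard can follow the same shape: build an over-budget TokenUsage fixture and assert that budget_guard returns the short-circuit label.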

Who this is for

  • Primary: ML Engineer or Senior AI Engineer with 2–4 years of experience who has shipped a RAG prototype and is now responsible for making it production-grade under latency, cost, and audit constraints
  • Secondary: Platform or MLOps Engineer responsible for deploying and operating RAG pipelines at scale who needs to add observability, fallback logic, and cost governance without rewriting the retrieval core
  • Tertiary: Tech Lead or Staff Engineer evaluating whether a RAG prototype is ready for GA; this course gives them the vocabulary, patterns, and failure modes to make that call

Prerequisites

  • Working knowledge of a basic RAG pipeline: you have personally built query → embed → retrieve → generate in LangChain or an equivalent framework, rather than watched someone else do it
  • Comfortable reading and writing Python 3.10+, including TypedDict, dataclasses, async/await, and pytest fixtures
  • Hands-on experience with at least one vector DB (Pinecone, Weaviate, or Chroma) and the OpenAI API: you know what an index namespace is and have called client.query() before
  • No prior LangGraph knowledge required, but you must already find LangChain's linear chain pattern limiting in practice

Curriculum

6 modules · full breakdown

Part of: AI Engineering Path

Step 1: Foundations
→ Step 2: Core Skills
→ Step 3: RAG
→ Step 4: LangGraph RAG (this course)
→ Step 5: Agent Systems
→ Step 6: Production
→ Step 7: MCP
→ Step 8: Enterprise

Capstone Project

Clarifin Compliance RAG: Production LangGraph Module with Routing, Reranking, Cost Guards, and Audit Traces

Build and deliver a fully self-contained Python package, clarifin_rag/, that implements a LangGraph 0.2.28 StateGraph with typed ClarifindQueryState, five named nodes (classify_intent, dense_retrieve, sparse_retrieve, cohere_rerank, guarded_generate), a conditional edge router, a tiktoken-based token budget guard enforcing the $0.004 per-query ceiling, and LangSmith SDK instrumentation on every node. The package includes a pytest 8.1 test suite with fixtures covering the happy path, the BM25 fallback trigger, and the cost-guard short-circuit. A benchmark script runs 50 queries from a provided regulatory fixture set against a live Pinecone serverless index and outputs a cost-per-query CSV and a p95 latency measurement. A one-page deployment guide documents the environment variables, Pinecone index schema, and LangSmith project configuration needed to reproduce the audit dashboard.
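The benchmark step described above (per-query cost CSV plus a p95 latency figure) can be illustrated with the standard library alone. This is a sketch under stated assumptions: the row keys query_id, latency_s, and cost_usd are hypothetical, the percentile uses the nearest-rank method (other definitions interpolate and give slightly different values), and the measurements themselves would come from real runs rather than the synthetic rows shown here.

```python
import csv
import io

P95_LATENCY_SLA_S = 3.0  # "p95 latency under 3 seconds"
COST_SLA_USD = 0.004     # "cost under $0.004 per query"


def p95(values):
    """Nearest-rank 95th percentile: smallest sample with >= 95% of samples at or below it."""
    if not values:
        raise ValueError("p95 requires at least one sample")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer form of ceil(0.95 * n)
    return ordered[rank - 1]


def summarize(rows):
    """Check latency and cost SLAs over rows with (assumed) keys query_id, latency_s, cost_usd."""
    latency_p95 = p95([row["latency_s"] for row in rows])
    over_cost = [row["query_id"] for row in rows if row["cost_usd"] >= COST_SLA_USD]
    return {
        "p95_latency_s": latency_p95,
        "latency_sla_met": latency_p95 < P95_LATENCY_SLA_S,
        "queries_over_cost": over_cost,
        "cost_sla_met": not over_cost,
    }


def to_csv(rows) -> str:
    """Serialize per-query results as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["query_id", "latency_s", "cost_usd"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Synthetic rows for demonstration; not real benchmark results.
    rows = [
        {"query_id": f"q{i}", "latency_s": i / 10, "cost_usd": 0.001}
        for i in range(1, 21)
    ]
    print(summarize(rows))
    print(to_csv(rows[:2]))
```

Writing the string returned by to_csv to benchmark/results.csv would produce a file in the shape the deliverables list asks for.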

What you'll deliver

A GitHub-pushable repository containing: (1) clarifin_rag/ Python package with fully typed LangGraph StateGraph and all five node implementations, (2) tests/ directory with pytest 8.1 suite covering three critical paths, (3) benchmark/results.csv showing cost-per-query and latency across 50 regulatory queries, (4) a LangSmith dashboard screenshot with query routing decisions visible across at least three branch types, and (5) DEPLOYMENT.md with environment setup, index schema, and audit configuration instructions.