
Multi-Agent RAG Smart Document QA with Docling & LangGraph

Intermediate Guided Project

Build a multi-agent RAG document question-answering system using LangGraph workflows and Docling for document processing. Learn to extract content from PDFs with Docling, implement hybrid retrieval combining BM25 and vector search, and create specialized agents for relevance checking, research, and verification. Integrate with IBM watsonx.ai for embeddings and language models to generate accurate answers from documents. Master techniques for document chunking, caching, fact-checking responses, and handling complex questions through coordinated agent interactions.

Language

  • English

Topic

  • Artificial Intelligence

Skills You Will Learn

  • RAG, AI Agent, LLM, Generative AI, LangGraph, Docling

Offered By

  • IND

Estimated Effort

  • 60 minutes

Platform

  • SkillsNetwork

Last Update

  • May 5, 2025
About this Guided Project
Imagine having an AI assistant at your fingertips that can understand complex documents and answer your questions accurately without requiring you to read through hundreds of pages. What if you could simply upload a technical report, legal document, or research paper and get precise answers based on its content? This is the power of combining multi-agent RAG systems with document intelligence.

In this hands-on lab, you'll build a sophisticated question-answering system that makes document comprehension accessible to everyone in your organization—from researchers analyzing technical papers to legal teams extracting insights from contracts, all without the tedious manual review.

Project Overview

This lab teaches you to create an intelligent document processing system that handles the entire question-answering workflow:
1️⃣ Document Processing & Chunking - Extract text from PDFs and other formats, split it into searchable chunks, and cache the results for performance
2️⃣ Hybrid Retrieval - Combine keyword-based BM25 and semantic vector search for optimal document retrieval
3️⃣ Multi-Agent Verification - Use specialized agents for relevance checking, research, and fact verification
4️⃣ LangGraph Orchestration - Coordinate agent interactions with conditional workflows and feedback loops
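
To make the hybrid retrieval step concrete, here is a minimal sketch in plain Python. It is not the lab's implementation: the function names (`bm25_scores`, `hybrid_rank`), the `alpha` weight, the min-max score fusion, and the hand-written two-dimensional vectors that stand in for real embeddings are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an illustrative simplification)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Non-negative IDF variant.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def min_max(xs):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(xs), max(xs)
    return [0.0] * len(xs) if hi == lo else [(x - lo) / (hi - lo) for x in xs]


def hybrid_rank(query, query_vec, docs, doc_vecs, alpha=0.5):
    """Return document indices, best first, by a weighted score fusion.

    alpha weights the keyword (BM25) signal; 1 - alpha weights the
    vector-similarity signal. Both are hypothetical parameters here.
    """
    keyword = min_max(bm25_scores(query, docs))
    semantic = min_max([cosine(query_vec, v) for v in doc_vecs])
    fused = [alpha * k + (1 - alpha) * s for k, s in zip(keyword, semantic)]
    return sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)


docs = [
    "the warranty lasts two years",
    "shipping takes five days",
    "coverage period is twenty four months",
]
# Toy vectors standing in for embeddings from a real embedding model.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
ranking = hybrid_rank("warranty period", [1.0, 0.0], docs, doc_vecs)
print([docs[i] for i in ranking])
```

In this toy corpus the third document shares only one keyword with the query, but its vector is close to the query vector, so the fused score still places it above the unrelated shipping document. Raising `alpha` toward 1 favors exact keyword matches; lowering it favors semantic similarity.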

By connecting specialized agents through LangGraph, you'll create a seamless experience where users can upload documents and get verified, accurate answers using IBM's Granite models.
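
The control flow that connects the agents can be sketched without any framework. The example below is a plain-Python illustration, not LangGraph code and not the lab's implementation: `run_workflow`, `QAState`, `keyword_relevance`, and the `max_attempts` limit are hypothetical names, and the three agents are passed in as simple callables where the lab would use LLM-backed agents.

```python
from typing import Callable, List, TypedDict


class QAState(TypedDict):
    """Shared state handed from one agent to the next (hypothetical shape)."""
    question: str
    context: List[str]
    draft: str
    verified: bool
    attempts: int


def _words(text):
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip("?.,!;:") for w in text.lower().split()}


def keyword_relevance(question, context):
    """Stand-in relevance checker: any shared word counts as relevant."""
    q = _words(question)
    return any(q & _words(chunk) for chunk in context)


def run_workflow(
    question: str,
    context: List[str],
    is_relevant: Callable[[str, List[str]], bool],
    research: Callable[[str, List[str]], str],
    verify: Callable[[str, List[str]], bool],
    max_attempts: int = 3,
) -> QAState:
    state: QAState = {
        "question": question,
        "context": context,
        "draft": "",
        "verified": False,
        "attempts": 0,
    }
    # Conditional branch: stop early when the documents are off-topic.
    if not is_relevant(question, context):
        state["draft"] = "The documents do not appear to cover this question."
        return state
    # Feedback loop: re-run research until verification passes
    # or the attempt budget is used up.
    while state["attempts"] < max_attempts:
        state["attempts"] += 1
        state["draft"] = research(question, context)
        if verify(state["draft"], context):
            state["verified"] = True
            break
    return state


context = ["The warranty lasts two years."]
result = run_workflow(
    "How long is the warranty?",
    context,
    is_relevant=keyword_relevance,
    research=lambda q, ctx: ctx[0],      # stub: quote the first chunk
    verify=lambda draft, ctx: draft in ctx,  # stub: draft must appear verbatim
)
print(result["draft"], result["verified"], result["attempts"])
```

In a graph framework, each callable would become a node and the two decisions (relevant or not, verified or not) would become conditional edges. The attempt limit is a design choice that keeps a failing verification from looping forever.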

What You'll Learn

By completing this lab, you will:
  • Design effective document processing pipelines with caching and deduplication
  • Build hybrid retrieval systems that balance keyword precision with semantic understanding
  • Create specialized AI agents for different stages of the question-answering process
  • Implement verification mechanisms to ensure factual accuracy
  • Orchestrate complex workflows with conditional branching and feedback loops
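
As an illustration of the first objective, the sketch below shows one way chunking, deduplication, and caching can fit together using only the standard library. It is an assumption-laden example rather than the lab's code: `chunk_words`, `dedupe`, `ChunkCache`, and the word-based `size` and `overlap` parameters are hypothetical, and plain strings stand in for text that a parser such as Docling would extract.

```python
import hashlib


def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words that share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def dedupe(chunks):
    """Drop exact duplicate chunks, keeping first occurrences in order."""
    seen, unique = set(), []
    for chunk in chunks:
        digest = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(chunk)
    return unique


class ChunkCache:
    """In-memory cache keyed by a hash of the document text."""

    def __init__(self):
        self._store = {}
        self.misses = 0  # counts how often a document had to be processed

    def get_chunks(self, text, size=50, overlap=10):
        # The key covers the content and the chunking settings, so the same
        # document processed with different settings is cached separately.
        raw = f"{size}:{overlap}:{text}".encode("utf-8")
        key = hashlib.sha256(raw).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = dedupe(chunk_words(text, size, overlap))
        return self._store[key]


cache = ChunkCache()
document = "alpha beta gamma delta epsilon"
first = cache.get_chunks(document, size=3, overlap=1)
second = cache.get_chunks(document, size=3, overlap=1)  # served from cache
print(first, cache.misses)
```

Hashing the content rather than the filename means a re-uploaded copy of the same document is recognized even if it has been renamed. A persistent store (for example, files on disk) could replace the dictionary if the cache needs to survive restarts.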

Who Should Do This Lab

This project is ideal for:
  • Developers looking to build practical document intelligence applications
  • Data scientists wanting to make document insights accessible to non-technical colleagues
  • AI enthusiasts interested in creating trustworthy information retrieval systems
No advanced ML expertise required—basic Python knowledge and curiosity about RAG applications are all you need.

What You Need

  • A browser to access the lab environment
  • Basic Python knowledge (understanding functions and data structures)
  • Basic LangChain and LLM knowledge
  • Sample documents (we provide examples, or bring your own PDFs)

By the end of this project, you'll have built an AI document assistant that transforms how people interact with information—enabling anyone to ask questions about complex documents and receive verified, accurate answers in seconds.

Instructors

Karan Goswami

Data Scientist

I am a dedicated Data Scientist and an AI enthusiast, currently working at IBM's Skills Builder Network. Learning how some simple mathematical operations could be used to make predictions and discover patterns sparked my curiosity, leading me to explore the exciting world of AI. Over the years, I’ve gained hands-on experience in building scalable AI solutions, fine-tuning models, and extracting meaningful insights from complex datasets. I'm driven by a desire to apply these skills to solve real-world problems and make a meaningful impact through AI.

Hailey Quach

Data Scientist

Hi, I'm Hailey. I enjoy teaching others to build creative and impactful AI projects. By day, I’m a Data Scientist at IBM; by night, an Honors BSc student at Concordia University in Montreal, always exploring new ways to combine learning with innovation.

Contributors

Joseph Santarcangelo

Senior Data Scientist at IBM

Joseph has a Ph.D. in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. He has worked at IBM since completing his Ph.D.
