
Build a Baseball Data Analysis Agent w/ LangGraph

Beginner Guided Project

Learn how to build an AI-powered baseball data analyst using LangGraph and Pandas that can analyze World Series statistics and answer questions using natural language. In this guided project, you’ll explore LangGraph and Pandas Agents to create a smart workflow that routes queries to the correct dataset, interprets structured data, and generates real-time insights. Perfect for beginners in data science, AI agents, or sports analytics looking to combine data reasoning with automation.

Language

  • English

Topic

  • Artificial Intelligence

Skills You Will Learn

  • Artificial Intelligence, LangGraph, LLMs, Python, Tool Calling

Offered By

  • IBMSkillsNetwork

Estimated Effort

  • 30 minutes

Platform

  • SkillsNetwork

Last Update

  • November 4, 2025

About this Guided Project

In this guided project, you’ll learn how to build an AI-powered World Series Baseball Data Assistant using LangGraph and LangChain Pandas Agents. You’ll design a smart, two-step agentic workflow that (1) routes a natural-language question to the correct dataset (World Series, playoffs, regular season, team, batting, or pitching) and (2) queries that data with a ReAct-style Pandas agent to produce clear, human-readable answers.
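
As a rough illustration of step (1), keyword routing can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the project's actual code: the dataset labels, the keyword lists, the `AssistantState` type, and the `route` function are all hypothetical names chosen for this example. In the project itself, a function like this would presumably be registered as a node in a LangGraph workflow.

```python
from typing import TypedDict

# Hypothetical dataset labels and keywords, for illustration only;
# the project's real routing table is not shown in this description.
DATASET_KEYWORDS = {
    "world_series": ["world series"],
    "playoffs": ["playoff", "postseason"],
    "batting": ["batting", "home run", "hitter"],
    "pitching": ["pitch", "strikeout"],
    "team": ["team", "franchise"],
}
DEFAULT_DATASET = "regular_season"  # assumed fallback when nothing matches


class AssistantState(TypedDict, total=False):
    """State passed from the routing step to the query step."""
    question: str
    dataset: str
    answer: str


def route(state: AssistantState) -> AssistantState:
    """Pick a dataset by scanning the question for known keywords."""
    text = state["question"].lower()
    # Dictionaries keep insertion order, so earlier entries win ties.
    for dataset, keywords in DATASET_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return {**state, "dataset": dataset}
    return {**state, "dataset": DEFAULT_DATASET}
```

Calling `route({"question": "Who won the World Series in 1998?"})` returns a state whose `dataset` is `"world_series"`. The query step can then read that field to choose which DataFrame agent to call.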

This project is perfect for learners who want to go beyond simple prompts and understand how to create agentic AI systems that reason over structured data. You’ll gain practical experience connecting LLM reasoning, keyword routing, robust error handling, and state management inside a LangGraph workflow. These are skills you can reuse for sports analytics, finance dashboards, or any data-driven assistant.

By the end of this project, you will be able to:
  • Understand the fundamentals of LangGraph and how it structures multi-step agentic workflows.
  • Build a compact route → query pipeline that translates natural-language questions into Pandas operations.
  • Create and configure Pandas DataFrame agents for multiple CSV datasets (World Series, playoffs, regular season, team, player batting, pitcher stats).
  • Implement a routing node that maps questions to the right dataset and a query node with guided retries and schema previews for stability.
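
The "guided retries and schema previews" idea in the last objective can be sketched as follows. This is a hedged, stdlib-only illustration rather than the project's implementation: `schema_preview`, `query_with_retries`, and the `ask` callable are hypothetical names, and `ask` stands in for whatever invokes the Pandas agent. The assumption is that when a query fails, the next attempt repeats the question together with the error message and a short preview of the dataset's columns.

```python
from typing import Callable

Row = dict[str, object]


def schema_preview(rows: list[Row], sample_size: int = 2) -> str:
    """Describe the columns and show a few sample rows as plain text."""
    columns = list(rows[0].keys()) if rows else []
    sample = rows[:sample_size]
    return f"Columns: {columns}\nSample rows: {sample}"


def query_with_retries(
    ask: Callable[[str], str],
    question: str,
    rows: list[Row],
    max_attempts: int = 3,
) -> str:
    """Call `ask`; after a failure, retry with the error and a schema preview."""
    prompt = question
    for _ in range(max_attempts):
        try:
            return ask(prompt)
        except Exception as error:  # broad on purpose: agent failures vary
            # Guide the next attempt with what went wrong and what exists.
            prompt = (
                f"{question}\n\n"
                f"The previous attempt failed with: {error}\n"
                f"Use only these columns.\n{schema_preview(rows)}"
            )
    # Every attempt failed: return a readable message instead of raising.
    return "Sorry, I could not answer that question from the selected dataset."
```

Returning a fallback message keeps the workflow from crashing on a malformed query, at the cost of hiding the underlying error from the caller. A real implementation might also log each failure.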

What You'll Need

This project is designed for beginner to intermediate learners with a basic understanding of Python. No prior experience with LangGraph or agentic AI is required; everything is explained step by step.

You’ll need:
  • Basic familiarity with Python programming and running Jupyter notebooks
  • An internet connection to install packages
All other libraries will be installed automatically within the lab environment.

Instructors

Malik Ali

Data Scientist Intern

Hey there, my name is Malik! I'm currently a data science intern at IBM looking to use the data science and machine learning concepts I'm learning to solve real-world problems. I have a Bachelor's in Management Information Systems and am currently pursuing my Master's in Analytics at Georgia Tech.
