IBM Data Analyst Capstone Project

Advanced Course

This capstone project lets you apply data analytics skills and techniques in a professional setting. You will work with real datasets to collect, wrangle, analyze, and visualize data and to build interactive dashboards. Using tools such as Jupyter Notebooks, SQL, IBM Cognos Analytics, and Python libraries (Pandas, NumPy, Scikit-learn, etc.), you will deliver a comprehensive analysis report for stakeholders. Completing the prior courses in the certificate is recommended before starting this capstone project.

4.7 (1k+ Reviews)

Language

  • English

Topic

  • Data Analysis

Industries

  • Information Technology

Enrollment Count

  • 70.17K

Skills You Will Learn

  • Data Analysis, Exploratory Data Analysis, Dashboard Creation, Data Wrangling, Data Collection

Offered By

  • IBM Skills Network

Estimated Effort

  • 6 weeks

Platform

  • Coursera

Last Update

  • April 25, 2025

About this Course

In an increasingly data-centric world, the ability to derive meaningful insights from raw data is essential. The IBM Data Analyst Capstone Project gives you the opportunity to apply the skills and techniques learned throughout the IBM Data Analyst Professional Certificate. Working with actual datasets, you will carry out tasks commonly performed by professional data analysts, such as data collection from multiple sources, data wrangling, exploratory analysis, statistical analysis, data visualization, and creating interactive dashboards. Your final deliverable will include a comprehensive data analysis report, complete with an executive summary, detailed insights, and a conclusion for organizational stakeholders. 

Throughout the project, you will demonstrate your proficiency with tools such as Jupyter Notebooks, SQL, relational database management systems (RDBMS), and Business Intelligence (BI) tools like IBM Cognos Analytics. You will also apply Python libraries, including Pandas, NumPy, Scikit-learn, SciPy, Matplotlib, and Seaborn.

We recommend completing the previous courses in the Professional Certificate before starting this capstone project, as it integrates all key concepts and techniques into a single, real-world scenario.

Learning Objectives

  • Apply techniques to gather and wrangle data from multiple sources.
  • Analyze data to identify patterns, trends, and insights through exploratory techniques.
  • Create visual representations of data using Python libraries to communicate findings effectively.
  • Construct interactive dashboards with BI tools to present and explore data dynamically.

Course Syllabus

Module 1: Welcome to the Course
  • Lesson 0: Welcome
  • Lesson 1: Collecting Data Using APIs
  • Lesson 2: Collecting Data Using Web Scraping
  • Lesson 3: Exploring Data
Module 2: Data Wrangling
  • Lesson 1: Assignment Overview
  • Lesson 2: Finding Duplicates
  • Lesson 3: Removing Duplicates
  • Lesson 4: Finding Missing Values
  • Lesson 5: Imputing Missing Values
  • Lesson 6: Normalizing Data
Module 3: Exploratory Data Analysis
  • Lesson 1: Assignment Overview
  • Lesson 2: Analyzing the Data Distribution
  • Lesson 3: Handling Outliers
  • Lesson 4: Correlation
Module 4: Data Visualization
  • Lesson 1: Assignment Overview
  • Lesson 2: Visualizing Distribution of Data
  • Lesson 3: Visualizing Relationship
  • Lesson 4: Visualizing Composition of Data
  • Lesson 5: Visualizing Comparison of Data
Module 5: Building a Dashboard
  • Lesson 1: Assignment Overview
  • Lesson 2: Dashboards
Module 6: Final Assignment: Present Your Findings
  • Lesson 1: How to Present Your Findings
  • Lesson 2: Final Presentation
  • Lesson 3: Course Wrap Up
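
To give a sense of the wrangling steps named in Module 2 (finding and removing duplicates, imputing missing values, normalizing data), here is a minimal sketch. It is not taken from the course materials: the sample rows, the field names, and the helper functions are all hypothetical, and it uses plain Python so that it runs without extra packages, whereas the course itself works with Pandas. Median imputation and min-max scaling are just one reasonable choice for each step.

```python
from statistics import median

# Hypothetical survey rows (illustrative only); None marks a missing value.
rows = [
    {"id": 1, "salary": 50000.0},
    {"id": 2, "salary": None},
    {"id": 2, "salary": None},      # exact duplicate of the previous row
    {"id": 3, "salary": 70000.0},
    {"id": 4, "salary": 90000.0},
]

def remove_duplicates(records):
    """Keep the first occurrence of each identical record."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def impute_median(records, field):
    """Replace missing values in `field` with the median of the known values."""
    known = [r[field] for r in records if r[field] is not None]
    fill = median(known)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]

def min_max_normalize(records, field):
    """Add a `<field>_norm` column scaled to the range [0, 1]."""
    values = [r[field] for r in records]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [{**r, f"{field}_norm": (r[field] - lo) / span} for r in records]

deduplicated = remove_duplicates(rows)
imputed = impute_median(deduplicated, "salary")
cleaned = min_max_normalize(imputed, "salary")
```

With Pandas, the same three steps would typically be expressed with `DataFrame.drop_duplicates`, `fillna`, and a column-wise arithmetic expression.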

Course Prerequisites

This course requires proficiency in data analysis and experience with SQL and relational databases; with data collection, wrangling, analysis, and visualization using Python libraries; and with a BI tool such as IBM Cognos Analytics or Google Looker. We recommend completing all prior courses in the IBM Data Analyst Professional Certificate before starting this project course:
  • Introduction to Data Analytics
  • Excel Basics for Data Analysis
  • Data Visualization and Dashboards with Excel and Cognos
  • Python for Data Science, AI & Development
  • Python Project for Data Science
  • Databases and SQL for Data Science with Python
  • Data Analysis with Python
  • Data Visualization with Python

Instructors

Rav Ahuja

Global Program Director, IBM Skills Network

Rav Ahuja is a Global Program Director at IBM. He leads growth strategy, curriculum creation, and partner programs for the IBM Skills Network. Rav co-founded Cognitive Class, an IBM-led initiative to democratize skills for in-demand technologies. He is based at the IBM Canada Lab in Toronto and specializes in instructional solutions for AI, Data, Software Engineering, and Cloud. Rav presents at events worldwide and has authored numerous papers, articles, books, and courses on managing and analyzing data. Rav holds a B.Eng. from McGill University and an MBA from the University of Western Ontario.

Ramesh Sannareddy

Corporate IT Trainer

Ramesh Sannareddy holds a bachelor's degree in Information Systems from the Birla Institute of Technology, Pilani. He has two and a half decades of experience in Information Technology Infrastructure Management, Database Administration, Information Integration, and Automation. He has worked for companies such as Intergraph, Genpact, HCL, and Microsoft. He is currently a freelancer pursuing his passion for teaching. He teaches Data Science, Machine Learning, Programming, and Databases.

Raghul Ramesh

Subject Matter Expert (SME)

Artificial intelligence, big data, and cloud architect with more than 17 years of experience on projects in the banking, finance, retail, e-commerce, and pharma domains.