
Python for Data Analysis

Premium
Intermediate course

Get started with Python and build essential skills for data analysis in just 5 weeks—no prior programming experience is required.

Language

  • English

Topic

  • Data Analysis

Industries

  • Data Analysis

Skills You Will Learn

  • Pandas Python Package, Data Analysis, Machine Learning Libraries, Python Programming Language, Machine Learning

Offered By

  • IBMSkillsNetwork

Estimated Effort

  • 6 Weeks

Platform

  • SkillsNetwork

Last Update

  • October 16, 2024
About this course
Python is the most popular programming language used in data science (Statista). So, if you’re keen to kickstart your career in data analytics, get ready to dive into the world of data with Python! This course is your ticket to mastering data analysis and building data models using Python. It’s the perfect introduction to this in-demand language for aspiring data scientists and analysts.  
  
During the course, which is also part of the IBM Data Analyst Professional Certificate, you’ll master importing data from various sources and learn how to clean and prepare it for exploratory data analysis (EDA) and eye-catching visualizations. You'll also predict future trends by creating linear, multiple, and polynomial regression models and pipelines and understand how to evaluate them like a pro.  
  
As you progress through the modules, you’ll learn to:  
  • Collect and import data  
  • Clean, prep, and format data  
  • Manipulate data frames  
  • Summarize data  
  • Build machine learning regression models  
  • Refine your models  
  • Create data pipelines.  
You’ll also build your practical understanding of Python through hands-on labs where you'll explore open-source Python libraries like Pandas and NumPy for data manipulation and visualization. Plus, you'll gain proficiency in using SciPy and scikit-learn to build machine-learning models and make predictions.  
  
Enroll today and build job-ready skills in one of the world’s most popular programming languages in just 5 weeks! 

Course Syllabus

Module 1 - Importing Datasets 
  • Learning Objectives 
  • Understanding the Domain 
  • Understanding the Dataset 
  • Python Packages for Data Science 
  • Importing and Exporting Data in Python 
  • Basic Insights from Datasets 
Module 2 - Cleaning and Preparing the Data 
  • Identify and Handle Missing Values 
  • Data Formatting 
  • Data Normalization Sets 
  • Binning 
  • Indicator variables 
Module 3 - Summarizing the Data Frame 
  • Descriptive Statistics 
  • Basics of Grouping 
  • ANOVA 
  • Correlation 
  • More on Correlation 
Module 4 - Model Development 
  • Simple and Multiple Linear Regression 
  • Model Evaluation Using Visualization 
  • Polynomial Regression and Pipelines 
  • R-squared and MSE for In-Sample Evaluation 
  • Prediction and Decision Making 
Module 5 - Model Evaluation 
  • Model Evaluation 
  • Over-fitting, Under-fitting and Model Selection 
  • Ridge Regression 
  • Grid Search 
  • Model Refinement

Learning Objectives

  • Import, clean, and prepare data for analysis. 
  • Use Pandas, DataFrames, NumPy, and SciPy. 
  • Load, manipulate, analyze, and visualize data. 
  • Build machine learning models. 

Recommended Skills Prior to Taking this Course

This is a beginner-friendly introduction to data analysis, so no prior programming experience is necessary. However, basic familiarity with using a computer, navigating files and folders, and working with common software applications is recommended.

Instructors

Joseph Santarcangelo

Senior Data Scientist at IBM

Joseph has a Ph.D. in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. He has worked for IBM since completing his Ph.D.
