Data Engineering Project with Python

Premium
Intermediate course

Practice crucial data engineering skills by implementing an ETL pipeline with Python. Extract data from different sources, transform it for analysis, and load it into a database.

Language

  • English

Topic

  • Python

Skills You Will Learn

  • Python, Web Scraping, Data Engineering, Application Programming Interface (API), ETL, RDBMS

Offered By

  • IBMSkillsNetwork

Estimated Effort

  • 9 hours

Platform

  • SkillsNetwork

Last Update

  • October 25, 2024

About this course

In Data Engineering Project with Python, you'll gain hands-on experience with critical data engineering tasks. In this project, you will implement an Extract, Transform, Load (ETL) data pipeline as a Python script.
 
You will learn to work with an IDE similar to VS Code to create your Python program and execute your code in a Linux terminal window.  
 
As part of the project, you will collect data through APIs and web scraping, extract data from diverse file formats, transform it for further analysis, and load it into a relational database. You'll also log data operations to ensure transparency and reproducibility.
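
The extract, transform, load, and logging steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's own solution: the inline CSV and JSON strings, the `people` table, and the function names are all hypothetical stand-ins for real source files and a real database.

```python
import csv
import io
import json
import logging
import sqlite3

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("etl")

# Hypothetical inline sources standing in for real CSV and JSON files.
CSV_SOURCE = "name,height_in\nalex,65.0\nbailey,70.5\n"
JSON_SOURCE = '[{"name": "casey", "height_in": 68.0}]'


def extract():
    """Read records from two different formats into one list of dicts."""
    rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    rows += json.loads(JSON_SOURCE)
    log.info("extracted %d rows", len(rows))
    return rows


def transform(rows):
    """Title-case names and convert inches to meters (2 decimals)."""
    out = [(r["name"].title(), round(float(r["height_in"]) * 0.0254, 2))
           for r in rows]
    log.info("transformed %d rows", len(out))
    return out


def load(rows, conn):
    """Insert the transformed rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS people (name TEXT, height_m REAL)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
    conn.commit()
    log.info("loaded %d rows", len(rows))


def run_pipeline(conn):
    """Run extract -> transform -> load, then read the table back."""
    load(transform(extract()), conn)
    return conn.execute(
        "SELECT name, height_m FROM people ORDER BY name").fetchall()


if __name__ == "__main__":
    print(run_pipeline(sqlite3.connect(":memory:")))
```

Each stage logs a row count, so a run leaves a record of what was processed at every step. Swapping the in-memory database for a file path would persist the result between runs.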
 
Enhance your collaborative skills by sharing your project and participating in peer reviews to give and receive constructive feedback. This course will equip you with the practical knowledge and skills to handle real-world data engineering challenges efficiently.

What you will learn: 

After completing this course, you will be able to: 
  • Use Python and an IDE to perform ETL tasks.
  • Use APIs and web scraping to extract data.
  • Transform data for analysis.
  • Log operations and prepare data for loading.
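
As an illustration of the web scraping outcome listed above, here is a minimal sketch that pulls a table out of HTML using only Python's standard-library `html.parser`. The course itself may use different tools; the `SAMPLE_HTML` fragment, the `TableParser` class, and the place names and figures in it are hypothetical, and a real scraper would first fetch the page over HTTP.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; a real run would download the HTML first.
SAMPLE_HTML = """
<table>
  <tr><th>Country</th><th>GDP</th></tr>
  <tr><td>Exampleland</td><td>1,200</td></tr>
  <tr><td>Samplestan</td><td>850</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row being read, if any
        self._in_cell = False  # True while inside <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table body as a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


if __name__ == "__main__":
    for record in scrape_table(SAMPLE_HTML):
        print(record)
```

The values come back as strings, so a transform step would still need to strip the thousands separators and convert them to numbers before analysis.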

Course Syllabus

Module 1 - Extract, Transform, Load (ETL) 
  • Getting Started with IDE 
  • Extract, Transform, Load (ETL) 
  • Web scraping 
  • REST APIs and HTTP Requests 
  • Web scraping and Extracting Data using APIs 
  • Querying SQLite3 database 
Module 2 - Final Project
  • Practice Project: Extract, Transform, and Load GDP Data 
  • Final Project: Acquiring and Processing Information on World's Largest Banks 

General Information

  • This course is self-paced. 
  • This platform works best with current versions of Chrome, Edge, Firefox, Internet Explorer, or Safari. 

Recommended Skills Prior to Taking this Course

  • Working knowledge of the Python programming language. It is highly recommended that you complete the Python Fundamentals course from IBM before starting this project.

Instructors

Joseph Santarcangelo

Senior Data Scientist at IBM

Joseph has a Ph.D. in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his Ph.D.
