Mastering Generative AI: LLM Architecture & Data Preparation

Intermediate Course

Build in-demand, job-ready generative AI architecture and data science skills in less than a month. No programming experience is required.

Language

  • English

Topic

  • Artificial Intelligence

Skills You Will Learn

  • AI, Hugging Face, NLP, Large Language Models, PyTorch

Offered By

  • IBM Skills Network

Estimated Effort

  • 2 Weeks 4 hrs

Platform

  • edX

Last Update

  • March 28, 2025
About this Course
This course teaches you the skills you need to get your first job as a generative AI architect or data scientist.  
You will learn: 
  • Job-ready generative AI architecture and data science skills in less than a month, plus practical experience and an industry-recognized credential that employers value.  
  • Differences between generative AI architectures and models, such as RNNs, transformers, VAEs, GANs, and diffusion models.
  • Use of LLMs, such as GPT, BERT, BART, and T5 in language processing. 
  • Implementation of tokenization to preprocess raw textual data using NLP libraries and tokenizers such as NLTK, spaCy, BertTokenizer, and XLNetTokenizer.
  • Creation of an NLP data loader using PyTorch to perform tokenization, numericalization, and padding of text data. 
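
As a rough illustration of the last item, the three steps (tokenization, numericalization, and padding) can be sketched in plain Python. This is a minimal sketch and not the course's lab code: the helper names (`tokenize`, `build_vocab`, `numericalize`, `pad_batch`), the special tokens `<pad>` and `<unk>`, and the sample sentences are all hypothetical choices made for this example.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special-token names


def tokenize(text):
    """Word-level tokenization: lowercase, then split on whitespace."""
    return text.lower().split()


def build_vocab(texts):
    """Map each distinct token to an integer id; ids 0 and 1 are reserved."""
    counts = Counter(tok for t in texts for tok in tokenize(t))
    vocab = {PAD: 0, UNK: 1}
    for tok in sorted(counts):  # sorted so the id assignment is deterministic
        vocab[tok] = len(vocab)
    return vocab


def numericalize(text, vocab):
    """Turn a sentence into a list of ids, using UNK for unseen tokens."""
    return [vocab.get(tok, vocab[UNK]) for tok in tokenize(text)]


def pad_batch(sequences, pad_id=0):
    """Right-pad every sequence to the length of the longest one."""
    max_len = max(len(s) for s in sequences)
    return [s + [pad_id] * (max_len - len(s)) for s in sequences]


texts = ["The cat sat", "A dog barked loudly today"]
vocab = build_vocab(texts)
batch = pad_batch([numericalize(t, vocab) for t in texts])
print(batch)
```

In a PyTorch pipeline, padding logic like `pad_batch` would typically live in a `collate_fn` passed to `torch.utils.data.DataLoader`, so that each batch is padded only to the length of its own longest sequence.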
Course Overview 
The demand for generative AI is expected to grow at 46.47% annually, resulting in a market volume of US$356bn by 2030 (Source: Statista, Feb 2024). The expansion of generative AI across industries emphasizes its potential for aspiring data scientists, machine learning engineers, and AI developers.  
 
This course focuses on the large language model (LLM), natural language processing (NLP), and general AI skills that organizations look for.
 
In this course, you will learn about the types of generative AI and its real-world applications. You will gain knowledge of various generative AI architectures and models, such as Recurrent Neural Networks (RNNs), Transformers, Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Diffusion Models. You will learn the differences in the training approaches used for each model. You will be able to explain the use of LLMs, such as Generative Pre-Trained Transformers (GPT) and Bidirectional Encoder Representations from Transformers (BERT).
 
You will also learn about the tokenization process and its methods, and about tokenizers for word-based, character-based, and subword-based tokenization. You will learn to use data loaders for training generative AI models, list the PyTorch libraries for preparing and handling data within data loaders, and use the generative AI libraries in Hugging Face. The course will also prepare you to implement tokenization and create an NLP data loader.
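
To make the three tokenization granularities concrete, here is a small library-free sketch. It is an illustration only: the function names and `toy_vocab` are hypothetical, and the subword splitter uses a simplified greedy longest-match rule with a `##` continuation prefix in the style of WordPiece, not the exact algorithm of any specific tokenizer covered in the course.

```python
def word_tokenize(text):
    """Word-based: one token per whitespace-separated word."""
    return text.lower().split()


def char_tokenize(text):
    """Character-based: one token per non-space character."""
    return list(text.lower().replace(" ", ""))


def subword_tokenize(word, vocab):
    """Subword-based: greedy longest-match-first split of a single word.

    Pieces that do not start the word carry a "##" prefix.
    Returns ["[UNK]"] if some part of the word cannot be matched.
    """
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                pieces.append(piece)
                break
            end -= 1
        else:
            # No prefix of the remaining text is in the vocabulary.
            return ["[UNK]"]
        start = end
    return pieces


# Hypothetical toy vocabulary, chosen only for this example.
toy_vocab = {"token", "##ization", "##s"}

print(word_tokenize("Tokenization matters"))
print(char_tokenize("Hi there"))
print(subword_tokenize("tokenization", toy_vocab))
```

The trade-off the sketch hints at: word-based tokens keep meaning intact but need a large vocabulary, character-based tokens need a tiny vocabulary but produce long sequences, and subword tokens sit in between by reusing pieces such as `token` across related words.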
 
Throughout this short self-paced course, you will be presented with instructional guidance through videos followed by hands-on labs to practice what you learn. You will also complete a final project to showcase your generative AI architecture or data preparation skills.   
  
If you want to build a career in generative AI architecture or data science, this IBM Generative AI Architecture and Data Preparation for LLMs course will get you job-ready and give you the skills you need for rewarding career opportunities.
 

Course Syllabus

Module 1: Generative AI Architecture 
  • Video: Overview of AI Engineering with LLMs Specialization 
  • Video: Course Introduction 
  • Reading: Course Overview 
  • Reading: Helpful Tips for Course Completion 
  • Video: Significance of Generative AI 
  • Video: Generative AI Architectures and Models 
  • Video: Generative AI for NLP 
  • Reading: Basics of AI Hallucinations 
  • Reading: Overview of Libraries and Tools 
  • Lab: Exploring Generative AI Libraries 
  • Reading: Summary and Highlights 
  • Practice Quiz: Generative AI Overview and Architecture 
  • Graded Quiz: Generative AI Architecture 
 
Module 2: Data Preparation for LLMs 
  • Video: Tokenization 
  • Lab: Implementing Tokenization 
  • Video: Overview of Data Loaders 
  • Lab: Creating an NLP Data Loader 
  • Reading: Summary and Highlights 
  • Practice Quiz: Preparing Data 
  • Graded Quiz: Data Preparation for LLMs 
  • Cheat Sheet: Generative AI and LLMs: Architecture and Data Preparation 
  • Course Glossary: Generative AI and LLMs: Architecture and Data Preparation 

Recommended Skills Prior to Taking this Course

Basic knowledge of Python and PyTorch and an awareness of machine learning and neural networks are an advantage for this course, though not strictly required.

Instructors

IBM Skills Network Team

Administrator

IBM Skills Network
