Generative AI Engineering with Fine Tuning Transformers
This course provides an overview of how to use transformer-based models for natural language processing (NLP). You will learn to apply transformer-based models to text classification, focusing on the encoder component. You’ll learn about positional encoding, word embeddings, and attention mechanisms in language transformers, and their role in capturing contextual information and dependencies. Additionally, you will be introduced to multi-head attention and gain insights into decoder-based language modeling with generative pre-trained transformers (GPT).
4.5 (97 Reviews)

Language
- English
Topic
- Artificial Intelligence
Industries
- Artificial Intelligence
Enrollment Count
- 17.09K
Skills You Will Learn
- Fine-tuning LLMs, LoRA and QLoRA, PyTorch, Hugging Face, Pre-training Transformers
Offered By
- IBM Skills Network
Estimated Effort
- 3 weeks, 2 hrs
Platform
- Coursera
Last Update
- December 6, 2025
- Apply positional encoding and attention mechanisms in transformer-based architectures to process sequential data.
- Use transformers for text classification.
- Use and implement decoder-based models, such as GPT, and encoder-based models, such as BERT, for language modeling.
- Implement a transformer model to translate text from one language to another.
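The first objective above can be illustrated with a short sketch. This is not course material — just a minimal PyTorch illustration of the two building blocks named there: sinusoidal positional encoding and scaled dot-product attention. The function names and toy dimensions are illustrative choices, not part of the course.

```python
import math
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Sinusoidal positional encoding: sin on even dims, cos on odd dims."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)  # (seq_len, 1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-math.log(10000.0) / d_model)
    )  # (d_model / 2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # similarity of each query to each key
    weights = torch.softmax(scores, dim=-1)            # rows sum to 1
    return weights @ v

# Toy usage: 10 token embeddings of width 16, with positions added, then self-attended.
x = torch.randn(10, 16) + sinusoidal_positional_encoding(10, 16)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # torch.Size([10, 16])
```

Adding the positional encoding to the embeddings is what lets the otherwise order-blind attention operation distinguish token positions.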

Instructors
Joseph Santarcangelo
Senior Data Scientist at IBM
Joseph has a Ph.D. in Electrical Engineering; his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. He has worked for IBM since completing his Ph.D.
Fateme Akbari
Data Scientist @IBM
I'm a data-driven Ph.D. Candidate at McMaster University and a data scientist at IBM, specializing in machine learning (ML) and natural language processing (NLP). My research focuses on the application of ML in healthcare, and I have a strong record of publications that reflect my commitment to advancing this field. I thrive on tackling complex challenges and developing innovative, ML-based solutions that can make a meaningful impact—not only for humans but for all living beings. Outside of my research, I enjoy exploring nature through trekking and biking, and I love catching ball games.
Kang Wang
Data Scientist
I was a Data Scientist at IBM. I also hold a Ph.D. from the University of Waterloo.
Ashutosh Sagar
Data Scientist
I am currently a Data Scientist at IBM with a Master’s degree in Computer Science from Dalhousie University. I specialize in natural language processing, particularly in semantic similarity search, and have a strong background in working with advanced AI models and technologies.
Contributors
Wojciech "Victor" Fulmyk
Data Scientist at IBM
Wojciech "Victor" Fulmyk is a Data Scientist and AI Engineer on IBM’s Skills Network team, where he focuses on helping learners build expertise in data science, artificial intelligence, and machine learning. He is also a Kaggle competition expert, currently ranked in the top 3% globally among competition participants. An economist by training, he applies his knowledge of statistics and econometrics to bring a distinctive perspective to AI and ML—one that considers both technical depth and broader socioeconomic implications.