Building a Machine Learning Pipeline For NLP
Natural language processing (NLP) is a branch of artificial intelligence concerned with understanding written text. Sentiment analysis is an important NLP task that identifies the emotional tone behind a body of text; it is commonly applied to customer reviews, survey responses, and online and social media content. In this project, you will classify the sentiment of movie reviews as positive, negative, or neutral, first with a rule-based method and then with machine learning. You will use pandas to load and analyze the data, scikit-learn (sklearn) to process and classify the text, and other libraries such as NLTK.
4.4 (113 Reviews)

Language
- English
Topic
- Artificial Intelligence
Industries
- Financial Services, Healthcare, Government
Enrollment Count
- 692
Skills You Will Learn
- Natural Language Processing, Machine Learning, Python, NLP
Offered By
- IBM
Platform
- SkillsNetwork
Last Update
- May 12, 2025
Why you should do this Guided Project
A Look at the Project Ahead
- Understand sentiment analysis
- Apply pandas to load, analyze, and process your data
- Understand text preprocessing
- Understand the connection between rule-based methods and machine-learning-based methods
- Understand and apply Bag-of-Words and Term Frequency–Inverse Document Frequency (TF-IDF) to sentiment analysis
- Apply hyperparameter tuning using scikit-learn to NLP
- Apply a machine learning pipeline using scikit-learn to NLP
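To give a feel for how the last few objectives fit together, here is a minimal sketch of a scikit-learn pipeline that chains TF-IDF features with a classifier and tunes both with a grid search. It is an illustration under assumptions, not the project's actual notebook: the toy reviews, the two-class labels (the project itself also covers a neutral class), the choice of LogisticRegression, and the parameter grid are all made up for this example.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy data invented for illustration; the project uses real movie reviews.
reviews = pd.DataFrame({
    "text": [
        "A wonderful, moving film with great acting",
        "I loved every minute of this movie",
        "Brilliant story and a great cast",
        "An excellent and enjoyable film",
        "Great fun, I loved the ending",
        "A wonderful cast and an excellent script",
        "A boring, terrible waste of time",
        "I hated the awful plot",
        "Terrible acting and a boring story",
        "An awful film, truly bad",
        "Bad script, I hated the ending",
        "Boring and bad from start to finish",
    ],
    "label": ["positive"] * 6 + ["negative"] * 6,
})

# The pipeline applies the vectorizer and then the classifier, so the same
# preprocessing is used at training time and at prediction time.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True)),
    ("clf", LogisticRegression(max_iter=1000)),  # classifier choice is an assumption
])

# Hyperparameters are addressed as "<step name>__<parameter name>".
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],  # unigrams vs. unigrams + bigrams
    "clf__C": [0.1, 1.0, 10.0],              # inverse regularization strength
}

# 3-fold cross-validated search over every combination in the grid.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(reviews["text"], reviews["label"])

best_params = search.best_params_
predictions = search.predict([
    "a great and wonderful movie",
    "a boring and awful movie",
])
print(best_params)
print(list(predictions))
```

Because the vectorizer lives inside the pipeline, each cross-validation fold fits TF-IDF only on its own training split, which avoids leaking vocabulary statistics from the held-out reviews into the model.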
What You'll Need

Instructors
Joseph Santarcangelo
Senior Data Scientist at IBM
Joseph has a Ph.D. in Electrical Engineering; his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his Ph.D.
Contributors
Monireh Ebrahimi
Sr. Cognitive Software Developer
Monireh Ebrahimi is a Senior Cognitive Software Developer at IBM’s Center for Open-Source Data and AI Technologies (CODAIT) in San Francisco, where she works on open source, data, and AI technologies. She received an “Outstanding Technical Award” in 2021 for her contributions to Text Extensions for Pandas. She obtained her Ph.D. from the Data Semantics (DaSe) lab at Kansas State University, with a major focus on Neuro-Symbolic Integration. Her primary research interests include Deep Learning, Knowledge Graphs, Reasoning, Semantic Web, and Natural Language Processing. She is also interested in applying NLP and Data Science to real-world applications, and in working with customers and partners across industries to help them in their Data Science journey. Monireh has over 15 peer-reviewed publications and three patents. She has served as a PC member or reviewer for Artificial Intelligence venues (NeurIPS, AAAI, IJCAI, ICML, ICLR, JAIR) and Semantic Web conferences (ISWC, ESWC, TheWebCon), and received the Most Outstanding Reviewer Award from WWW 2017.