
Predicting Taxi Tip using Scikit-Learn and Snap ML

Beginner Guided Project

Say you are driving a taxi, Uber, or Lyft and want to figure out how much of a tip you would get for a ride. In this project, you will create an AI model to do just that. You will use Scikit-learn and Snap ML, a high-performance IBM library for Machine Learning, to build Decision Tree Regressor models.

4.5 (20 Reviews)

Language

  • English

Topic

  • Data Science

Enrollment Count

  • 96

Skills You Will Learn

  • Machine Learning, Python, Snap ML

Offered By

  • IBM

Estimated Effort

  • 30 minutes

Platform

  • SkillsNetwork

Last Update

  • April 29, 2024
About This Guided Project

Snap ML is a library for accelerated training and inference of Machine Learning models such as linear models, decision trees, random forests, and boosting machines. It is developed and maintained by IBM Research, and the library binaries are freely available on PyPI. It supports Linux/x86, Linux/Power, Linux/Z, macOS, and Windows. GPU support is also available for Linux. If you are curious, detailed documentation and usage examples are available online.

Snap ML provides highly-efficient CPU/GPU implementations of linear models and tree-based models. It not only accelerates Machine Learning algorithms through system awareness but also offers novel Machine Learning algorithms with best-in-class accuracy. In this guided project, we will focus on training acceleration in particular. You will consolidate your Machine Learning modeling skills by using a popular regression model: Decision Tree. You will use a real-world dataset to train such a model. 

You will find that a Scikit-learn application can be accelerated with Snap ML with minimal code changes. This is possible because Snap ML is compatible with the Scikit-learn Python API.
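As a rough illustration of that API compatibility, the sketch below trains the same kind of model with both libraries, changing only the import. The synthetic data, the `max_depth` value, and the random seeds are assumptions chosen for illustration, not the settings used in the guided project. The Snap ML import is wrapped in a `try` block so the example still runs when Snap ML is not installed.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor as SklearnDTR

try:
    # Snap ML is optional in this sketch; it can be installed from PyPI.
    from snapml import DecisionTreeRegressor as SnapDTR
except ImportError:
    SnapDTR = None

# Synthetic data (an assumption for illustration, not the project's dataset).
rng = np.random.default_rng(0)
X = rng.random((1000, 5)).astype(np.float32)
y = (3.0 * X[:, 0] + X[:, 1]).astype(np.float32)

# Both estimators expose the same fit/predict interface.
models = {"sklearn": SklearnDTR(max_depth=8, random_state=35)}
if SnapDTR is not None:
    models["snapml"] = SnapDTR(max_depth=8, random_state=45)

predictions = {}
for name, model in models.items():
    model.fit(X, y)
    predictions[name] = model.predict(X)
    print(name, predictions[name][:3])
```

Because the two classes share the same method names, the training loop does not need to know which library produced the model.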

A Look at the Project Ahead

After completing this guided project you will be able to:

  • Perform basic data preprocessing using Scikit-learn.
  • Model a regression task using Scikit-learn and Snap ML Python APIs.
  • Train a Decision Tree Regressor model using Scikit-learn and Snap ML.
  • Run inference and assess the quality of the trained models.
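The four steps above can be sketched end to end with Scikit-learn alone. This is a minimal sketch under stated assumptions: the ride features, the tip formula, and the hyperparameters are all hypothetical stand-ins for the real-world dataset used in the project, and standardization is just one possible preprocessing choice.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
n_rides = 2000

# Hypothetical features: trip distance, fare amount, passenger count.
distance = rng.uniform(0.5, 20.0, n_rides)
fare = 2.5 + 2.0 * distance + rng.normal(0.0, 1.0, n_rides)
passengers = rng.integers(1, 5, n_rides)
X = np.column_stack([distance, fare, passengers])

# Hypothetical target: a tip of roughly 15% of the fare, plus noise.
y = 0.15 * fare + rng.normal(0.0, 0.5, n_rides)

# Step 1: basic preprocessing (hold out a test set, then standardize).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
scaler = StandardScaler().fit(X_train)  # fit on training data only
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# Steps 2 and 3: model the regression task and train the tree.
model = DecisionTreeRegressor(max_depth=8, random_state=35)
model.fit(X_train, y_train)

# Step 4: run inference and assess quality with mean squared error.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)

# Baseline for comparison: always predict the mean training tip.
baseline_mse = mean_squared_error(y_test, np.full_like(y_test, y_train.mean()))
print(f"Decision tree MSE: {mse:.3f}  Baseline MSE: {baseline_mse:.3f}")
```

Comparing against a constant-prediction baseline gives a quick sanity check that the tree has learned something from the features.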

What You'll Need

To complete this guided project, you will need a basic understanding of how Decision Tree models work. Some prior experience with Scikit-learn APIs will also help you follow the data preprocessing steps easily.

This guided project mainly uses Python and JupyterLab. Although these skills are recommended prerequisites, no prior experience is required, as this guided project is designed for complete beginners.

Frequently Asked Questions


Do I need to install any software to participate in this project?
Everything you need to complete this project will be provided to you via the Skills Network Labs and it will all be available via a standard web browser.

What web browser should I use?
The Skills Network Labs platform works best with current versions of Chrome, Edge, Firefox, Internet Explorer, or Safari.

Instructors

Roxanne Li

Data Scientist at IBM

I am an aspiring Data Scientist at IBM with extensive theoretical/academic, research, and work experience in different areas of Machine Learning, including Classification, Clustering, Computer Vision, NLP, and Generative AI. I've exploited Machine Learning to build data products for the P&C insurance industry in the past. I also recently became an instructor of the Unsupervised Machine Learning course by IBM on Coursera!


Andreea Anghel

Staff Research Scientist

Researcher in machine learning


Contributors

Joseph Santarcangelo

Senior Data Scientist at IBM

Joseph has a Ph.D. in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his Ph.D.
