Decision Trees in Machine Learning: Two Types (+ Examples)

Written by Coursera Staff • Updated on

Decision trees are a supervised learning algorithm often used in machine learning. Explore what decision trees are and how you might use them in practice.

[Featured image] A machine learning engineer sits in his home office thinking about how to use decision trees in machine learning.

Key takeaways

  • In machine learning, you can use two types of decision trees: classification trees and regression trees.

  • The tree-like structure for which a decision tree is named can help you visualize a machine learning model and adjust its training.

  • Decision trees are effective for decision-making because they lay out your problem and possible solutions, making it easy to analyze potential consequences.

  • You can use decision trees to build classification models, which determine outcomes such as whether an event did or didn’t happen, and regression models, which estimate a numeric value such as how much of something is likely.

Explore what decision trees are, their relevance in machine learning, and common examples to start building your foundation in this field. If you’re ready to develop your machine learning skill set, enroll in the Machine Learning Specialization from Stanford University and DeepLearning.AI, in which you can expand your abilities in areas such as applied machine learning, model training, TensorFlow, and more. In as little as two months, you can earn a career credential and add it to your LinkedIn profile.

What is a decision tree in machine learning? 

In machine learning, a decision tree is a supervised learning algorithm used for classification and regression modeling. Classification assigns data to categories, while regression predicts a numeric value, so these trees are used either to classify data or to estimate a quantity.

Decision trees look like flowcharts. They are named for their shape: they start at a root, like an upside-down tree, and branch off to show various outcomes. Because each prediction can be traced along a path of questions, decision trees help us visualize how a model reaches its outputs and adjust how we train it.

A tree starts at the root node with a specific question about the data, which leads to branches that hold the possible answers. The branches then lead to decision (internal) nodes, which ask more questions, leading to more branches. This continues until the data reaches a terminal (or “leaf”) node, which holds the final outcome.
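The root, decision, and leaf nodes described above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the class names, the `predict` helper, the feature names (`temperature_c`, `rain_mm`), and the thresholds are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """Terminal node: holds the final outcome."""
    prediction: str


@dataclass
class DecisionNode:
    """Root or internal node: asks 'is sample[feature] <= threshold?'."""
    feature: str
    threshold: float
    yes: Union["DecisionNode", Leaf]  # branch followed when the answer is yes
    no: Union["DecisionNode", Leaf]   # branch followed when the answer is no


def predict(node, sample):
    """Walk from the root to a leaf, answering one question per node."""
    while isinstance(node, DecisionNode):
        if sample[node.feature] <= node.threshold:
            node = node.yes
        else:
            node = node.no
    return node.prediction


# Hypothetical tree: the root asks about temperature, an internal node about rain.
tree = DecisionNode(
    feature="temperature_c",
    threshold=15.0,
    yes=Leaf("stay in"),
    no=DecisionNode(
        feature="rain_mm",
        threshold=1.0,
        yes=Leaf("go out"),
        no=Leaf("stay in"),
    ),
)

print(predict(tree, {"temperature_c": 22.0, "rain_mm": 0.0}))
```

In practice, the questions and thresholds are learned from training data rather than written by hand.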

In machine learning, you can choose between four main types of learning: supervised, unsupervised, semi-supervised, and reinforcement learning. A decision tree is a supervised learning algorithm, and its structure helps us visualize how that algorithm leads to specific outcomes.

Learn more: How Does Machine Learning Work?

Introduction to supervised learning

If you want to deepen your knowledge of supervised learning, consider the Supervised Machine Learning: Regression and Classification course from DeepLearning.AI and Stanford University. In as little as three weeks, you’ll get an introduction to modern machine learning, including supervised learning and algorithms such as decision trees, multiple linear regression, neural networks, and logistic regression.

Why a decision tree is used in machine learning

Decision trees in machine learning provide an effective decision-making method because they lay out the problem and all the possible outcomes. This enables developers to analyze the possible consequences of a decision, and as the algorithm is trained on more data, it can better predict outcomes for new data.

In this simple decision tree, the question of whether or not to go to the supermarket to buy toilet paper is analyzed:

[Image] A decision tree describes the process of buying toilet paper.

In machine learning, decision trees offer simplicity and a visual representation of the possibilities when formulating outcomes. Below, we will explain how the two types of decision trees work. 

Types of decision trees in machine learning models 

Decision trees in machine learning can either be classification trees or regression trees. Together, both algorithms fall into a category of “classification and regression trees” and are sometimes called CART. Their respective roles are to “classify” and to “predict.”

1. Classification trees

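Classification trees assign data to categories, such as whether an event happened or didn’t happen. Often, this involves a “yes” or “no” outcome, although a classification tree can also choose among more than two categories.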

We often use this type of decision-making in the real world. Take a look at a few examples to help you contextualize how decision trees work for classification:

Example 1: How to spend your free time after work

What you do after work in your free time can depend on the weather. If it is sunny, you might choose to picnic with a friend, grab a drink with a colleague, or run errands. If it is raining, you might stay home and watch a movie instead. In this scenario, each outcome is classified as either “go out” or “stay in.”

Example 2: Homeownership based on age and income

In a classification tree, the data set splits according to its variables. In this scenario, two variables, age and income, determine whether someone buys a house. If training data tells us that 70 percent of people over age 30 bought a house, then the data gets split there, with age becoming the first node in the tree. This split makes the over-30 group 70 percent “pure,” meaning that 70 percent of its members share the same outcome. The second node then addresses income from there.
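To make the idea of “purity” concrete, here is a minimal sketch that measures it for the age split described above. The over-30 group uses the 70 percent figure from the example; the under-30 counts are invented for illustration. Gini impurity is the measure commonly used by CART-style trees, and the function names here are hypothetical.

```python
from collections import Counter


def purity(labels):
    """Share of the group that belongs to its most common class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def gini(labels):
    """Gini impurity: 0.0 for a perfectly pure group, higher when mixed."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Impurity of a split: each side's Gini weighted by its size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# 70 percent of the over-30 group bought a house (from the example above).
over_30 = ["bought"] * 7 + ["did not buy"] * 3
# Assumed counts for the under-30 group, for illustration only.
under_30 = ["bought"] * 2 + ["did not buy"] * 8

print(purity(over_30))                     # 0.7
print(gini(over_30 + under_30))            # impurity before the split
print(weighted_gini(over_30, under_30))    # impurity after the split
```

A training algorithm would compare candidate splits this way and keep the one that lowers impurity the most.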

Gain hands-on experience with classification trees

If you want to get started on understanding how decision trees work in machine learning, consider registering for these guided projects to apply your skills to real-world projects. You can complete them in two hours:

Scikit-Learn to Solve Regression Machine Learning Problems

Decision Tree Classifier for Beginners in R

2. Regression trees

Regression trees, on the other hand, predict continuous values based on previous data or information sources. For example, they can predict the price of gasoline or how much a customer is likely to spend on eggs at a given store.

This type of decision-making involves programming algorithms to predict what is likely to happen, given previous behavior or trends. 

Example 1: Housing prices in Colorado

Regression analysis could be used to predict the price of a house in Colorado, which is plotted on a graph. The regression model can predict housing prices in the coming years using data points from previous years’ prices. If prices are expected to keep rising at a steady rate, a linear regression fits a single line to that trend. A regression tree instead splits the data into groups of similar observations and predicts the average price within each group. Either way, machine learning helps us predict specific prices based on a series of variables that have been true in the past.
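As a rough sketch of how a regression tree makes a numeric prediction, the code below fits a single split on made-up data. It tries each candidate threshold, keeps the one with the lowest total squared error, and predicts the average of whichever side a new value falls on. The house sizes, prices, and function names are assumptions for illustration, not real Colorado data.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the group's mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) for the lowest-error split."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, mean(left), mean(right))
    return best[1:]


# Hypothetical training data: size in square meters, price in $1,000s.
sizes = [80, 95, 110, 150, 170, 200]
prices = [300, 320, 310, 480, 500, 520]

threshold, left_mean, right_mean = best_split(sizes, prices)


def predict_price(size):
    """Each leaf predicts the average price of its training examples."""
    return left_mean if size <= threshold else right_mean


print(threshold, predict_price(100), predict_price(180))
```

Because each leaf predicts an average of the prices it was trained on, a regression tree never predicts a value outside the range it has already seen. A full tree repeats this split search within each group until a stopping rule is met.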

Example 2: Bachelor’s degree graduates in 2025

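A regression model can help a university predict how many bachelor’s degree graduates there will be in 2025. On a graph, one can plot the number of graduates each year between 2010 and 2022. If the number of graduates increases linearly each year, then regression analysis can be used to build a model that predicts the number of graduates in 2025. Keep in mind that a regression tree predicts averages of values it has already seen, so extending a trend beyond the observed years usually calls for a trend-based model such as linear regression, or for input variables other than the year itself.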

A classification and regression tree (CART) is a predictive algorithm used in machine learning that generates future predictions based on previous values. These decision trees are at the core of machine learning and serve as a basis for other machine learning algorithms, such as random forest, bagged decision trees, and boosted decision trees.
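To illustrate how single trees serve as building blocks for methods such as bagged decision trees, here is a minimal sketch of bagging. It trains several one-split regression trees on random resamples of the data and averages their predictions. The data, function names, and parameters (`n_trees`, `seed`) are hypothetical. A random forest follows the same idea but also samples the input variables, which this sketch leaves out.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree and return a predict function."""
    pairs = sorted(zip(xs, ys))
    overall = mean(ys)
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean, right_mean = mean(left), mean(right)
        cost = (sum((y - left_mean) ** 2 for y in left)
                + sum((y - right_mean) ** 2 for y in right))
        if best is None or cost < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold, left_mean, right_mean)
    if best is None:
        # Every sampled x was identical, so fall back to the overall mean.
        return lambda x: overall
    _, threshold, low, high = best
    return lambda x: low if x <= threshold else high


def fit_bagged(xs, ys, n_trees=25, seed=0):
    """Train stumps on bootstrap resamples and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean([stump(x) for stump in stumps])


# Hypothetical data: size in square meters, price in $1,000s.
sizes = [80, 95, 110, 150, 170, 200]
prices = [300, 320, 310, 480, 500, 520]

model = fit_bagged(sizes, prices)
print(model(100), model(180))
```

Averaging many trees trained on slightly different samples tends to give steadier predictions than any single tree.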

Gain hands-on experience with regression trees

To get started on how decision tree algorithms work in predictive machine learning models, take a look at these courses. Each one takes around three weeks and is based on real-world examples, so you can elevate your skills:

Supervised Machine Learning: Regression and Classification

Regression Analysis: Simplify Complex Data Relationships

Machine learning decision tree terminology

These terms come up frequently in machine learning and are helpful to know as you embark on your machine learning journey:

  • Root node: The topmost node of a decision tree, which represents the entire data set or decision before any split

  • Decision (or internal) node: A node within a decision tree that asks a question and branches into two or more child nodes

  • Leaf (or terminal) node: A node with no children, also called an external node. It’s the last node on a path through the decision tree and holds the final outcome

  • Splitting: The process of dividing a node into two or more nodes. It’s the point at which the decision branches off based on the value of a variable

  • Pruning: The opposite of splitting, the process of going through and reducing the tree to only the most important nodes or outcomes
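As a simplified illustration of pruning, the sketch below collapses any split whose two branches lead to the same outcome, since that question adds nothing to the decision. It represents a tree as nested dictionaries with hypothetical keys (`question`, `yes`, `no`). Real pruning methods, such as cost-complexity pruning, instead weigh prediction error against tree size, so treat this only as a picture of the idea.

```python
def prune(node):
    """Return a copy of the tree with redundant splits collapsed into leaves."""
    if not isinstance(node, dict):
        return node  # a leaf is stored as a plain label
    yes = prune(node["yes"])
    no = prune(node["no"])
    # If both branches are leaves with the same label, the split is redundant.
    if not isinstance(yes, dict) and yes == no:
        return yes
    return {"question": node["question"], "yes": yes, "no": no}


def count_leaves(node):
    """Count the terminal nodes in the tree."""
    if not isinstance(node, dict):
        return 1
    return count_leaves(node["yes"]) + count_leaves(node["no"])


# Hypothetical tree: the "windy?" split leads to the same outcome either way.
tree = {
    "question": "raining?",
    "yes": "stay in",
    "no": {"question": "windy?", "yes": "go out", "no": "go out"},
}

pruned = prune(tree)
print(count_leaves(tree), count_leaves(pruned))  # 3 2
```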

Explore our free machine learning resources

Join Career Chat on LinkedIn to get timely updates on popular skills, tools, and certifications in machine learning.

Accelerate your career growth with a Coursera Plus subscription. When you enroll in either the monthly or annual option, you’ll get access to over 10,000 courses. You can learn and earn credentials at your own pace from over 350 leading companies and universities.

This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.