What Is Bayesian Statistics?

Written by Coursera Staff

Learn the fundamentals of Bayesian statistics and how professionals across industries are utilizing this method. Plus, take your first steps into this field by reviewing a real-world example of Bayes’ theorem in use.


Bayesian statistics is a branch of statistics based on Bayes’ theorem, which provides a framework to update probabilities and predictions as new evidence or additional data becomes available. You can use Bayesian statistics across many exciting fields, including health care policy, machine learning, finance, and marketing. Explore the fundamentals of Bayesian statistics, how they differ from frequentist statistics, and the history, types, common uses, and advantages and disadvantages of using them.

Understanding statistics 

When analyzing data, you can take two main statistical approaches: descriptive and inferential. In descriptive statistics, you describe your data set without making any conclusions. For example, you might say your sample comprises “40 percent men and 60 percent women.” This gives your audience an idea about your sample but doesn’t draw any inferences from it.

In inferential statistics, you perform statistical tests to test your hypotheses. If you were looking at coffee consumption in your sample, you might find that “women are 25 percent more likely to drink coffee in the morning than men.” This gives you actionable data that you can make decisions with. If you were running a coffee company, you might use this information to target your marketing campaign more toward women. 

Bayesian vs frequentist statistics

When employing inferential statistics, you can choose between two approaches: Bayesian statistics and frequentist statistics. Frequentist statistics focus on the data at hand, centering on the belief that your conclusions should derive from the information you have. On the other hand, Bayesian statistics allow for the incorporation of prior knowledge and prior probabilities into your analysis to create an updated conclusion.

Frequentist statistics

Frequentist statistics define probability as the long-run frequency of an event over many repeated trials. Frequentist methods do not incorporate prior knowledge; instead, they ask how likely the observed data would be if a given hypothesis were true. In this approach, you treat parameters (e.g., the population mean or other whole-population measures) as fixed but unknown, and you draw conclusions using methods like probability values (p-values) and confidence intervals. Frequentist inference is typically considered more objective, but it does not account for prior knowledge or express uncertainty about parameter estimates as probability distributions. It’s worth noting that frequentist statistics are the default approach for many current analytical methods and statistical software.
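To make the frequentist approach concrete, here is a minimal Python sketch (with made-up survey data) that computes a 95 percent confidence interval for a sample mean using the normal approximation:

```python
import math
import statistics

# Illustrative sample: daily cups of coffee for 10 survey respondents
sample = [2, 3, 1, 4, 2, 3, 2, 5, 3, 2]

n = len(sample)
mean = statistics.mean(sample)   # point estimate of the population mean
sd = statistics.stdev(sample)    # sample standard deviation (n - 1 denominator)
se = sd / math.sqrt(n)           # standard error of the mean

# 95% confidence interval using the normal approximation (z = 1.96);
# the parameter is treated as fixed, and the interval varies with the data
lower, upper = mean - 1.96 * se, mean + 1.96 * se
print(f"mean = {mean:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Notice that no prior belief enters this calculation: the interval depends only on the observed sample.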

Bayesian statistics

Bayesian statistics, on the other hand, consider your prior knowledge and allow you to update the parameters based on this new information. Prior probability is the previously defined likelihood of an event based on what you believed before new information was available. By integrating new data with prior knowledge, you obtain a posterior probability, the revised probability of a certain event occurring. This approach allows you to create a dynamic updating process as new data becomes available, incorporating both subjective knowledge and objective evidence.
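As a concrete sketch of this prior-to-posterior cycle, the following Python example (with made-up numbers) uses a Beta-Binomial model, a standard setup in which the posterior has the same form as the prior:

```python
# Beta-Binomial updating: a conjugate-prior sketch of the Bayesian cycle.
# Suppose a marketer believes an email campaign converts around 20% of
# recipients; a Beta(2, 8) prior encodes that belief (mean = 2 / (2 + 8)).
prior_alpha, prior_beta = 2, 8

# New evidence arrives: 30 conversions out of 100 emails sent.
conversions, non_conversions = 30, 70

# With a Beta prior and binomial data, the posterior is again a Beta
# distribution: simply add successes and failures to the prior counts.
post_alpha = prior_alpha + conversions
post_beta = prior_beta + non_conversions

prior_mean = prior_alpha / (prior_alpha + prior_beta)
post_mean = post_alpha / (post_alpha + post_beta)
print(f"prior mean = {prior_mean:.3f}, posterior mean = {post_mean:.3f}")
```

The posterior mean lands between the prior belief (0.2) and the observed rate (0.3), and the posterior can serve as the prior for the next batch of data.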

History of Bayesian statistics

Bayesian statistics are named after Thomas Bayes, an eighteenth-century mathematician who developed Bayes’ theorem to calculate conditional probabilities (probabilities based on previous outcomes). Bayes’ findings appeared in “An Essay Towards Solving a Problem in the Doctrine of Chances,” published posthumously in Philosophical Transactions of the Royal Society in 1763 [1].

Pierre-Simon Laplace continued this work in the late 1700s and early 1800s, developing Bayes’ ideas and applying them to practical problems [2]. Bayesian statistics have continued to play an important role in mathematics and computation over the last few decades, especially with the development of Markov Chain Monte Carlo (MCMC) models. These models generate simulated probability distributions, which are important in predicting probable outcomes, often validated by machine learning methods.

Types of Bayesian statistics

Bayesian statistics include a range of approaches that rely on new data to aid in updating current probabilities. Bayes’ theorem, a conditional probability equation, is the foundation for each approach. Some types of Bayesian statistics include:

  • Bayesian inference: Updating prior beliefs with new evidence to form a posterior distribution, which is an updated view of how likely each outcome is

  • Bayesian network: Graphical representation of variables and how they relate to each other in terms of probabilities using prior and current domain knowledge

  • Bayesian decision theory: Decision-making based on prior knowledge and current probabilities to optimize the use of all knowledge you have relating to a certain problem
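As a concrete illustration of a Bayesian network, here is a minimal Python sketch of the classic rain/sprinkler/wet-grass example, with illustrative probabilities, that answers a query by brute-force enumeration:

```python
# A tiny Bayesian network (Rain -> WetGrass <- Sprinkler), with illustrative
# probabilities, queried by enumerating all variable settings.
P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {True: 0.1, False: 0.9}
# P(WetGrass = True | Rain, Sprinkler)
P_wet = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.8, (False, False): 0.0}

def joint(rain, sprinkler, wet):
    """Joint probability factored along the network's edges."""
    p_w = P_wet[(rain, sprinkler)]
    return P_rain[rain] * P_sprinkler[sprinkler] * (p_w if wet else 1 - p_w)

# Query: P(Rain = True | WetGrass = True), summing out Sprinkler
numer = sum(joint(True, s, True) for s in (True, False))
denom = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
print(f"P(rain | wet grass) = {numer / denom:.3f}")
```

Real networks use specialized inference algorithms rather than full enumeration, but the factorization along the graph’s edges is the same idea.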

What is Bayesian statistics used for, and how?

Any professional field that benefits from using prior knowledge to update predictions for future outcomes can employ Bayesian statistics. For example, you might use Bayesian statistics to create algorithms designed to detect fraud, filter spam, diagnose medical conditions, predict weather, or conduct forensic analysis. While the potential applications are widespread, certain industries already use Bayesian statistics in everyday processes, such as the following:

Health care

In health care, Bayesian statistics can be used in clinical trials to update the probabilities of certain patient outcomes and treatment efficacy. As more information becomes available, such as policy decisions, regulatory approval, other patient outcomes, and treatment recommendations, Bayesian statistics provide a way to incorporate several types of information to make the most informed predictions. 

Finance

In the financial industry, you might use Bayes’ theorem to calculate the risk of lending money to potential borrowers or to forecast financial outcomes by combining past observations, known statistics, and domain updates.

Marketing

Bayesian statistics can even help develop recommendation systems that suggest products to customers based on prior behavior and assess their satisfaction based on the product match. Additionally, Bayesian methods can help determine market segmentation and targeted marketing strategies by incorporating past data, such as purchasing patterns and behavior profiles, into the marketing process.

Machine learning and artificial intelligence

Bayesian statistics are important in developing algorithms for machine learning and artificial intelligence. These methods help optimize models based on updated data, allowing the algorithm to revise the relationship between variables and make more accurate decisions and predictions.

Advantages and disadvantages of Bayesian statistics

To decide whether to choose Bayesian statistics over other available methods, consider the advantages and disadvantages of this methodology.

Advantages

One of the main advantages of Bayesian statistics is the ability to incorporate both prior knowledge and current data when evaluating hypotheses. Because the prior captures what you believed before collecting data, Bayesian methods support a more nuanced understanding of the relative probabilities of different outcomes, accounting for various underlying factors. You can update and revise Bayesian models as new information becomes available, which makes them well suited to dynamic, real-time decision-making scenarios such as clinical trials or the development of machine learning prediction algorithms.

Disadvantages

Conversely, Bayesian statistics rely heavily on the observed data and assumptions that, in part, make up prior probability distributions. If these priors are subjective, poorly chosen, or based on false information, they can introduce bias or lead to misleading conclusions. The quality and accuracy of Bayesian analysis depend on the quality and accuracy of the prior information, which can sometimes be difficult to determine objectively.

Seeing Bayes’ theorem in action

You can begin your journey into Bayesian statistics by understanding its foundational formula: Bayes’ theorem. This formula provides a way to calculate posterior probabilities (updated probabilities) using new data and prior knowledge. The formula is as follows: P(A ∣ B) = (P(B ∣ A) ⋅ P(A)) / P(B)

Where:

  • P(A | B): the probability of event A given event B is true

  • P(B | A): the probability of event B given event A is true

  • P(A): the prior probability of event A

  • P(B): the prior probability of event B

To understand how to use this formula professionally, consider the following example: In this scenario, you want to predict whether a certain food brand will increase profits by at least 10 percent next year, given the company has reformulated its product. You know that, of the companies that increased profits by at least 10 percent last year, 40 percent reformulated their products. You also know that only 20 percent of companies increased their profits by at least 10 percent last year, and 15 percent of all companies reformulated their product. With this information, you can say:

P(A | B): The probability of a company increasing its profits by at least 10 percent, given it reformulated its product. This is what you are finding.

P(A): The probability of any company increasing profits by at least 10 percent = 20 percent, or 0.20

P(B): The probability of any company reformulating its product = 15 percent, or 0.15

P(B | A): The probability of a company reformulating, given it increased its profits by at least 10 percent = 40 percent, or 0.40

Using Bayes’ theorem: 

P(A | B) = (0.40 × 0.20) / 0.15 = 0.08 / 0.15 ≈ 0.533, or about 53.33 percent

From this, you know that the probability that a company increased profits by at least 10 percent, given it reformulated its product, is approximately 53.33 percent. While this is a simple example, mastering basic Bayesian methodologies can provide an understanding of what is happening behind the scenes and can help set you up to perform more complex analyses.
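The same calculation can be reproduced in a few lines of Python:

```python
# Bayes' theorem applied to the reformulation example from the text
p_increase = 0.20               # P(A): company increases profits by at least 10%
p_reformulate = 0.15            # P(B): company reformulates its product
p_reform_given_increase = 0.40  # P(B | A)

# P(A | B) = P(B | A) * P(A) / P(B)
p_increase_given_reform = p_reform_given_increase * p_increase / p_reformulate
print(f"P(increase | reformulated) = {p_increase_given_reform:.4f}")  # 0.5333
```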

Keep learning statistics on Coursera

Bayesian statistics consider prior probabilities and events to update predictions based on a variety of related knowledge and have applications across fields, including machine learning, health care, and finance. To continue learning about Bayesian statistics and related topics, consider the Bayesian Statistics Specialization. You can learn concepts at your own pace, including Bayesian inference, hierarchical modeling, and time series forecasting. Upon completion, gain a shareable Professional Certificate to include in your resume, CV, or LinkedIn profile.

Article sources

1. Britannica. “Thomas Bayes,” https://www.britannica.com/biography/Thomas-Bayes. Accessed December 11, 2024.


This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.