
Logistic Regression for Dummies

Dr Dilek Celik

What is Logistic Regression?

Logistic regression is a type of model that helps us predict whether something will happen or not—like a yes or no question. For example:

  • Will it rain today? Yes or No.

  • Will a student pass the exam? Yes or No.

  • Is an email spam? Yes or No.


Why Can’t We Just Use Regular (Linear) Regression?

Regular (or linear) regression is good for predicting numbers, like predicting someone’s height based on their age. However, for "yes or no" situations we need something different, because a straight line can output any number at all, including values below 0 or above 1, which doesn’t make sense for a "yes or no" answer.
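To see the problem, here is a quick sketch in Python with numpy (the data are made up): an ordinary straight line fitted to pass/fail results happily predicts values below 0 and above 1.

  import numpy as np

  # Made-up data: hours studied vs. passed (1) or failed (0)
  hours = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
  passed = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)

  # Fit an ordinary straight line: passed ≈ slope * hours + intercept
  slope, intercept = np.polyfit(hours, passed, deg=1)

  # Predict for a student who studied 0 hours and one who studied 12 hours
  for h in (0.0, 12.0):
      print(f"{h:4.1f} hours -> linear prediction {slope * h + intercept:.2f}")
  # The predictions fall below 0 and above 1, which makes no sense for yes/no.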

In logistic regression, instead of directly predicting "yes" or "no," we predict the probability that something will happen. The probability is a number between 0 and 1:

  • 0 means "definitely no."

  • 1 means "definitely yes."

  • Numbers in between (like 0.7 or 0.3) mean "probably yes" or "probably no."


The Logistic Function (or Sigmoid Function)

To make sure our predictions are between 0 and 1, we use a special curve called the logistic function (also known as the sigmoid function). This function takes any number as input and squashes it into a range between 0 and 1.


The formula for the logistic function looks like this:

  σ(z) = 1 / (1 + e^(-z))

In this formula:

  • e is a mathematical constant (around 2.718), kind of like the number pi (π).

  • z is the input we get by combining the data with the model’s parameters (more on that below).
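A minimal sketch of that squashing, in Python with numpy (the input values are arbitrary, chosen just to show the effect):

  import numpy as np

  def sigmoid(z):
      """Logistic (sigmoid) function: squashes any real number into the range (0, 1)."""
      return 1.0 / (1.0 + np.exp(-z))

  # Try a few inputs, from very negative to very positive
  for z in (-10, -2, 0, 2, 10):
      print(f"z = {z:>3} -> sigmoid(z) = {sigmoid(z):.4f}")
  # Very negative z gives values near 0, very positive z gives values near 1,
  # and z = 0 gives exactly 0.5.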


The Logistic Regression Formula

The logistic regression formula combines the input data with weights (numbers we learn from the data) to make a prediction. First we compute a number called z:

  z = b + w1·x1 + w2·x2 + … + wn·xn

In this formula:

  • b is the intercept, a number that shifts the entire function up or down.

  • w1, w2, …, wn are weights, numbers we multiply by each feature (or input) to see how much each one contributes to the final result.

  • x1, x2, …, xn are the features (the different pieces of information about the situation we want to predict).
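Putting the pieces above into numbers, here is a tiny sketch in Python (the intercept, weights, and feature values are made up just to show the arithmetic):

  # Made-up intercept, weights, and feature values
  b = -1.0               # intercept
  weights = [0.8, 0.5]   # w1, w2
  features = [2.0, 3.0]  # x1, x2

  # z = b + w1*x1 + w2*x2 + ... + wn*xn
  z = b + sum(w * x for w, x in zip(weights, features))
  print(z)  # -1.0 + 0.8*2.0 + 0.5*3.0 = 2.1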


After we calculate z, we plug it into the logistic function to get the probability:

  P(yes) = σ(z) = 1 / (1 + e^(-z))

This probability tells us how likely the answer is "yes." If the probability is 0.5 or higher, we predict "yes"; if it’s below 0.5, we predict "no."
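Continuing with the made-up z from the sketch above, turning it into a final answer might look like this (the 0.5 cutoff is the usual default):

  import math

  z = 2.1  # the weighted sum from the previous step

  # Logistic function turns z into a probability between 0 and 1
  probability = 1.0 / (1.0 + math.exp(-z))

  # Predict "yes" when the probability is 0.5 or higher
  prediction = "yes" if probability >= 0.5 else "no"
  print(f"probability = {probability:.3f} -> predict {prediction}")
  # probability ≈ 0.891, so the prediction is "yes"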


Example to Make It Clearer

Let’s say we want to predict whether a student will pass a test based on two pieces of information about the student.
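The concrete example isn’t filled in here, so the sketch below stands in for it with invented details: it assumes the two pieces of information are hours studied and the score on a previous quiz, makes up a small training set, and lets scikit-learn’s LogisticRegression learn the intercept and weights before predicting for a new student.

  import numpy as np
  from sklearn.linear_model import LogisticRegression

  # Made-up training data: [hours studied, previous quiz score out of 100]
  X = np.array([
      [1, 40], [2, 50], [3, 45], [4, 60],   # these students failed
      [5, 70], [6, 65], [7, 80], [8, 90],   # these students passed
  ])
  y = np.array([0, 0, 0, 0, 1, 1, 1, 1])    # 0 = fail, 1 = pass

  # Fitting the model learns the intercept b and the weights w1, w2 from the data
  model = LogisticRegression()
  model.fit(X, y)

  # Predict for a new student: 4.5 hours of study, 55 on the previous quiz
  new_student = np.array([[4.5, 55]])
  probability_pass = model.predict_proba(new_student)[0, 1]
  print(f"probability of passing: {probability_pass:.2f}")
  print("prediction:", "pass" if probability_pass >= 0.5 else "fail")

Under the hood, fit is doing exactly what the earlier sections describe: finding the b and w values that turn each student’s features into a probability through the logistic function.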

Summary

  • Logistic regression predicts a probability between 0 and 1 for a "yes" or "no" answer.

  • The logistic function squashes the output so it always lies between 0 and 1.

  • We combine the input features with weights to get z and use the logistic function to turn z into a probability.

