
Comparison between Logistic Regression and Support Vector Machine


Logistic regression and support vector machines (SVMs) are both popular models for classification tasks. This article introduces the two methods and summarizes the differences between them.

What is Logistic Regression?

Logistic regression is a generalized linear model for binary classification. In logistic regression, we take the output of a linear function and pass it to the sigmoid function, an S-shaped, bounded, and differentiable activation function. We use the sigmoid in logistic regression because it maps any real-valued number into the range 0 to 1; since the probability of any event lies between 0 and 1, the sigmoid is a natural choice. Once we have the probabilities, we set a threshold to make decisions: if the probability is greater than the threshold, we assign the label 1; otherwise, we assign the label 0.
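The mechanics above can be sketched in a few lines of plain Python. The weights, bias, and the 0.5 threshold below are illustrative choices, not a trained model:

```python
import math

def sigmoid(z):
    # Map any real-valued score into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(weights, bias, x, threshold=0.5):
    # Linear function of the inputs, squashed by the sigmoid into a probability.
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    p = sigmoid(z)
    # Decision rule: label 1 if the probability exceeds the threshold, else 0.
    return 1 if p > threshold else 0
```

For example, a score of 0 sits exactly at the middle of the S-curve, so `sigmoid(0)` is 0.5, the usual decision boundary.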


Image source: https://www.saedsayad.com/images/LogReg_1.png


What is Support Vector Machine (SVM)?

A support vector machine classifies data by finding the optimal hyperplane that maximizes the margin between the classes, minimizing the hinge loss function during training. Data points falling on either side of the hyperplane are assigned to different classes.

Image source: https://static.javatpoint.com/tutorial/machine-learning/images/support-vector-machine-algorithm.png
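To make the geometry concrete, here is a minimal sketch of how a linear SVM's hyperplane assigns classes and how the margin width follows from the weight vector. The weights and bias are made-up numbers standing in for a trained model:

```python
import math

# Hypothetical parameters of a trained linear SVM; w.x + b = 0 is the hyperplane.
w = [2.0, -1.0]
b = 0.5

def svm_predict(x):
    # The class is decided by which side of the hyperplane the point falls on.
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Distance between the two supporting hyperplanes (the margin) is 2 / ||w||,
# which is why maximizing the margin means minimizing the norm of w.
margin = 2.0 / math.sqrt(sum(wi * wi for wi in w))
```

Points with a positive score lie on one side and get class +1; points with a negative score get class -1.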


What's the difference between logistic regression and support vector machine?

  • They minimize different loss functions. Logistic regression minimizes the log loss, L(y, p) = -[y log(p) + (1 - y) log(1 - p)], where p is the predicted probability and y ∈ {0, 1}. SVM minimizes the hinge loss, L(y, f(x)) = max(0, 1 - y·f(x)), where f(x) is the raw decision value and y ∈ {-1, +1}.

  • When the dataset is perfectly linearly separable, SVM easily finds the maximum-margin hyperplane and classifies every point correctly, while logistic regression's maximum-likelihood estimation fails to converge: the weights grow without bound as the model pushes the predicted probabilities toward 0 and 1.
  • Logistic regression is more sensitive to outliers than SVM.
    • Logistic regression fits its boundary using every data point, so outliers can shift the boundary.
    • SVM places its maximum-margin hyperplane using only the support vectors that lie along the margin, so outliers far from the margin have no effect on the decision boundary.
  • SVMs do not directly provide probabilities (values between 0 and 1), while logistic regression produces them naturally.
    • This is a useful property of logistic regression when we want an estimate of confidence rather than an absolute prediction, or when we do not fully trust the data.
  • SVM is more flexible than logistic regression.
    • With different kernels (RBF, polynomial, etc.), SVM can learn non-linear patterns, although choosing the most appropriate kernel can be tricky.
    • Logistic regression assumes a linear relationship between the log odds of an event occurring and the predictor variables, so it may perform poorly on complex non-linear datasets.
  • SVM works better for high-dimensional data.
    • With the kernel trick, SVM remains computationally efficient even in high-dimensional spaces.
  • SVM works well with unstructured and semi-structured data (such as text and images), while logistic regression works with already identified independent variables.
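The loss functions from the first bullet can be written out directly. This sketch computes only per-example losses (no optimization loop) and illustrates why distant, confidently correct points still influence a logistic boundary but not an SVM one: log loss is small but never zero, while hinge loss is exactly zero once a point clears the margin.

```python
import math

def log_loss(y, p):
    # Logistic regression: y in {0, 1}, p = predicted probability of class 1.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def hinge_loss(y, f):
    # SVM: y in {-1, +1}, f = raw decision value w.x + b.
    return max(0.0, 1.0 - y * f)

# A confidently correct point: its log loss is small but still positive,
# whereas its hinge loss is exactly 0 because y * f >= 1.
ll = log_loss(1, 0.99)   # small positive number
hl = hinge_loss(1, 2.0)  # exactly 0.0
```

Points inside the margin or misclassified (y·f(x) < 1) incur a hinge loss that grows linearly with the violation, which is what the SVM's optimization trades off against margin width.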


