SVM and LR (logistic regression) typically give similar results. The main differences:

  1. LR is probabilistic, while SVM is a non-probabilistic binary classifier (there are ways to get around this, such as Platt scaling).
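  2. SVM is determined by its support vectors (points that lie on or inside the soft margin; the size of this region is controlled by C, or equivalently the regularization strength lambda), while LR is affected by all points. This is easiest to see in the linear case (no kernel), though a kernelized SVM solution also depends only on its support vectors. Some people say SVM is less sensitive to outliers, though I have also seen the opposite claim.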
  3. For the same reason, linear SVM (no kernel) needs feature normalization, while LR (at least when unregularized) does not. SVM may also perform worse than LR in high-dimensional spaces, where the distance measurements the margin relies on become less reliable.
  4. When the kernel trick is applied, SVM is found to give a sparser solution, so SVM is cheaper computationally. (This is commonly brought up, but I have not seen a clear explanation of why.)
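
Point 2 can be illustrated by comparing the per-example losses the two models minimize. The following is only a minimal sketch, assuming labels y in {-1, +1} and a signed margin m = y * f(x); the function names are illustrative and not taken from any library.

```python
import math

def hinge_loss(margin):
    """SVM hinge loss for a signed margin m = y * f(x)."""
    return max(0.0, 1.0 - margin)

def logistic_loss(margin):
    """LR log loss for a signed margin m = y * f(x)."""
    return math.log1p(math.exp(-margin))

def hinge_grad(margin):
    """Subgradient of the hinge loss with respect to the margin."""
    return -1.0 if margin < 1.0 else 0.0

def logistic_grad(margin):
    """Derivative of the logistic loss with respect to the margin."""
    return -1.0 / (1.0 + math.exp(margin))

# Margins from misclassified (negative) to far on the correct side (large).
for m in [-1.0, 0.5, 1.0, 3.0, 10.0]:
    print(
        f"m={m:5.1f}  "
        f"hinge={hinge_loss(m):.4f} (grad {hinge_grad(m):+.4f})  "
        f"logistic={logistic_loss(m):.4f} (grad {logistic_grad(m):+.6f})"
    )
```

Points with m >= 1 have zero hinge loss and zero subgradient, so they drop out of the SVM solution; only points on or inside the margin (the support vectors) matter. The logistic gradient shrinks for well-classified points but never reaches exactly zero, so every point keeps a small influence on LR.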