I wrote linear regression from scratch using y = mx + b and ran the algorithm for 50 epochs (passes over the data) to minimize the cost and find the best parameters.

When I use scikit-learn, I just create a LinearRegression model, fit it to the dataset, and start predicting. How many epochs does the fit method run? I am asking not only about linear regression but about other ML methods in general.

John Constantine
  • Refer to this question: https://datascience.stackexchange.com/questions/28884/finding-perfect-weights-for-models/28894#28894 – JahKnows Mar 14 '18 at 02:28

1 Answer

In scikit-learn's linear regression, the parameters that minimise the squared error loss aren't estimated using gradient descent; they are computed exactly, so there are no epochs.

The minimisation problem for linear least squares is

$$ \hat{\beta} = \underset{\beta}{\arg\min} \, \lVert \mathbf{y} - \mathbf{X}\beta \rVert^2 $$

which has a unique solution (assuming the columns of $\mathbf{X}$ are linearly independent):

$$ \hat{\beta} = (\mathbf{X}^\text{T}\mathbf{X})^{-1}\mathbf{X}^\text{T}\mathbf{y} $$
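As a rough check of the closed-form expression above, the sketch below solves the normal equations with NumPy and compares the result with a fitted `LinearRegression`. The synthetic data, the variable names, and the appended column of ones (used here to estimate the intercept) are assumptions made for illustration. scikit-learn itself relies on a least-squares solver rather than forming the inverse explicitly, so this only shows that both routes arrive at the same coefficients without any epochs.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (illustrative assumption): 200 samples, 3 features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_coef = np.array([2.0, -1.0, 0.5])
y = X @ true_coef + 4.0 + rng.normal(scale=0.1, size=200)

# Append a column of ones so the intercept is estimated with the other coefficients
X1 = np.column_stack([X, np.ones(len(X))])

# Normal equations: solve (X^T X) beta = X^T y instead of inverting X^T X
beta_hat = np.linalg.solve(X1.T @ X1, X1.T @ y)

# scikit-learn fit: a single direct solve, no iterations
model = LinearRegression().fit(X, y)

print("normal equations:", beta_hat[:3], beta_hat[3])
print("scikit-learn:    ", model.coef_, model.intercept_)
```

Both printed lines should agree up to floating-point error, whichever solver produced them.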

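For classifiers that are fitted with an iterative optimisation process like gradient descent, e.g., MLPClassifier, there is a parameter called max_iter. With the stochastic solvers (sgd and adam), it sets the maximum number of epochs. Training can stop earlier once the loss stops improving by more than tol; if tol is set to 0, that check is effectively disabled and the optimisation will run for max_iter epochs.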
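For an iteratively trained model, the number of epochs actually run can be inspected after fitting. The following is a minimal sketch, assuming a small synthetic dataset and arbitrary settings (`hidden_layer_sizes=(16,)`, `max_iter=20`) chosen only for illustration; `n_iter_` and `loss_curve_` are attributes that MLPClassifier exposes after `fit` when a stochastic solver is used.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# Synthetic classification data (illustrative assumption)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Cap training at 20 epochs with the adam solver
clf = MLPClassifier(hidden_layer_sizes=(16,), solver="adam",
                    max_iter=20, random_state=0)

# Hitting max_iter before convergence raises a ConvergenceWarning; silence it here
with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    clf.fit(X, y)

print("epochs run:", clf.n_iter_)               # never more than max_iter
print("loss values recorded:", len(clf.loss_curve_))  # one per epoch
```

If `n_iter_` is smaller than `max_iter`, the stopping criterion based on `tol` was met before the cap.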

timleathart
  • Suppose I am using Random Forest, etc. How many epochs does it run for that? – John Constantine Mar 13 '18 at 23:41
  • Random forests also don't use gradient descent to train, so the concept of epochs is not relevant. I've updated my answer to explain about classifiers that do use epochs. – timleathart Mar 14 '18 at 00:22
  • For trees and other related methods (the two are different), n_estimators, depth, min_samples_leaf, bootstrap, OOB, class_weight, etc. play a role, though CatBoost does have an iterations number – Aditya Mar 14 '18 at 00:45