
To do linear regression, there is a good answer from TecHunter.

Slope:

$$\alpha = {n\sum(xy) - \sum x \sum y \over n\sum x^2 - (\sum x)^2}$$

Offset:

$$\beta = {\sum y - \alpha \sum x \over n}$$

Trendline formula:

$$y = \alpha x + \beta $$
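These formulas translate directly into code. Here is a minimal sketch in Python (the helper name `fit_line` and the sample data are illustrative, not from TecHunter's answer):

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = alpha*x + beta using the sums above."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Slope: (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
    alpha = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    # Offset: (sum(y) - alpha*sum(x)) / n
    beta = (sy - alpha * sx) / n
    return alpha, beta

# Points lying exactly on y = 2x + 1 should recover alpha = 2, beta = 1.
alpha, beta = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```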

However, how do these formulas change when I want to force the intercept to be at the origin? I want $y=0$ when $x=0$, so the model is: $$y = \alpha x $$

edgarmtze
  • Do you mean find the $\alpha$ such that $\sum (y_i - \alpha x_i)^2$ is minimized where $(x_i,y_i)$ are the data points? So even if the data are like $(i,1)$ which should give the horizontal line $y=1$, you will give a terrible answer because it has to pass through the origin. – AHusain Jul 18 '19 at 18:07

2 Answers


To fit the zero-intercept linear regression model $y=\alpha x + \epsilon$ to your data $(x_1,y_1),\ldots,(x_n,y_n)$, the least squares estimator of $\alpha$ minimizes the error function $$ L(\alpha):=\sum_{i=1}^n(y_i-\alpha x_i)^2.\tag1 $$ Use calculus to minimize $L$, treating everything except $\alpha$ as constant. Differentiating (1) wrt $\alpha$ gives $$ L'(\alpha) = \sum2(y_i-\alpha x_i)(-x_i)=-2\left(\sum x_iy_i - \alpha\sum x_i^2\right).\tag2 $$ Setting (2) to zero yields the equation $$ \sum x_iy_i=\alpha\sum x_i^2\tag3 $$ which you can solve for $\alpha$ to obtain the estimator for the slope: $$\hat\alpha = \frac{\sum x_iy_i}{\sum x_i^2}. $$ Remember to check that the second derivative of $L$ is positive, to confirm that $L$ is minimized for this $\hat\alpha$. Indeed, $L''(\alpha)=2\sum x_i^2$, which doesn't depend on $\alpha$, and is positive except in the degenerate case where all the $x$'s are exactly zero. In that case you will agree that there's no unique line passing through the origin that best fits the data.
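The resulting estimator $\hat\alpha = \sum x_iy_i / \sum x_i^2$ is a one-liner in code. A minimal sketch in Python (the function name `fit_through_origin` is mine, not from the answer):

```python
def fit_through_origin(xs, ys):
    """Least-squares slope for the zero-intercept model y = alpha*x."""
    # alpha_hat = sum(x_i * y_i) / sum(x_i^2); undefined if all x's are zero.
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Points lying exactly on y = 2x should recover alpha = 2.
alpha = fit_through_origin([1, 2, 3], [2, 4, 6])
```

Note that for data like AHusain's example $(i,1)$, this still returns a line through the origin, because the model leaves it no other choice.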

grand_chat

Since $\beta=0$, $\alpha = \frac{\sum y}{\sum x}$