
I am starting to work through Elements of Statistical Learning, and right off the bat I am coming across things that I don't understand. I would be grateful for any help from this community. Please let me know if this is not the appropriate forum to post these questions (in which case, if you're feeling extra-nice, please point me to the correct forum).

On page 12, the authors present the familiar expression for linear regression:

$$\hat{Y} = X^{T} \beta$$
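Written with the constant $1$ included in $X$, this expands to the book's preceding equation:

$$\hat{Y} = \beta_0 + \sum_{j=1}^{p} X_j \beta_j$$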

The authors then state:

> Here we are modeling a single output, so $\hat{Y}$ is a scalar; in general $\hat{Y}$ can be a $K$-vector, in which case $\beta$ would be a $p \times K$ matrix of coefficients. In the $(p+1)$-dimensional input-output space, $(X, \hat{Y})$ represents a hyperplane. If the constant is included in $X$, then the hyperplane includes the origin and is a subspace;

Questions:

  1. Is the input-output space $(p+1)$-dimensional only on the assumption that $Y$ is a scalar rather than a $K$-vector? If $Y$ is a $K$-vector, would the input-output space be $(p+K)$-dimensional?
  2. What does the statement that "$(X, \hat{Y})$ represents a hyperplane" mean? Assuming there is only one input variable (so $p = 1$ and $X$ is a scalar), could you help me visualize what the hyperplane would look like?

Thank you in advance!

1 Answer


You are correct: with $(p+1)$ dimensions the book has returned to the $K = 1$ case, where $p$ inputs produce a single output (and for a $K$-vector output the input-output space would indeed be $(p+K)$-dimensional). The graph of the linear function you fit, the set of points $(X, X^{T}\beta)$, is a $p$-dimensional flat surface sitting inside the $(p+1)$-dimensional space that includes the output axis, and that flat surface is exactly what is meant by a hyperplane. Try plotting $z = 3x + 2y$ in a graphing calculator for a quick intuition of what a single output from $p = 2$ inputs feels like, as in the sketch below.
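Here is a minimal Python sketch of that suggestion using NumPy and Matplotlib (my own illustration, not code from the book): it plots the plane $z = 3x + 2y$, i.e. the graph of a linear map from $p = 2$ inputs to one output, inside the 3-dimensional input-output space.

```python
# Illustrative sketch (not from ESL): plot the plane z = 3x + 2y,
# the graph of a linear map from p = 2 inputs to a single output,
# inside the 3-D input-output space.
import numpy as np
import matplotlib.pyplot as plt

beta = np.array([3.0, 2.0])  # one coefficient per input

# Grid of (x, y) input points.
x = np.linspace(-5, 5, 50)
y = np.linspace(-5, 5, 50)
X, Y = np.meshgrid(x, y)

# z = X^T beta at every grid point, i.e. z = 3x + 2y.
Z = beta[0] * X + beta[1] * Y

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(X, Y, Z, alpha=0.6)
ax.set_xlabel("x (input 1)")
ax.set_ylabel("y (input 2)")
ax.set_zlabel("z (output)")
ax.set_title("Graph of z = 3x + 2y: a 2-D plane in 3-D space")

# No intercept term, so the plane passes through the origin and is
# a subspace, matching the book's remark about including the constant.
assert beta @ np.zeros(2) == 0.0

plt.show()
```

Because there is no intercept, the plane passes through $(0, 0, 0)$, which is the book's point that including the constant in $X$ makes the hyperplane a subspace.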

Sean Owen