
I am starting to work through Elements of Statistical Learning, and right off the bat I am coming across things that I don't understand. I would be grateful for any help from this community. Please let me know if this is not the appropriate forum to post these questions (in which case, if you're feeling extra-nice, please point me to the correct forum).

On page 12, the authors present the familiar expression for linear regression:

$$\hat{Y} = X^{T} \beta$$
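Written with the constant $1$ included in $X$, this expands to the book's preceding equation:

$$\hat{Y} = \beta_0 + \sum_{j=1}^{p} X_j \beta_j$$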

The authors then state:

> Here we are modeling a single output, so $\hat{Y}$ is a scalar; in general $\hat{Y}$ can be a $K$-vector, in which case $\beta$ would be a $p \times K$ matrix of coefficients. In the $(p+1)$-dimensional input-output space, $(X, \hat{Y})$ represents a hyperplane. If the constant is included in $X$, then the hyperplane includes the origin and is a subspace;

Questions:

  1. Is the input-output space $(p+1)$-dimensional only on the assumption that $Y$ is a scalar rather than a $K$-vector? If $Y$ is a $K$-vector, would the input-output space be $(p+K)$-dimensional?
  2. What does the statement that "$(X, \hat{Y})$ represents a hyperplane" mean? Assuming there is only one input variable (so $p = 1$ and $X$ is a scalar), could you help me visualize what the hyperplane would look like?

Thank you in advance!

1 Answer


You are correct: with $(p+1)$ dimensions the book has returned to the $K = 1$ case, where $p$ inputs produce a single output (and for a $K$-vector output the input-output space would indeed be $(p+K)$-dimensional). The graph of the linear function you fit, the set of points $(X, X^{T}\beta)$, is a $p$-dimensional flat surface sitting inside the $(p+1)$-dimensional space that includes the output axis, and that flat surface is exactly what is meant by a hyperplane. Try plotting $z = 3x + 2y$ in a graphing calculator for a quick intuition of what a single output from $p = 2$ inputs feels like, as in the sketch below.
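Here is a minimal Python sketch of that suggestion using NumPy and Matplotlib (my own illustration, not code from the book): it plots the plane $z = 3x + 2y$, i.e. the graph of a linear map from $p = 2$ inputs to one output, inside the 3-dimensional input-output space.

```python
# Illustrative sketch (not from ESL): plot the plane z = 3x + 2y,
# the graph of a linear map from p = 2 inputs to a single output,
# inside the 3-D input-output space.
import numpy as np
import matplotlib.pyplot as plt

beta = np.array([3.0, 2.0])  # one coefficient per input

# Grid of (x, y) input points.
x = np.linspace(-5, 5, 50)
y = np.linspace(-5, 5, 50)
X, Y = np.meshgrid(x, y)

# z = X^T beta at every grid point, i.e. z = 3x + 2y.
Z = beta[0] * X + beta[1] * Y

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(X, Y, Z, alpha=0.6)
ax.set_xlabel("x (input 1)")
ax.set_ylabel("y (input 2)")
ax.set_zlabel("z (output)")
ax.set_title("Graph of z = 3x + 2y: a 2-D plane in 3-D space")

# No intercept term, so the plane passes through the origin and is
# a subspace, matching the book's remark about including the constant.
assert beta @ np.zeros(2) == 0.0

plt.show()
```

Because there is no intercept, the plane passes through $(0, 0, 0)$, which is the book's point that including the constant in $X$ makes the hyperplane a subspace.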

Sean Owen