Good day. In the figure shown, the tensor product basis is computationally more expensive because it has a higher dimension. Why are the additive and tensor product models compared here, given that both show the same training error, test error, and Bayes error values?
Another thing I would like to clarify: how is the degrees-of-freedom formula for the additive model derived, i.e. total df = 1 + (4-1) + (4-1) = 7? (For the tensor product it is simply the product, 4 x 4 = 16.)
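For reference, here is a minimal sketch of how the two counts can be checked numerically. It assumes that each 4-function basis contains a constant function, so the two constants in the additive model collapse into one shared intercept. The cubic polynomial basis in `toy_basis` is only a hypothetical stand-in for the spline basis in the figure, and the sample size and seed are arbitrary.

```python
import numpy as np

def toy_basis(x):
    # Stand-in 4-function basis [1, x, x^2, x^3] (an assumption for
    # illustration; the figure uses a spline basis instead).
    return np.vander(x, N=4, increasing=True)

rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 200)
x2 = rng.uniform(-1, 1, 200)
H1, H2 = toy_basis(x1), toy_basis(x2)

# Additive model: stack both bases side by side. This gives 8 columns,
# but the two constant columns are identical, so only 7 are independent.
additive = np.hstack([H1, H2])

# Tensor product model: every pairwise product h_j(x1) * h_k(x2),
# which gives 4 * 4 = 16 columns.
tensor = np.einsum("ni,nj->nij", H1, H2).reshape(len(x1), -1)

# The rank of each design matrix is the number of free parameters.
additive_df = np.linalg.matrix_rank(additive)
tensor_df = np.linalg.matrix_rank(tensor)
print(additive_df, tensor_df)
```

Under these assumptions the rank of the additive design matrix corresponds to 1 + (4-1) + (4-1), and the rank of the tensor product design matrix corresponds to 4 x 4.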