Popular models such as the Transformer add a positional encoding to the existing feature dimensions of the input embeddings. Why is this preferred over concatenating extra features along the feature dimension of the tensor to hold the positional information?
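To make the two alternatives concrete, here is a minimal NumPy sketch (my own illustration, not from any particular implementation) of adding a standard sinusoidal positional encoding to the embedding versus concatenating it as extra feature dimensions:

```python
import numpy as np

def sinusoidal_pe(seq_len, d_model):
    # Standard sinusoidal positional encoding: even dims get sin, odd dims cos.
    pos = np.arange(seq_len)[:, None]            # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]         # (1, d_model // 2)
    angles = pos / (10000 ** (2 * i / d_model))  # (seq_len, d_model // 2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

seq_len, d_model = 8, 16
emb = np.random.randn(seq_len, d_model)

# Option 1 (what the Transformer does): add PE onto the existing
# feature dimensions — the tensor shape is unchanged.
added = emb + sinusoidal_pe(seq_len, d_model)    # shape (8, 16)

# Option 2 (the alternative the question asks about): concatenate PE
# as extra features — the model width grows.
concatenated = np.concatenate(
    [emb, sinusoidal_pe(seq_len, d_model)], axis=-1)  # shape (8, 32)

print(added.shape, concatenated.shape)  # → (8, 16) (8, 32)
```

The shapes make the trade-off visible: addition keeps every downstream weight matrix the same size, while concatenation doubles the width here and so increases the parameter count of every layer that consumes it.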

kot
0 Answers