We are planning to use facial landmark information as input to our model. Since there are more than 60 landmark points, encoding each point as its own one-hot channel would mean more than 60 input channels, which seems impractical. I found a few papers with similar ideas, but I wasn't satisfied with their approaches.
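To make the setup concrete, here is a minimal sketch of the per-landmark encoding described above, assuming the landmarks arrive as a `(K, 2)` array of `(x, y)` pixel coordinates. The function names, the NumPy-only implementation, and the compact coordinate-vector variant shown for comparison are all illustrative assumptions, not a method taken from any particular paper.

```python
import numpy as np


def landmarks_to_onehot(points, height, width):
    """Encode K landmarks as K binary maps, one channel per landmark.

    points: (K, 2) array-like of (x, y) pixel coordinates (assumed layout).
    Returns a float32 array of shape (K, height, width).
    """
    points = np.asarray(points, dtype=np.float32)
    k = points.shape[0]
    maps = np.zeros((k, height, width), dtype=np.float32)
    # Round to the nearest pixel and clamp to the image bounds.
    xs = np.clip(np.rint(points[:, 0]).astype(int), 0, width - 1)
    ys = np.clip(np.rint(points[:, 1]).astype(int), 0, height - 1)
    # Set exactly one pixel per channel.
    maps[np.arange(k), ys, xs] = 1.0
    return maps


def landmarks_to_vector(points, height, width):
    """Hypothetical compact variant: normalized coordinates, shape (2K,)."""
    points = np.asarray(points, dtype=np.float32)
    scale = np.array([width - 1, height - 1], dtype=np.float32)
    return (points / scale).reshape(-1)
```

With 68 landmarks on a 128x128 image, the first function yields a `(68, 128, 128)` tensor, which is the channel count that worries me, while the second yields a vector of length 136 but loses the spatial layout.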
Is there an established name for this kind of landmark encoding, and is there a well-known, effective way to do it?