Neural Network: One model per user or (one-hot) variable with one model?

Question

I have about 120 users with a total of 4500 data points. The user with the least data has about 5 data points and the user with the most has about 100. I would like to build a model that makes predictions for each user.

What is the best approach? Do I create a separate model for each user, or a single model with a categorical (one-hot) variable that specifies the user?

I would imagine the single-model approach would leverage the correlation between users, whereas the model-per-user approach might not have enough data per user to generalize well.

I would consider the pooling method described in a possible duplicate question, but there is not enough information in the independent variables to distinguish the users from one another, which is why I would have to create a categorical variable to distinguish users. That is, many users have the same input variables but systematically different outputs.

The input variables include: arrival time, day of week, and temperature. The output variables include departure time and miles charged.
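For the single-model option, a minimal sketch of what the encoded input row might look like, assuming the user is one-hot encoded next to the three inputs listed above. The user names, the record layout, and the helper names `build_user_index` and `encode_row` are hypothetical and only for illustration.

```python
def build_user_index(user_names):
    """Map each distinct user name to a column position (sorted for stability)."""
    return {name: i for i, name in enumerate(sorted(set(user_names)))}


def encode_row(user, arrival_hour, day_of_week, temperature, user_index):
    """Return [arrival_hour, temperature, 7 day-of-week flags, one flag per user]."""
    day_flags = [0.0] * 7
    day_flags[day_of_week] = 1.0
    user_flags = [0.0] * len(user_index)
    user_flags[user_index[user]] = 1.0
    return [float(arrival_hour), float(temperature)] + day_flags + user_flags


# Hypothetical records: (user, arrival hour, day of week 0-6, temperature)
records = [
    ("alice", 8.5, 0, 21.0),
    ("bob", 8.5, 0, 21.0),
    ("carol", 17.25, 4, 30.5),
]
user_index = build_user_index(r[0] for r in records)
features = [encode_row(u, a, d, t, user_index) for u, a, d, t in records]
```

Here the rows for "alice" and "bob" are identical except for the user flags, which is the only signal that lets a single model give them different outputs. With about 120 users this adds about 120 columns to every row.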

Have you considered making a joint model (i.e. one that applies to all users), or does it have to be personalized (i.e. separate output for each user)? — Djib2011, Aug 28 '18 at 10:23
Well, I think there is useful information in the user name as a variable. Otherwise, there will be multiple users with the same arrival time but systematically different departure times, which is what I am trying to predict. — Leonard Strnad, Aug 29 '18 at 12:57
If you want to extract some information from the name (e.g. sex, nationality), I'd suggest doing it at a pre-processing step (e.g. through some rule or regular expressions) and storing that information in a separate variable. Otherwise, if you think your model is sophisticated enough to extract any useful information by itself you can pass the name as a variable. This seems to me like a typical structured ML problem. I don't see why you need every person to have its own variable. — Djib2011, Aug 29 '18 at 17:55
Is there any other data you can use to specify the user? I'm asking because such per-user data would help account for why the output distribution is conditional on the user. — Daniel, Aug 29 '18 at 20:40

score 0 · Answer 1 · answered Nov 18 '21 at 12:28

One option is to train the neural network with all the data. Then take that global model and fine-tune separate models for each individual customer.

This minimizes the cold-start problem for new customers while creating custom predictions for the unique properties of each customer.
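A rough sketch of this pattern, under several assumptions that are not in the question: the data is synthetic, the network is a tiny one-hidden-layer model written in NumPy, and fine-tuning updates only the output layer while the hidden layer stays frozen (one common choice when a user has very few points). The names `init_params`, `train`, and `last_layer_only` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the asker's data): 3 standardized inputs,
# 2 outputs, and a per-user offset that the inputs cannot explain.
n_users, n_features, n_outputs, n_rows = 20, 3, 2, 600
true_weights = rng.normal(size=(n_features, n_outputs))
user_offsets = rng.normal(size=(n_users, n_outputs))
user_ids = np.arange(n_rows) % n_users
X = rng.normal(size=(n_rows, n_features))
y = X @ true_weights + user_offsets[user_ids]
y = y + 0.1 * rng.normal(size=y.shape)


def init_params(n_in, n_hidden, n_out):
    return {
        "W1": rng.normal(scale=0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def hidden(params, X):
    return np.tanh(X @ params["W1"] + params["b1"])


def predict(params, X):
    return hidden(params, X) @ params["W2"] + params["b2"]


def mse(params, X, y):
    return float(np.mean((predict(params, X) - y) ** 2))


def train(params, X, y, lr, steps, last_layer_only=False):
    """Full-batch gradient descent on MSE; returns a copy, input is untouched."""
    p = {name: value.copy() for name, value in params.items()}
    for _ in range(steps):
        h = hidden(p, X)
        grad_pred = 2.0 * (h @ p["W2"] + p["b2"] - y) / y.size
        grads = {"W2": h.T @ grad_pred, "b2": grad_pred.sum(axis=0)}
        if not last_layer_only:
            # Backpropagate through tanh into the hidden layer.
            grad_pre = (grad_pred @ p["W2"].T) * (1.0 - h ** 2)
            grads["W1"] = X.T @ grad_pre
            grads["b1"] = grad_pre.sum(axis=0)
        for name, grad in grads.items():
            p[name] -= lr * grad
    return p


# Step 1: one global model trained on every user's data (no user ID input).
global_model = train(init_params(n_features, 8, n_outputs), X, y, lr=0.02, steps=1000)

# Step 2: copy the global model and fine-tune it on a single user's rows.
user = 0
mask = user_ids == user
user_model = train(global_model, X[mask], y[mask], lr=0.05, steps=200,
                   last_layer_only=True)

global_loss = mse(global_model, X[mask], y[mask])
user_loss = mse(user_model, X[mask], y[mask])
print(f"user {user}: global MSE {global_loss:.3f}, fine-tuned MSE {user_loss:.3f}")
```

The losses printed here are measured on the same rows used for fine-tuning, so they only show that the per-user copy has adapted. With as few as 5 points per user, a held-out check per user would be needed to see whether the fine-tuned model actually generalizes better than the global one. A new user with no data can simply use `global_model` until enough points accumulate.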
