I'm working through Aurélien Géron's book, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow.
The idea is to split the data into training and test sets as early as possible in order to avoid data snooping bias. Only after the split are changes made to the data, and they are made on the training set.
My question: since changes were made to the training set, I assume the same changes (dropping columns, filling NA values, converting categorical features to numerical ones, etc.) should be made to the test set before evaluating the model on it? If that is the case, what is the correct way to do this? Should I write everything as a function and run it on both sets, which seems a bit counterintuitive when working in notebooks? Or is there a built-in function that I'm not aware of?
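
To make the "write everything as a function" option concrete, here is a minimal sketch of what I have in mind, using plain pandas. The column names (`income`, `region`, `notes`), the toy data, and the helper names `fit_preprocessing` and `apply_preprocessing` are all made up for illustration and are not from the book. My assumption is that anything learned from the data (the fill value, the list of categories) should come from the training set only and then be reused on the test set.

```python
import pandas as pd

# Toy data; every column name here is hypothetical.
train = pd.DataFrame({
    "income": [10.0, None, 30.0],
    "region": ["north", "south", "north"],
    "notes": ["a", "b", "c"],
})
test = pd.DataFrame({
    "income": [None, 50.0],
    "region": ["south", "east"],  # "east" never appears in the training set
    "notes": ["d", "e"],
})

def fit_preprocessing(train_df):
    """Learn the fill value and the category list from the training set only."""
    return {
        "median_income": train_df["income"].median(),
        "categories": sorted(train_df["region"].dropna().unique()),
    }

def apply_preprocessing(df, params):
    """Apply the same cleaning steps to any split, using the learned params."""
    out = df.drop(columns=["notes"])  # drop a column
    out["income"] = out["income"].fillna(params["median_income"])  # fill NAs
    # Fix the category list so both splits end up with identical dummy columns;
    # values unseen in training become NaN and get all-zero dummies.
    out["region"] = pd.Categorical(out["region"], categories=params["categories"])
    return pd.get_dummies(out, columns=["region"])

params = fit_preprocessing(train)
train_prepared = apply_preprocessing(train, params)
test_prepared = apply_preprocessing(test, params)
```

With this, both prepared frames have the same columns, and the missing test income is filled with the training median rather than a value computed from the test set. What I can't tell is whether this hand-rolled approach is the intended one or whether the library already provides something that does the same job.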