It has been asked here if we should repeat lengthy experiments.
Let's say I can repeat them: how should I present the results? For instance, if I measure the accuracy of a model on test data over some training epochs, and I repeat this training several times, I will obtain different values of test accuracy. I can average them to take all the experiments into account. Can I then compute a sort of confidence interval, so that I can say the accuracy will most likely lie within that interval? Does this make sense? If it does, what formula should I use?
It says here that we can use $\hat{x} \pm 1.96 \frac{\hat{\sigma}}{\sqrt{n}}$, but I don't quite understand the theory behind it.
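For concreteness, here is a minimal sketch of how that formula could be applied to repeated runs. The accuracy values, the helper name `confidence_interval`, and the use of NumPy are illustrative placeholders of mine, not something taken from the linked posts:

    import numpy as np

    def confidence_interval(values, z=1.96):
        """Return (mean, half-width) of an approximate 95% CI for the mean."""
        values = np.asarray(values, dtype=float)
        n = len(values)
        mean = values.mean()
        # Sample standard deviation (ddof=1), i.e. the sigma-hat in the formula above
        std = values.std(ddof=1)
        half_width = z * std / np.sqrt(n)
        return mean, half_width

    # Test accuracies from, say, 10 repeated trainings (made-up numbers)
    accuracies = [0.81, 0.79, 0.83, 0.80, 0.82, 0.78, 0.84, 0.81, 0.80, 0.82]
    mean, hw = confidence_interval(accuracies)
    print(f"accuracy = {mean:.3f} +/- {hw:.3f} (95% CI)")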
Suppose, for example, `cv_errors = cross_val_score(GridSearchCV(MyModel(), param_grid), X, y)`. In this case, how should we present the distribution of `cv_errors`? In my answer, I assume that a 95% CI makes sense. – Sanjar Adilov May 21 '22 at 08:13
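As a sketch of what the commenter's setup might look like, the snippet below computes `cv_errors` with scikit-learn and summarizes them with the same mean-plus-interval formula as above. The choice of `LogisticRegression`, the parameter grid, and the synthetic data are placeholders of my own, not part of the original comment:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, cross_val_score

    # Placeholder data and model; substitute your own X, y and estimator
    X, y = make_classification(n_samples=500, random_state=0)
    search = GridSearchCV(LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]})

    # One accuracy score per outer CV fold (5 folds by default)
    cv_errors = cross_val_score(search, X, y)

    n = len(cv_errors)
    mean = cv_errors.mean()
    half_width = 1.96 * cv_errors.std(ddof=1) / np.sqrt(n)
    print(f"accuracy = {mean:.3f} +/- {half_width:.3f} (approximate 95% CI)")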