Questions tagged [training]

Training is the part of machine learning in which a model is "trained" on a defined portion of a dataset to learn attributes and statistical features of the data. Its counterparts are testing and validation: after training, the model is validated and tested on other portions of the dataset.
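As a rough illustration of the split described above, here is a minimal standard-library sketch. The function name `split_dataset`, the 70/15/15 fractions, and the fixed seed are assumptions made for this example only; in practice a library helper such as scikit-learn's `train_test_split` is the more common choice.

```python
import random


def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle rows and cut them into train / validation / test portions.

    The fractions and the seed are illustrative defaults, not a recommendation.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")

    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is held out for testing
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

The three portions are disjoint by construction: the model is fit on `train`, tuned against `val`, and evaluated once on `test`.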

694 questions

votes

1 answer

Should validation data be broken down into batches or not?

I am using fit_generator to train the model. The training dataset is read from a generator function that yields data in a constant batch size. Now I want to know what approach I should adopt for the validation data. Should I make a generator for the…

training

asked Mar 17 '19 at 05:58

yamini goel

votes

1 answer

Is running a model multiple times meant to capture model randomness or data randomness?

When a paper reports the average and std of a model on a dataset, does it mean that they changed the split of training and test sets and ran the model multiple times, or that they just ran the model on a constant split multiple times to find the…

training

asked Apr 19 '19 at 12:39

user137927

vote

1 answer

Applying the same changes to the test set

I'm busy working through Aurélien Géron's book (Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow). The idea is to split the data into train and test sets as early as possible in order to avoid data snooping bias. Afterwards changes…

training

asked May 24 '21 at 14:29

Neal Liddle

vote

1 answer

Libraries for multi-machine NN training?

As detailed here, the way to split NN training over multiple machines/threads is to decompose the training data set into multiple chunks, send one to each node, then sum the results back in the main node. Is there a library that already implements these…

training

asked Aug 29 '20 at 09:24

Rogelio Triviño

vote

0 answers

SGD Convergence in neural networks

Does initialization play a role in whether SGD converges to a solution when training neural networks?

training

asked Mar 26 '20 at 21:03

user3159445

vote

0 answers

Inconsistent validation accuracy: is that expected?

I am currently training a pattern classification network, and seem to get very inconsistent results. The network seems to overfit, but the validation accuracy is very inconsistent. I am currently trying to do pattern recognition on audio…

training

asked Jun 17 '17 at 17:42

Carlton Banks

vote

0 answers

Most popular frameworks for distributed training of PyTorch

I've done mostly single GPU training using PyTorch. I've decided recently I wish to use a distributed approach for model training on a cluster with GPUs. But I'm unsure what framework to use. I gather that, while Spark is often a preferred tool for…

training

asked Apr 24 '23 at 04:23

Stan Shunpike

votes

1 answer

Best choice for splitting data given a quantity and an expected accuracy

I have a dataset with at least 1,000,000 images (from IDs) which I am using to detect the presence of sealed IDs. The legacy algorithm got nearly 60% accuracy, but my current algorithm yielded almost 80% on a small set. There must be some logic for…

training

asked Jul 20 '21 at 17:52

Manu

votes

1 answer

How to use train_test_split with existing dataset?

I am looking for an example of how to use train_test_split with an existing dataset. I have a CSV that can be brought into a dataset with: data = pd.read_csv('c:\MyData.csv') My aim is to use this data with a One-Class SVM. When I look at examples…

training

asked Oct 18 '21 at 17:23

Colin Crook

votes

1 answer

Train Test Split procedure

I want to do multiclass classification on a small sample (roughly 2,700 observations). Should I use the full sample instead of a train/test split?

training

asked Sep 25 '21 at 14:47

pallidness