Questions tagged [training]

Training is the part of machine learning in which a model is fitted to a defined portion of a dataset to learn attributes and statistical features of the data. Its counterparts are testing and validation: after training, a model is tested and validated on other portions of the dataset.
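
As a minimal sketch of the idea in this description, the snippet below cuts a dataset into training, validation, and test portions. The function name `three_way_split`, the 70/15/15 proportions, and the fixed seed are illustrative assumptions, not part of the tag description.

```python
import random

def three_way_split(items, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle a copy of items and cut it into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]      # everything left is used for training
    return train, val, test

train, val, test = three_way_split(range(100))
```

The three lists are disjoint and together cover the whole dataset, so no example used for testing or validation is ever seen during training.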

694 questions
4
votes
1 answer

Should validation data be broken down into batches or not?

I am using fit_generator to train the model. The training dataset is read from a generator function that yields data in a constant batch size. Now I want to know what approach I should adopt for the validation data. Should I make a generator for the…
yamini goel
3
votes
1 answer

Is running a model multiple times meant to capture model randomness or data randomness?

When a paper reports the average and std of a model on a dataset, does it mean that they changed the split of training and test sets and ran the model multiple times, or that they just ran the model on a constant split multiple times to find the…
user137927
1
vote
1 answer

Applying the same changes to the test set

I'm busy working through Aurélien Géron's book (Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow). The idea is to split the data into train and test sets as early as possible in order to avoid data snooping bias. Afterwards changes…
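
A minimal sketch of what "applying the same changes to the test set" usually means in practice: any statistics a transformation needs are learned from the training set only and then reused unchanged on the test set. The helper names `fit_scaler` and `apply_scaler` and the toy values are hypothetical; this is not code from the book.

```python
import statistics

def fit_scaler(train_values):
    """Learn mean and standard deviation from the training set only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values) or 1.0  # guard against a constant feature
    return mean, std

def apply_scaler(values, mean, std):
    """Apply previously learned statistics to any split."""
    return [(v - mean) / std for v in values]

train_x = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy training feature
test_x = [6.0, 7.0]                  # toy test feature

mean, std = fit_scaler(train_x)      # the test set is never inspected here
train_scaled = apply_scaler(train_x, mean, std)
test_scaled = apply_scaler(test_x, mean, std)
```

Because the test values are scaled with the training mean and standard deviation, nothing about the test set leaks into the preprocessing step.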
1
vote
1 answer

Libraries for multi-machine NN training?

As detailed here, the way to spread NN training over multiple machines/threads is to decompose the training dataset into multiple chunks, send one to each node, then sum the results back in the main node. Is there some library that already implements these…
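
The scheme described in this excerpt can be sketched in a single process as follows. The "nodes" are simulated by a plain loop, the model is a one-parameter linear fit, and all function names are hypothetical; no real distributed library is implied.

```python
def chunk(data, n_chunks):
    """Split data into n_chunks roughly equal contiguous pieces."""
    size, rem = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < rem else 0)
        chunks.append(data[start:end])
        start = end
    return chunks

def local_gradient_sum(w, pairs):
    """Work done on one node: summed gradient of (w*x - y)**2 with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in pairs)

def distributed_step(w, data, n_nodes, lr):
    """Main node: hand out chunks, sum the per-node results, update w."""
    partials = [local_gradient_sum(w, c) for c in chunk(data, n_nodes)]
    grad = sum(partials) / len(data)  # same value as the full-batch mean gradient
    return w - lr * grad

data = [(float(x), 3.0 * x) for x in range(1, 9)]  # toy data with y = 3x
w = 0.0
for _ in range(200):
    w = distributed_step(w, data, n_nodes=4, lr=0.01)
```

Since the per-chunk gradient sums add up to the full-batch sum, the update is the same one a single machine would compute; only the work is divided.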
1
vote
0 answers

SGD Convergence in neural networks

Does initialization play a role in whether SGD converges to a solution when training neural networks?
1
vote
0 answers

Inconsistent validation accuracy: is that expected?

I am currently training a pattern classification network, and seem to get very inconsistent results. The network seems to overfit, but the validation accuracy is very inconsistent. I am currently trying to do pattern recognition on audio…
Carlton Banks
1
vote
0 answers

Most popular frameworks for distributed training with PyTorch

I've done mostly single GPU training using PyTorch. I've decided recently I wish to use a distributed approach for model training on a cluster with GPUs. But I'm unsure what framework to use. I gather that, while Spark is often a preferred tool for…
0
votes
1 answer

Best choice for splitting data given a quantity and an expected accuracy

I have a dataset with at least 1,000,000 images (from IDs) which I am using to detect the presence of sealed IDs. The legacy algorithm got nearly 60% accuracy, but my current algorithm yielded almost 80% on a small set. There must be some logic for…
Manu
0
votes
1 answer

How to use train_test_split with existing dataset?

I am looking for an example of how to use train_test_split with an existing dataset. I have a CSV that can be brought into a dataset with: data = pd.read_csv('c:\MyData.csv') My aim is to use this data with a One-Class SVM. When I look at examples…
0
votes
1 answer

Train Test Split procedure

Given that the sample size is small (roughly 2,700 observations), I want to do multiclass classification. Should I use the full sample instead of a train/test split?