I have tabular data. An arbitrary number of rows can form a certain class, and the data contains multiple instances of each class. I want each class instance to be assigned its own number. What is the name of this task, so that I can search for more information on the topic? It's not classification, and as far as I know it's not clustering. It's kind of like detection.

1 Answer

I suppose you are referring to data labeling or encoding.

For instance, if count(rows) > 5, then class = 1. Such rules are generally programmed using pandas or numpy.

https://sparkbyexamples.com/pandas/pandas-groupby-count-examples/#:~:text=You%20can%20use%20pandas%20DataFrame,count%20for%20each%20group%20combination.
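As a rough illustration of such a rule, here is a minimal pandas sketch. The column names `group` and `value` and the toy data are made up for the example; it only assumes that some column already tells you which rows belong together.

```python
import pandas as pd

# Hypothetical data: "group" says which rows belong together.
df = pd.DataFrame({
    "group": ["a"] * 6 + ["b"] * 3,
    "value": range(9),
})

# Count the rows in each group and broadcast that count back to every row.
rows_per_group = df.groupby("group")["value"].transform("count")

# Rule from above: groups with more than 5 rows get class 1, the rest class 0.
df["class"] = (rows_per_group > 5).astype(int)
```

Here the six rows of group `a` end up with class 1 and the three rows of group `b` with class 0.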

On the other hand, if you want each instance to contain its own number, you can use a label encoder: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html
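A small sketch of what the label encoder does, using made-up string labels (the `labels` list is purely illustrative):

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical class labels, one per row.
labels = ["cat", "dog", "cat", "bird"]

encoder = LabelEncoder()
# fit_transform learns the distinct labels and maps each one to an integer.
codes = encoder.fit_transform(labels)

# inverse_transform maps the integers back to the original labels.
restored = encoder.inverse_transform(codes)
```

Note that this assigns one integer per distinct label, so rows that share a label share a number. The integers follow the sorted order of the labels, which is why `bird` gets 0 here.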

See also:

https://www.geeksforgeeks.org/ml-label-encoding-of-datasets-in-python/

https://www.mygreatlearning.com/blog/label-encoding-in-python/

Nicolas Martin