I have a data set where some rows are identical but belong to different classes. Example:
index | Heading 1 | Heading 2 | Heading 1b | Heading 2b | Class/Target |
---|---|---|---|---|---|
row -1 | a | b | c | d | 0 |
row -2 | t | r | f | k | 0 |
row -3 | m | u | p | l | 0 |
row -4 | a | b | c | d | 1 |
row -5 | m | u | p | l | 1 |
row -6 | v | r | z | h | 0 |
row -7 | z | q | y | o | 1 |
row -8 | w | e | t | a | 1 |
row-1 and row-4 are identical rows with different classes. The same is true of row-3 and row-5. There are only two classes.
I want to assign those rows to a new class, for example 2. The result would look like this:
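To make the example reproducible, here is a minimal sketch that builds the table above, assuming the data lives in a pandas DataFrame. The names `df` and `feature_cols` are only illustrative choices for this post.

```python
import pandas as pd

# Values copied from the example table; "df" and "feature_cols" are illustrative names.
df = pd.DataFrame(
    {
        "Heading 1": ["a", "t", "m", "a", "m", "v", "z", "w"],
        "Heading 2": ["b", "r", "u", "b", "u", "r", "q", "e"],
        "Heading 1b": ["c", "f", "p", "c", "p", "z", "y", "t"],
        "Heading 2b": ["d", "k", "l", "d", "l", "h", "o", "a"],
        "Class/Target": [0, 0, 0, 1, 1, 0, 1, 1],
    },
    index=[f"row -{i}" for i in range(1, 9)],
)

# Every column except the target is treated as a feature.
feature_cols = ["Heading 1", "Heading 2", "Heading 1b", "Heading 2b"]

# keep=False flags every member of a duplicate group, not only the later copies.
is_repeated = df.duplicated(subset=feature_cols, keep=False)
print(df[is_repeated])
```

With `keep=False`, `duplicated` flags rows 1, 3, 4 and 5, which are the rows I am talking about.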
index | Heading 1 | Heading 2 | Heading 1b | Heading 2b | Class/Target |
---|---|---|---|---|---|
row -1 | a | b | c | d | 2 |
row -2 | t | r | f | k | 0 |
row -3 | m | u | p | l | 2 |
row -4 | a | b | c | d | 2 |
row -5 | m | u | p | l | 2 |
row -6 | v | r | z | h | 0 |
row -7 | z | q | y | o | 1 |
row -8 | w | e | t | a | 1 |
We can see that those rows are mapped to 2, and the duplicates are kept in their original order. Previously, I used `iloc` and iterated over the rows, but that takes a huge amount of time because the data set is very large. I then converted the data to a dictionary, which was fast, but it requires some manipulation and more code. How can this be done in a simpler way?
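One vectorized direction I have been considering, as a rough sketch rather than a confirmed solution: group by the feature columns, count the distinct classes in each group, and relabel every row whose group contains more than one class. It assumes a pandas DataFrame, and the names `df`, `feature_cols`, and `n_classes` are illustrative.

```python
import pandas as pd

# Same example data as in the question; all variable names are illustrative.
df = pd.DataFrame(
    {
        "Heading 1": ["a", "t", "m", "a", "m", "v", "z", "w"],
        "Heading 2": ["b", "r", "u", "b", "u", "r", "q", "e"],
        "Heading 1b": ["c", "f", "p", "c", "p", "z", "y", "t"],
        "Heading 2b": ["d", "k", "l", "d", "l", "h", "o", "a"],
        "Class/Target": [0, 0, 0, 1, 1, 0, 1, 1],
    },
    index=[f"row -{i}" for i in range(1, 9)],
)
feature_cols = ["Heading 1", "Heading 2", "Heading 1b", "Heading 2b"]

# For each row, the number of distinct classes among rows with the same feature values.
# transform() returns a Series aligned with df's index, so row order is preserved.
n_classes = df.groupby(feature_cols)["Class/Target"].transform("nunique")

# Rows whose feature values occur with more than one class get the new class 2.
df.loc[n_classes > 1, "Class/Target"] = 2
print(df)
```

This avoids an explicit Python loop, but I have not measured how it scales on the full data set.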