
I have a dataset like this:

mail id          score
[email protected]     10
[email protected]     13
[email protected]     16
[email protected]     20
[email protected]     19
[email protected]     24

From the above data, I have to remove duplicate mail ids by comparing the score column.

E.g., in the mail id column, [email protected] and [email protected] each appear twice. Here, we need to remove the duplicates by comparing their scores.

For example, [email protected] has scores 10 and 16, so the row with the greater score should be kept.

Expected output:

mail id          score
[email protected]     16
[email protected]     20
[email protected]     19
[email protected]     24
manoj kumar

1 Answer


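Use the sort_values() and drop_duplicates() methods. Sorting by score in descending order puts the highest score for each mail id first, and drop_duplicates() keeps only the first row it sees per mail id: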

resultdf = df.sort_values('score', ascending=False).drop_duplicates('mail id')
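
As a minimal, self-contained sketch of this approach: the addresses below are hypothetical stand-ins (the ones in the question are redacted), and the pairing of scores to addresses is an assumption chosen to match the expected output. The final sort_index() step is optional and only restores the original row order.

```python
import pandas as pd

# Hypothetical addresses (the question's are redacted); scores follow the question's table.
df = pd.DataFrame({
    "mail id": ["a@example.com", "b@example.com", "a@example.com",
                "b@example.com", "c@example.com", "d@example.com"],
    "score": [10, 13, 16, 20, 19, 24],
})

# Sort so the highest score comes first, then keep the first row per address.
resultdf = df.sort_values("score", ascending=False).drop_duplicates("mail id")

# Optional: restore the original row order and renumber the index.
resultdf = resultdf.sort_index().reset_index(drop=True)
print(resultdf)
```

Without the last step, the rows come back ordered by score from highest to lowest.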

OR

You can also do this with the groupby() method:

resultdf = df.groupby('mail id')['score'].nlargest(1).droplevel(1).reset_index()
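
A small runnable sketch of the groupby() route, again with hypothetical addresses and an assumed pairing of scores to addresses. Note that this form returns only the mail id and score columns. If the real frame has more columns, one possible variation (not part of the answer above) is to select whole rows with idxmax().

```python
import pandas as pd

# Hypothetical addresses (the question's are redacted); scores follow the question's table.
df = pd.DataFrame({
    "mail id": ["a@example.com", "b@example.com", "a@example.com",
                "b@example.com", "c@example.com", "d@example.com"],
    "score": [10, 13, 16, 20, 19, 24],
})

# nlargest(1) keeps the top score per address; the result carries the original
# row label as a second index level, which droplevel(1) removes.
by_group = df.groupby("mail id")["score"].nlargest(1).droplevel(1).reset_index()

# Variation (an assumption, not from the answer): idxmax() gives the row label
# of the highest score per address, so df.loc returns the complete rows.
by_idxmax = df.loc[df.groupby("mail id")["score"].idxmax()].reset_index(drop=True)

print(by_group)
print(by_idxmax)
```

Both results are ordered by mail id, because groupby() sorts the group keys by default.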
Anurag Dabas