How can I remove the rows whose value in a given column is repeated more than 2 times? The repeated rows may or may not be consecutive. For example:

NAME      EMAIL
Joe       [email protected]
John      [email protected]
Eric      [email protected]
Melissa   [email protected]
Ron       [email protected]

I would like to remove all rows with [email protected] because it repeats more than 2 times.

sgobin
  • Does this [Python: Removing Rows on Count condition](https://stackoverflow.com/questions/49735683/python-removing-rows-on-count-condition) solve your problem? – tidakdiinginkan Apr 15 '20 at 18:15

1 Answer

Create your dataframe

import pandas as pd

data = {'Name': ['Michael', 'Larry', 'Shaq', 'barry'], 'email': ['[email protected]', '[email protected]', '[email protected]', '[email protected]'] }

df1 = pd.DataFrame.from_dict(data)

print(df1)

      Name           email
0  Michael   [email protected]
1    Larry  [email protected]
2     Shaq   [email protected]
3    barry   [email protected]

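Then keep only the groups whose email appears at most 2 times, which drops every row whose email is repeated more than 2 times

# filter() keeps a whole group when the lambda returns True for it
fil = df1.groupby('email').filter(lambda x: len(x) <= 2)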

print(fil)

    Name           email
1  Larry  [email protected]
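
The same result can also be reached without a lambda, by counting occurrences per value and building a boolean mask. The following is only a sketch: the column names follow the question, while the addresses are hypothetical placeholders invented for illustration, and the threshold of 2 is taken from the question's wording.

```python
import pandas as pd

# Hypothetical sample data; the addresses are placeholders, not the asker's real values.
df = pd.DataFrame({
    "NAME": ["Joe", "John", "Eric", "Melissa", "Ron"],
    "EMAIL": [
        "a@example.com",
        "b@example.com",
        "a@example.com",
        "a@example.com",
        "c@example.com",
    ],
})

# value_counts() gives the number of occurrences of each distinct email;
# map() looks that number up for every row.
counts = df["EMAIL"].map(df["EMAIL"].value_counts())

# Keep rows whose email occurs at most 2 times.
result = df[counts <= 2]
print(result)
```

Here the three rows sharing `a@example.com` are dropped and the remaining rows keep their original index labels. On large frames this mask-based form tends to be faster than `groupby(...).filter(...)` with a Python lambda, although that depends on the data.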
sanjayr