0

enter image description here

Hi Everyone,

I'm using an email verification software to check if emails in my dataset are valid or not. How do I use pandas to match the email in the column named "Valid Email" to the column "Email" and return the value in "Good_Email"? To be clear the result could look like

              Good_Email

              [email protected]
              nan
              [email protected]
              nan

Also lets assume that the column named "Valid Email" will be ordered randomly compared to the column "Email" and can be shorter in length. (i.e. Exactly as presented in the example).

Thanks in advance!

1 Answers1

2

You can use .where() together with .isin(), as follows:

df['Good_Email'] = df['Email'].where(df['Email'].isin(df['Valid Email']))

Demo

data = {'Name': ['Bob', 'Jack', 'Tom', 'Mike'],
 'Email': ['[email protected]', np.nan, '[email protected]', '[email protected]'],
 'Valid Email': ['[email protected]', '[email protected]', np.nan, np.nan]}

df = pd.DataFrame(data)

df['Good_Email'] = df['Email'].where(df['Email'].isin(df['Valid Email']))

Output:

print(df)

   Name             Email       Valid Email        Good_Email
0   Bob  [email protected]   [email protected]  [email protected]
1  Jack               NaN  [email protected]               NaN
2   Tom   [email protected]               NaN   [email protected]
3  Mike    [email protected]               NaN               NaN

SeaBean
  • 22,547
  • 3
  • 13
  • 25