This is a very simple question, but I have been unable to find an answer in other posts. For example:
- Merging two dataframes, removing duplicates and aggregation in R
- Merge two dataframes with repeated columns
I have two csv files with the same column names (e.g., names, email, status, etc.). The first csv is a master list of names and emails. The second is a list of individuals who have RSVP'd to an event. I want to merge the two data frames, remove any duplicates, and then use mutate() to create a new RSVP column with the values "yes" or "no". I imagine one of dplyr's join functions applies here, but I am unsure whether it should be full_join() or inner_join(). To give an example, here is the first csv:
  status   names           email              company
1 invited  John Smith      [email protected]  Company A
2 invited  Abbi Maureen    [email protected]  Company B
3 invited  Sara Doe        [email protected]  Company C
4 invited  Maria Gonzalez  [email protected]  Company D
5 invited  Frank Russell   [email protected]  Company E
The second csv is a list of individuals who confirmed their attendance, with their status marked as RSVP.
  status  names         email              company
1 RSVP    Abbi Maureen  [email protected]  company B
2 RSVP    John Smith    [email protected]  Company A
I am stuck on how best to merge these two data frames, remove any duplicates, and then create a new column (i.e., RSVP with yes/no values). Would it be full_join() and then mutate()?
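For illustration, here is a minimal sketch of the steps described above. It assumes that email uniquely identifies a person and that everyone who can RSVP is already on the master list. The example.com addresses are placeholders (the real ones are not shown here), and master, rsvp, and result are hypothetical names. Under those assumptions a join may not be needed at all, since a membership check is enough for a yes/no flag:

```r
library(dplyr)

# Placeholder data mirroring the two csv files; the emails are made up.
master <- tibble(
  status  = "invited",
  names   = c("John Smith", "Abbi Maureen", "Sara Doe",
              "Maria Gonzalez", "Frank Russell"),
  email   = c("john@example.com", "abbi@example.com", "sara@example.com",
              "maria@example.com", "frank@example.com"),
  company = c("Company A", "Company B", "Company C", "Company D", "Company E")
)

rsvp <- tibble(
  status  = "RSVP",
  names   = c("Abbi Maureen", "John Smith"),
  email   = c("abbi@example.com", "john@example.com"),
  company = c("company B", "Company A")
)

# Keep each master row once (deduplicated by email), then flag whether
# that email also appears in the RSVP list.
result <- master %>%
  distinct(email, .keep_all = TRUE) %>%
  mutate(RSVP = if_else(email %in% rsvp$email, "yes", "no"))

result
```

Matching on email alone sidesteps small inconsistencies in the other columns, such as "company B" versus "Company B". If people could RSVP without being on the master list, something like bind_rows() followed by distinct(), or a full_join() by email, would probably be needed to keep those rows as well.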