I am trying to create a new data frame based on the email and order_date columns. If an email is duplicated (e.g. [email protected] in the example below), I want to collapse its rows into one and place each order_date value in its own column next to the email. I want to do this for the full data frame. This will introduce many NAs, but I will deal with that later.
I have a data frame as follows:
Source: local data frame [6 x 4]
Groups: email [5]

   email            order_date  `sum(price_excl_vat_euro)`  `sum(total_qty)`
   <chr>            <date>      <dbl>                       <int>
1  [email protected]  2016-09-05  140.48                      2
2  [email protected]  2016-11-01  41.31                       1
3  [email protected]  2016-09-18  61.98                       1
4  [email protected]  2016-08-01  61.98                       1
5  [email protected]  2016-08-02  61.98                       1
6  [email protected]  2016-08-02  140.49                      1
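To make this reproducible, here is a minimal sketch that rebuilds the two relevant columns in base R. The addresses are not visible above, so the example.com emails are made-up placeholders. I am assuming rows 4 and 5 share an email (d@example.com here), which matches the 5 groups reported and the desired output.

```r
# Hypothetical stand-in for the real data: the emails are placeholders,
# the dates are the ones printed above. Rows 4 and 5 are assumed to be
# the duplicated email.
orders <- data.frame(
  email = c("a@example.com", "b@example.com", "c@example.com",
            "d@example.com", "d@example.com", "e@example.com"),
  order_date = as.Date(c("2016-09-05", "2016-11-01", "2016-09-18",
                         "2016-08-01", "2016-08-02", "2016-08-02")),
  stringsAsFactors = FALSE
)

# Six orders spread over five distinct emails, as in the printout.
print(orders)
```

The two summary columns are left out because they do not matter for the reshaping.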
What I want to obtain is the following (I do not care about the other columns for now):
email            order_date1  order_date2
[email protected]  2016-09-05   NA
[email protected]  2016-11-01   NA
[email protected]  2016-09-18   NA
[email protected]  2016-08-01   2016-08-02
[email protected]  2016-08-02   NA
It is important to know that the number of orders per email can vary, roughly between 1 and 10 (on average). I tried the spread function from the tidyr package, but couldn't get it to work. Any hints are much appreciated!
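One direction that might work, offered as a minimal sketch rather than a confirmed solution: spread needs a key column that says which order each row is, so the idea is to number the orders within each email first and spread on that counter. The placeholder emails and the helper column order_no are hypothetical names for illustration, not part of the real data.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in data: placeholder emails, dates from the question.
orders <- data.frame(
  email = c("a@example.com", "b@example.com", "c@example.com",
            "d@example.com", "d@example.com", "e@example.com"),
  order_date = as.Date(c("2016-09-05", "2016-11-01", "2016-09-18",
                         "2016-08-01", "2016-08-02", "2016-08-02")),
  stringsAsFactors = FALSE
)

wide <- orders %>%
  arrange(email, order_date) %>%      # earliest order first within each email
  group_by(email) %>%
  # order_no is a made-up helper: "order_date1", "order_date2", ...
  mutate(order_no = paste0("order_date", row_number())) %>%
  ungroup() %>%
  spread(order_no, order_date)        # one row per email, one column per order

print(wide)
```

Emails with fewer orders than the maximum get NA in the remaining columns. With ten or more orders per email, the order_dateN columns may not come out in numeric order and might need reordering afterwards.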