Pivoting a two-column feature table in Pandas

Question

How can I transform the following DataFrame into one with cities as rows and each cuisine as a column, and 1 or 0 as values (1 if the city has that kind of cuisine)?

I think this turns out to be a very common problem in transforming data into features for machine learning.

I am aware of the Pandas pivot_table functionality, but it asks for a value column, and in this case we don't have any.

import pandas as pd

data = {
    'city': ['NY','NY', 'SF','SF','SF'],
    'cuisine': ['Japanese', 'Chinese', 'French', 'Japanse', 'German']
}

df = pd.DataFrame(data)

score 8 · Accepted Answer · answered Jul 06 '15 at 17:11

If there is no value column - introduce it yourself!

df["value"]=1
pd.pivot_table(df, values="value", index=["city"], columns="cuisine", fill_value=0)

For your example I got (after fixing the misprint in 'Japanse' to 'Japanese')

cuisine  Chinese  French  German  Japanese
city                                      
NY             1       0       0         1
SF             0       1       1         1

score 4 · Answer 2 · answered Apr 14 '17 at 03:52

4

Here is a more pythonic way:

df.pivot_table(index=["city"], columns="cuisine", aggfunc=lambda x: 1, fill_value=0)

This is a late answer, but it is for people who meet the same probelm.

answered Apr 14 '17 at 03:52

sac7e

41
1

score 0 · Answer 3 · answered Sep 23 '22 at 07:20

In recent Pandas versions there is a method for exactly this use-case:

https://pandas.pydata.org/docs/reference/api/pandas.get_dummies.html

pd.get_dummies(df.set_index(["city"]).cuisine)

Result:

      Chinese  French  German  Japanese  Japanse
city                                            
NY          0       0       0         1        0
NY          1       0       0         0        0
SF          0       1       0         0        0
SF          0       0       0         0        1
SF          0       0       1         0        0

Pivoting a two-column feature table in Pandas

3 Answers3