Create a Pandas DataFrame with multiple one-hot-encoded columns
Let's say you have a Pandas dataframe flags with many columns you want to one-hot-encode. You want a Pandas dataframe flags_ohe, which has the same columns as flags, but the columns 'Mainhue', 'Landmass', 'Zone', 'Language', 'Religion', 'Topleft', 'Botright' are replaced with one-hot-encoded versions with clear column names such as Mainhue_red and Mainhue_blue.
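import pandas as pd

# Start from the original frame; pd.concat below returns a new object, so flags is not modified.
flags_ohe = flags
categorical_columns = ['Landmass', 'Zone', 'Language', 'Religion',
                       'Mainhue', 'Topleft', 'Botright']
for col in categorical_columns:
    # Build indicator columns named like Mainhue_red, Mainhue_blue, ...
    col_ohe = pd.get_dummies(flags[col], prefix=col)
    # Append the indicator columns, then drop the original categorical column.
    flags_ohe = pd.concat((flags_ohe, col_ohe), axis=1).drop(col, axis=1)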
Here are the columns before encoding.
print(flags.columns)
Output:
Index(['Name', 'Landmass', 'Zone', 'Area', 'Population', 'Language',
'Religion', 'Bars', 'Stripes', 'Colors', 'Red', 'Green', 'Blue', 'Gold',
'White', 'Black', 'Orange', 'Mainhue', 'Circles', 'Crosses', 'Saltires',
'Quarters', 'Sunstars', 'Crescent', 'Triangle', 'Icon', 'Animate',
'Text', 'Topleft', 'Botright'],
dtype='object')
Here are the columns after encoding.
print(flags_ohe.columns)
Output:
Index(['Name', 'Area', 'Population', 'Bars', 'Stripes', 'Colors', 'Red',
'Green', 'Blue', 'Gold', 'White', 'Black', 'Orange', 'Circles',
'Crosses', 'Saltires', 'Quarters', 'Sunstars', 'Crescent', 'Triangle',
'Icon', 'Animate', 'Text', 'Landmass_1', 'Landmass_2', 'Landmass_3',
'Landmass_4', 'Landmass_5', 'Landmass_6', 'Zone_1', 'Zone_2', 'Zone_3',
'Zone_4', 'Language_1', 'Language_2', 'Language_3', 'Language_4',
'Language_5', 'Language_6', 'Language_7', 'Language_8', 'Language_9',
'Language_10', 'Religion_0', 'Religion_1', 'Religion_2', 'Religion_3',
'Religion_4', 'Religion_5', 'Religion_6', 'Religion_7', 'Mainhue_black',
'Mainhue_blue', 'Mainhue_brown', 'Mainhue_gold', 'Mainhue_green',
'Mainhue_orange', 'Mainhue_red', 'Mainhue_white', 'Topleft_black',
'Topleft_blue', 'Topleft_gold', 'Topleft_green', 'Topleft_orange',
'Topleft_red', 'Topleft_white', 'Botright_black', 'Botright_blue',
'Botright_brown', 'Botright_gold', 'Botright_green', 'Botright_orange',
'Botright_red', 'Botright_white'],
dtype='object')
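As a possible shortcut, pd.get_dummies also accepts a whole dataframe together with a columns argument, which should give the same result as the loop in a single call. The sketch below is only an illustration: flags_demo and demo_columns are hypothetical stand-ins for the real flags data, used so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical toy data standing in for the flags dataset (an assumption).
flags_demo = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Landmass": [1, 2, 1],
    "Mainhue": ["red", "blue", "red"],
    "Area": [10, 20, 30],
})
demo_columns = ["Landmass", "Mainhue"]

# Loop-based approach, mirroring the code earlier in this post.
loop_ohe = flags_demo
for col in demo_columns:
    col_ohe = pd.get_dummies(flags_demo[col], prefix=col)
    loop_ohe = pd.concat((loop_ohe, col_ohe), axis=1).drop(col, axis=1)

# Single call: encode only the listed columns and keep the others as they are.
# Each column name is used as the prefix by default.
single_ohe = pd.get_dummies(flags_demo, columns=demo_columns)

print(list(single_ohe.columns))
```

On this toy data both approaches produce the same column names in the same order, with the untouched columns first and the indicator columns appended at the end. The loop remains handy if you want a different prefix or different handling per column.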