Pandas: how can I create multi-level columns

Question

I have a pandas DataFrame which has the following columns:

n_0
n_1
p_0
p_1
e_0
e_1

I want to transform it to have columns and sub-columns:

0
    n
    p
    e
1
    n
    p
    e

I've searched in the documentation, and I'm completely lost on how to implement this. Does anyone have any suggestions?

score 7 · Answer 1 · answered Nov 10 '19 at 16:09


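# The tuples must be listed in the same order as the existing columns
# (n_0, n_1, p_0, p_1, e_0, e_1), because the assignment is positional.
columns = [('0', 'n'), ('1', 'n'), ('0', 'p'), ('1', 'p'), ('0', 'e'), ('1', 'e')]

df.columns = pd.MultiIndex.from_tuples(columns)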
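Typing the tuples by hand makes it easy to mislabel a column, so here is a minimal sketch that derives them from the existing names instead. It assumes every column follows the `name_number` pattern from the question; the sample data is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame using the column names from the question.
df = pd.DataFrame(
    np.arange(12).reshape(2, 6),
    columns=["n_0", "n_1", "p_0", "p_1", "e_0", "e_1"],
)

# Split "n_0" into ("n", "0") and flip it to ("0", "n"), keeping the
# tuples in the same order as the existing columns.
tuples = [tuple(reversed(name.split("_"))) for name in df.columns]
df.columns = pd.MultiIndex.from_tuples(tuples)

print(df.columns.tolist())
print(df)
```

The labels are strings here (`'0'`, not `0`) because they come straight from `str.split`.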

answered Nov 10 '19 at 16:09

Leopold


score 2 · Answer 2 · edited Jul 31 '20 at 14:54


I had to adjust victor's sort to get OP's specific column format:

df = df.sort_index(level=0, axis=1)

          0                             1
          e         n         p         e         n         p
0 -0.995452 -3.237846  1.298927 -0.269253 -0.857724 -0.461103
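To make the effect of that sort concrete, here is a small sketch on a made-up one-row frame whose MultiIndex columns start out ungrouped. The values are placeholders chosen so it is easy to see that the data moves together with its labels.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose MultiIndex columns are not yet grouped by top level.
cols = pd.MultiIndex.from_tuples(
    [("0", "n"), ("1", "n"), ("0", "p"), ("1", "p"), ("0", "e"), ("1", "e")]
)
df = pd.DataFrame(np.arange(6).reshape(1, 6), columns=cols)

# Sort on the top level; the remaining level is sorted too by default.
sorted_df = df.sort_index(level=0, axis=1)

print(sorted_df.columns.tolist())
print(sorted_df)
```

Because `sort_index` returns a new frame unless `inplace=True` is passed, the result has to be assigned back, as in the line above.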

edited Jul 31 '20 at 14:54

Zephyr


answered Jul 20 '18 at 06:32

Trenton


score 2 · Accepted Answer · answered Dec 21 '15 at 12:44

Finally, I found a solution.

You can find the example script below.

#!/usr/bin/env python3
import itertools

import numpy as np
import pandas as pd

data = pd.DataFrame(np.random.randn(10, 5), columns=('0_n', '1_n', '0_p', '1_p', 'x'))

indices = set()
groups = set()
others = []
for c in data.columns:
    if '_' in c:
        # '0_n' -> index 0, group 'n'
        (i, g) = c.split('_')
        indices.add(int(i))
        groups.add(g)
    else:
        # Columns without an underscore are carried over unchanged.
        others.append(c)

# Sort so the column order is deterministic (set order is not).
columns = list(itertools.product(sorted(groups), sorted(indices)))
columns = pd.MultiIndex.from_tuples(columns)
ret = pd.DataFrame(index=data.index, columns=columns)
for c in columns:
    # c is (group, index); look up the original 'index_group' column.
    ret[c] = data['%d_%s' % (int(c[1]), c[0])]
for c in others:
    ret[c] = data[c]

print("Before:")
print(data)
print("")
print("After:")
print(ret)

Sorry for this...
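For comparison, the same relabelling can probably be done without building a second frame. The sketch below is only an illustration: `to_pair` is a hypothetical helper, the sample values are made up, and unlike the script above it keeps the index part as a string so that both levels hold a single type.

```python
import numpy as np
import pandas as pd

# Same column names as the script above: "<index>_<group>" plus "x".
data = pd.DataFrame(
    np.arange(10).reshape(2, 5),
    columns=["0_n", "1_n", "0_p", "1_p", "x"],
)


def to_pair(name):
    """Hypothetical helper: map '0_n' to ('n', '0'); names without '_' get an empty second level."""
    if "_" in name:
        index, group = name.split("_", 1)
        return (group, index)
    return (name, "")


ret = data.copy()
ret.columns = pd.MultiIndex.from_tuples([to_pair(c) for c in data.columns])
ret = ret.sort_index(axis=1)

print(ret)
```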

score -2 · Answer 4 · edited Mar 14 '16 at 16:42

There is a simpler solution:

data.columns = data.columns.str.split('_', expand=True)

To sort the columns by their labels, you can also do:

data.sort_index(axis=1, inplace=True)

To swap the order of the column levels:

data = data.reorder_levels([1, 0], axis=1)
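Put together, the three steps might look like the sketch below when applied to the column names from the question. The sample data is made up, and the levels are swapped before sorting so that the numeric part ends up on top.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame using the column names from the question.
data = pd.DataFrame(
    np.arange(12).reshape(2, 6),
    columns=["n_0", "n_1", "p_0", "p_1", "e_0", "e_1"],
)

# 1. Split each name on "_" into a two-level MultiIndex: ("n", "0"), ...
data.columns = data.columns.str.split("_", expand=True)
# 2. Put the numeric part on top: ("0", "n"), ...
data = data.reorder_levels([1, 0], axis=1)
# 3. Group the columns under each top-level label.
data = data.sort_index(axis=1)

print(data)
```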

edited Mar 14 '16 at 16:42

answered Mar 14 '16 at 16:32

victor

