Calculate the number of non empty cells when the name of a column contains 'XXX'

Question

I have 59 columns whose names follow the format nn: xxxxxx (tttttt), where tttttt is a name that is shared by several columns. I want to count the non-empty cells in the columns where tttttt = 'XXXXXX'. I know how to count the non-empty cells in a column, but how do I add the condition that the column name contains XXXXXX?

import pandas as pd
df = pd.read_csv("dane.csv", sep=';')
shape = list(df.shape)
nonempty=df.apply(lambda x: shape[0]-x.isnull().sum())

Input:

1: Brandenburg (Post-Panamax)               2: Acheron (Feeder)                        5: Fenton (Feeder)
ES-NL-10633096/1938/[email protected]/6749   DE-JP-20438082/2066/[email protected]/68849 NL-LK-02275406/2136/[email protected]/73198
BE-BR-61613986/3551/[email protected]/39927         NL-LK-02275406/2136/[email protected]/73198
PH-SA-39552610/2436/[email protected]/80578
PA-AE-59814691/4881/[email protected]/25247  OM-PH-31303222/3671/[email protected]/52408

So, for instance, for this input, let's say I want to count the non-empty cells in the columns whose name contains 'Feeder'.

Could you share some of the data from the CSV, so that we can see the headers and execute your code? — AMC, Oct 29 '19 at 19:06
Are your nulls properly defined as `NaN`, or are they just strings with spaces? You might need to convert them first — Umar.H, Oct 29 '19 at 19:07
@AlexanderCécile the headers are as in the input example I included. and the nulls are defined fine — ryszard eggink, Oct 29 '19 at 19:20

score 2 · Accepted Answer · answered Oct 29 '19 at 19:05

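You can use filter to select the columns whose name contains the substring. Note that isna().sum() counts the empty cells in each selected column; swap in notna().sum() (or count()) to count the non-empty ones: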

df.filter(like='(Feeder)').isna().sum()

or a more precise version, which requires (Feeder) to appear at the end of the column name (a raw string keeps the backslashes from being treated as string escapes):

df.filter(regex=r'\(Feeder\)$').isna().sum()

Output:

2: Acheron (Feeder)    1
5: Fenton (Feeder)     3
dtype: int64
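
Since the question asks for the non-empty cells, here is a minimal sketch of that variant. It assumes a small made-up DataFrame in place of dane.csv (the cell values are invented for illustration), and it first turns whitespace-only strings into NaN in case the blanks were not read as missing values:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dane.csv; the cell values are made up.
df = pd.DataFrame({
    "1: Brandenburg (Post-Panamax)": ["a", "b", "c", "d"],
    "2: Acheron (Feeder)": ["e", np.nan, np.nan, "f"],
    "5: Fenton (Feeder)": ["g", "h", "   ", "i"],
})

# Treat empty or whitespace-only strings as missing values.
df = df.replace(r"^\s*$", np.nan, regex=True)

# Keep only the columns whose name ends with "(Feeder)".
feeder = df.filter(regex=r"\(Feeder\)$")

# count() gives the number of non-missing cells per column.
per_column = feeder.count()

# Sum over the selected columns for a single total.
total = int(per_column.sum())

print(per_column)
print(total)
```

per_column holds one count per matching column, and total adds them up if a single number for all 'Feeder' columns is what you need.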
