Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others

python - Trying to use apply to perform operations across different columns in a pandas DataFrame

I have a pandas DataFrame that looks somewhat like this:

import pandas as pd
df = pd.DataFrame({'ID': ['O60829', 'O60341', 'Q9H1R3'], 'TOTAL_COVERAGE': ['yes', 'yes', 'no'], 'BEG_D': ['1', '1', '500'], 'END_D': ['102', '25', '600'], 'BEG_S': ['1', '1', '1'], 'END_S': ['102', '25', '458']})

I want to iterate over every row, check the value of 'TOTAL_COVERAGE' and, if it is 'yes', perform a mathematical operation on the other values, i.e.:

for index, row in df.iterrows():
    df['%']  = df.apply(lambda x : ((int(x['END_S'])*100)/int(x['END_D'])) if x['TOTAL_COVERAGE'] == 'yes' else '')

But I'm getting the error KeyError: 'TOTAL_COVERAGE'. There must be an easy fix that I'm not seeing. Thanks in advance!



1 Answer


You can solve it with a vectorized approach; there is no need for iterrows or apply:

df['%'] = ((df['END_S'].astype(int) * 100 / df['END_D'].astype(int))
           .where(df['TOTAL_COVERAGE'] == 'yes'))

df

#       ID TOTAL_COVERAGE BEG_D END_D BEG_S END_S      %
#0  O60829            yes     1   102     1   102  100.0
#1  O60341            yes     1    25     1    25  100.0
#2  Q9H1R3             no   500   600     1   458    NaN

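The reason you are getting a KeyError is that apply uses axis=0 by default, so the argument to lambda x is a column (a pandas Series indexed by the row labels), not a row, and it can't be used to access a specific column by its name. Passing axis=1 to apply would hand each row to the lambda instead.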
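If you do want to keep apply, a minimal sketch of the row-wise variant might look like the following. It rebuilds the DataFrame from the question, shows that the default column-wise call raises the KeyError, and then applies a helper with axis=1. The helper name coverage_percent and the choice of NaN (rather than an empty string) for the 'no' rows are assumptions made for illustration, not part of the original question or answer.

```python
import pandas as pd

# Same data as in the question; all values are stored as strings.
df = pd.DataFrame({
    'ID': ['O60829', 'O60341', 'Q9H1R3'],
    'TOTAL_COVERAGE': ['yes', 'yes', 'no'],
    'BEG_D': ['1', '1', '500'],
    'END_D': ['102', '25', '600'],
    'BEG_S': ['1', '1', '1'],
    'END_S': ['102', '25', '458'],
})

# With the default axis=0, the lambda receives each column as a Series
# indexed by the row labels 0, 1, 2, so a lookup by column name fails.
try:
    df.apply(lambda x: x['TOTAL_COVERAGE'])
    raised_key_error = False
except KeyError:
    raised_key_error = True


def coverage_percent(row):
    """Hypothetical helper: END_S as a percentage of END_D for 'yes' rows."""
    if row['TOTAL_COVERAGE'] == 'yes':
        return int(row['END_S']) * 100 / int(row['END_D'])
    # Assumption: use NaN instead of '' so the column stays numeric.
    return float('nan')


# With axis=1, each row is passed as a Series indexed by column names.
df['%'] = df.apply(coverage_percent, axis=1)
print(df)
```

On a DataFrame this small the two approaches are interchangeable; the vectorized version is generally the better fit for large frames because apply with axis=1 calls the Python function once per row.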

