Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others

python - How to subtract dataframe with month index from dataframe with datetime index

I have two dataframes, one called Clim and one called O3_mda8_3135. Clim holds the monthly averages of meteorological parameters for one year of data; here is a sample of the dataframe:

Clim.head(12)
Out[7]: 
       avgT_2551  avgT_5330  ...  avgNOx_3135(ppb)  avgCO_3135(ppm)
Month                        ...                                   
1      14.924181  13.545691  ...         48.216128         0.778939
2      16.352172  15.415385  ...         36.110385         0.605629
3      20.530879  19.684720  ...         20.974544         0.460571
4      23.738576  22.919158  ...         14.270995         0.432855
5      26.961927  25.779007  ...         11.087005         0.334505
6      32.208322  31.225072  ...         12.801409         0.384325
7      35.280124  34.265880  ...         10.732970         0.321284
8      35.428857  34.433351  ...         11.916420         0.326389
9      32.008317  30.856782  ...         15.236616         0.343405
10     25.691444  24.139874  ...         24.829518         0.467317
11     19.310550  17.827946  ...         36.339847         0.621938
12     14.186050  12.860077  ...         49.173287         0.720708

[12 rows x 20 columns]

I also have the dataframe O3_mda8_3135, which was created by first calculating the rolling 8-hour average of each component and then finding the maximum daily value of ozone, which is why the timestamps and indices are irregular. There is one value for each meteorological parameter for every day of the year. Here's a sample of this dataframe:

 O3_mda8_3135
Out[9]: 
                    date  Temp_C_2551  ...  CO_3135(ppm)  O3_mda8_3135
12   2018-01-01 12:00:00         24.1  ...         0.294       10.4000
36   2018-01-02 12:00:00         26.3  ...         0.202        9.4375
60   2018-01-03 12:00:00         22.8  ...         0.184        7.1625
84   2018-01-04 12:00:00         25.6  ...         0.078        8.2500
109  2018-01-05 13:00:00         27.3  ...           NaN        9.4500
                 ...          ...  ...           ...           ...
8653 2018-12-27 13:00:00         19.6  ...         0.115       35.1125
8676 2018-12-28 12:00:00         14.9  ...         0.097       39.4500
8700 2018-12-29 12:00:00         13.9  ...         0.092       38.1250
8724 2018-12-30 12:00:00         17.4  ...         0.186       35.1375
8753 2018-12-31 17:00:00          8.3  ...         0.110       30.8875

[365 rows x 24 columns]

I am wondering how to subtract the average values in Clim from the corresponding columns and rows in O3_mda8_3135. For example, I would like to subtract the average value for temperature at site 2551 in January (avgT_2551 Month 1 in the Clim dataframe) from every day in January in the other dataframe O3_mda8_3135, column name Temp_C_2551.

avgT_2551 corresponds to Temp_C_2551 in the other dataframe

Is there a simple way to do this? Should I extract the month from the datetime and put it into another column for the O3_mda8_3135 dataframe? I am still a beginner and would appreciate any advice or tips.

I saw the post "How to subtract the mean of a month from each day in that month?", but it did not give enough detail for me to understand what each step was doing.



1 Answer


I figured it out on my own, thanks to Stack Overflow posts :)

I created a Month column in both dataframes. I had originally set Month as the index of Clim using Clim = Clim.set_index('Month'), so I removed that line to keep Month as a regular column. Then I created a Month column in the O3_mda8_3135 dataframe. After that, I merged the two dataframes on the 'Month' column and used the Series.sub method to subtract the columns I wanted.

Here's some example code; sorry the variable names are so long, but this dataframe is huge.

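import pandas as pd
# Tag each daily row with its month, then attach that month's averages from Clim
O3_mda8_3135['Month'] = O3_mda8_3135['date'].dt.month
O3_mda8_3135_anom = pd.merge(O3_mda8_3135, Clim, how='left', on='Month')
# Anomaly = daily value minus the merged monthly average
O3_mda8_3135_anom['O3_mda8_3135_anom'] = O3_mda8_3135_anom['O3_mda8_3135'].sub(O3_mda8_3135_anom['MDA8_3135'])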
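
For anyone who would rather keep Month as the index of Clim, the same subtraction can probably be done without a merge by looking up each day's month in the climatology. This is only a rough sketch on made-up toy data: the names clim, daily and pairs are placeholders I introduced, not the real dataframes, and pairs is a hypothetical mapping from each daily column to its monthly-average column.

```python
import pandas as pd

# Toy stand-ins for Clim and O3_mda8_3135 (values are made up for illustration)
clim = pd.DataFrame(
    {"avgT_2551": [15.0, 16.0]},
    index=pd.Index([1, 2], name="Month"),
)
daily = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2018-01-01 12:00", "2018-01-02 12:00", "2018-02-01 13:00"]
        ),
        "Temp_C_2551": [24.0, 26.0, 20.0],
    }
)

# Hypothetical mapping: daily column name -> climatology column name
pairs = {"Temp_C_2551": "avgT_2551"}

# Month number (1-12) of every daily row
month = daily["date"].dt.month

anom = daily[["date"]].copy()
for daily_col, clim_col in pairs.items():
    # Series.map looks each month number up in the Month-indexed climatology
    monthly_mean = month.map(clim[clim_col])
    anom[daily_col + "_anom"] = daily[daily_col].sub(monthly_mean)

print(anom)
```

Because nothing is merged, the daily dataframe keeps its original columns, and adding more entries to pairs handles the other column pairs in the same loop.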

These posts helped me answer my question:

