# r - Split 2 separate data frames, apply functions simultaneously to both then combine


I have 2 data frames:

df1 has a list of people who received vouchers at various weeks for a year. It shows which week each customer received their voucher. df2 has daily transactions for the people in df1.

Each data frame has millions of rows.

I would like to:

1. Split df1 by week, resulting in 52 data frames (df1.1, df1.2, df1.3, ..., df1.52).
2. For each of the 52 data frames, do the following:

``````
df2[df2$customer_ID %in% df1.1$customer_ID, ] %>%
  group_by(week_num) %>%
  summarise(tot_sales = sum(sales))
``````

Each time, the loop creates a data frame containing one row, i.e., a single week.

So the resulting data frame (df3) will have 52 rows.

So far I have the following:

``````
datalist <- list()

df1_split <- split(df1, df1$week_number)

for (i in seq_along(df1_split)) {

  for (j in df2$week_number) {

    df2[df2$customer_ID %in% df1_split[[i]]$customer_ID, ] %>%
      summarise(tot_sales = sum(sales))

    datalist[[i]] <- dat

  }
}

df3 <- bind_rows(datalist)
``````

But this just runs continuously. What am I doing wrong?
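One thing worth noting about the loop above: `for (j in df2$week_number)` iterates once per row of df2 (millions of times for each of the 52 weeks), and `j` is never used inside the body. A minimal sketch of the same idea without the inner loop is shown here, assuming the goal is one total per voucher week summed over all of that week's customers' transactions. The toy data and the `voucher_week` column name are made up for illustration; only `customer_ID`, `week_number`, and `sales` come from the question.

```r
library(dplyr)

# Toy data (hypothetical values); column names follow the question.
df1 <- data.frame(
  customer_ID = c(1, 2, 3, 4),
  week_number = c(1, 1, 2, 3)
)
df2 <- data.frame(
  customer_ID = c(1, 1, 2, 3, 4, 4),
  week_number = c(1, 2, 1, 2, 3, 3),
  sales       = c(10, 5, 20, 7, 1, 2)
)

# Split the voucher table by week, as in the question.
df1_split <- split(df1, df1$week_number)

# One iteration per voucher week; no loop over the rows of df2.
datalist <- lapply(names(df1_split), function(wk) {
  ids <- df1_split[[wk]]$customer_ID
  df2 %>%
    filter(customer_ID %in% ids) %>%
    summarise(
      voucher_week = as.integer(wk),  # hypothetical label column
      tot_sales    = sum(sales)
    )
})

# One row per voucher week.
df3 <- bind_rows(datalist)
print(df3)
```

With the real data this should give one row per voucher week, so 52 rows if every week appears in df1. If the totals should instead be restricted to the matching week in df2, the `filter()` call would need an extra condition on `week_number`.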

by (71.8m points)

Inside the nested `for` loop, subset `df1_split[[i]]` instead of the original data:

``````
datalist <- list()

df1_split <- split(df1, df1$week_number)

for (i in seq_along(df1_split)) {

  for (j in df2$week_number) {

    tmp <- df2[df2$customer_ID %in% df1_split[[i]]$customer_ID, ] %>%