A first step is to define the number of iterations to loop over. Here you have 5 columns to loop over:
list_columns = ['col1', 'col2_1', 'col2_2', 'col2_3', 'col3']
print(len(list_columns))  # prints 5
Then, you can define your column names based on what you want to put in your dataframe. Suppose you have 5 iterations to make. Your column names would be ['A', 'B', 'C', 'D', 'E']. This is the columns argument of your dataframe.
An easier way to build several columns at once is to create a dictionary first, with each column name as a key and a list as its value, all lists having the same length.
list_columns = ['col1', 'col2_1', 'col2_2', 'col2_3', 'col3']
new_columns = ['A', 'B', 'C', 'D', 'E']
# Build one empty list per column with a dictionary comprehension
data_dict = {column: [] for column in new_columns}
n = 50  # The number of loops is arbitrary here
for i in range(n):
    for col in new_columns:
        # Placeholder: replace with whatever you compute for this row and column
        something = i
        data_dict[col].append(something)
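Once the lists are filled, the dictionary can be passed straight to the DataFrame constructor: each key becomes a column name and each list becomes that column's values. A minimal sketch, assuming pandas is installed; the value i * 10 + j is a made-up placeholder just to have something to append, and n is kept small for readability:

```python
import pandas as pd

new_columns = ['A', 'B', 'C', 'D', 'E']
data_dict = {column: [] for column in new_columns}

n = 3  # small arbitrary number of loops, for illustration only
for i in range(n):
    for j, col in enumerate(new_columns):
        # Placeholder value; replace with whatever you actually compute
        data_dict[col].append(i * 10 + j)

# Each key becomes a column; every list must have the same length
result = pd.DataFrame(data_dict)
print(result)
```

Because every list receives exactly one value per outer iteration, all columns end up with n rows, which is what the constructor needs.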
In your case, it looks like you can skip the inner loop and fill each column in one go by assigning a NumPy array instead of appending to a list. Therefore:
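import numpy as np
import pandas as pd

# df is the dataframe from your question
list_cols = ['col1', 'col2_1', 'col2_2', 'col2_3', 'col3']
new_cols = ['A', 'B', 'C', 'D', 'E']
data_df = {}
for i, (col, new_col) in enumerate(zip(list_cols, new_cols)):
    print(col, list_cols[0:i] + list_cols[i+1:])
    # Put the current column first, followed by all the others
    temp_df = df[[col] + list_cols[0:i] + list_cols[i+1:]]
    # Position of the first non-zero value in each row (0 if the row is all zeros)
    temp_indices = np.argmax(temp_df.ne(0).values, axis=1)
    data_df[new_col] = temp_df.values[np.arange(len(temp_df)), temp_indices]
final_df = pd.DataFrame(data_df)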
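What I basically did was a double unpacking, combining enumerate (to get the index) and zip (to pair each source column with its new name). In each iteration, the current column is selected and placed first, followed by the rest of the list in its existing order, so np.argmax picks the current column's value when it is non-zero and otherwise falls back to the first non-zero value among the others.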
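To see the effect on concrete numbers, here is a small self-contained run. The input dataframe below is made up for illustration (the real df comes from your question), and the sketch assumes the indexing is done on the reordered frame's own values:

```python
import numpy as np
import pandas as pd

# Hypothetical input data, only for illustration
df = pd.DataFrame({
    'col1':   [1, 0, 0],
    'col2_1': [0, 2, 0],
    'col2_2': [3, 0, 0],
    'col2_3': [0, 0, 0],
    'col3':   [0, 4, 5],
})

list_cols = ['col1', 'col2_1', 'col2_2', 'col2_3', 'col3']
new_cols = ['A', 'B', 'C', 'D', 'E']

data = {}
for i, (col, new_col) in enumerate(zip(list_cols, new_cols)):
    # Current column first, then the remaining columns
    ordered = df[[col] + list_cols[:i] + list_cols[i + 1:]]
    # Index of the first non-zero entry in each row
    first_nonzero = np.argmax(ordered.ne(0).values, axis=1)
    # Fancy indexing: one (row, column) pair per row
    data[new_col] = ordered.values[np.arange(len(ordered)), first_nonzero]

final_df = pd.DataFrame(data)
print(final_df)
```

In the first row, col2_2 holds 3, so C keeps 3, while every other new column falls back to the first non-zero value found in the reordered row, which is the 1 from col1. Note that a row containing only zeros would yield 0, since np.argmax returns position 0 when no entry is True.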