Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others

python - Loop where the condition for the first entry in a column is different from the remaining column entries?

I have a dataframe named Exam that looks like this:

Col A      Col B      Col C     Col D     Col E     Col F
  1          1         Jan       2.5       2.5       Yes
  1          2         Jan       2.4       2.5       Yes
  2          3         Jan       2.4       2.5       Yes
  2          4         Feb       2.3       2.4       No
  2          5         Feb       2.5       2.6       No
  3          6         Mar       2.4       2.6       Yes
  3          7         Mar       2.5       2.5       Yes

I want to check the value of Col F and store the result in a new column called Col G, but the condition for the first row of Col F is different from the condition for the remaining rows. I have the following script:

for i in Exam.index:
    def val(df):
        if i == 0:
            if df["Col F"] == "Yes":
                return "In"
            if df["Col F"] == "No":
                return "Out"
        if i != 0:
            if df["Col F"] == "Yes":
                return "In2"
            if df["Col F"] == "No":
                return "Out2"

Exam["Col G"] = Exam.apply(val, axis=1)

Exam

The script returns:

Col A      Col B      Col C     Col D     Col E     Col F     **Col G**
  1          1         Jan       2.5       2.5       Yes       **In2**
  1          2         Jan       2.4       2.5       Yes       **In2**
  2          3         Jan       2.4       2.5       Yes       **In2**
  2          4         Feb       2.3       2.4       No        **Out2**
  2          5         Feb       2.5       2.6       No        **Out2**
  3          6         Mar       2.4       2.6       Yes       **In2**
  3          7         Mar       2.5       2.5       Yes       **In2**

but I want it to return:

Col A      Col B      Col C     Col D     Col E     Col F     **Col G**
  1          1         Jan       2.5       2.5       Yes       **In**
  1          2         Jan       2.4       2.5       Yes       **In2**
  2          3         Jan       2.4       2.5       Yes       **In2**
  2          4         Feb       2.3       2.4       No        **Out2**
  2          5         Feb       2.5       2.6       No        **Out2**
  3          6         Mar       2.4       2.6       Yes       **In2**
  3          7         Mar       2.5       2.5       Yes       **In2**

The loop isn't executing the condition for the first row in Col F. This seems like an easy thing but I'm not sure what I am doing wrong. Thanks!



1 Answer


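In your script, val is redefined on every pass of the for loop, but Exam.apply(val, axis=1) runs only once, after the loop has finished. By then i holds the last index value, so the i == 0 branch is never taken and every row falls through to the i != 0 branch.

The simplest solution would probably be to modify the val function to take i and the Col F value as arguments, and use enumerate(Exam['Col F']) as inputs to val.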

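def val(i, f):
    # i is the row position, f is that row's Col F value
    if i == 0:
        # rule for the first row only
        if f == "Yes":
            return "In"
        if f == "No":
            return "Out"
    if i != 0:
        # rule for every other row
        if f == "Yes":
            return "In2"
        if f == "No":
            return "Out2"

# enumerate supplies the position i alongside each Col F value
Exam["Col G"] = [val(i, f) for i, f in enumerate(Exam["Col F"])]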
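For anyone who wants to try this end to end, here is a self-contained sketch of the approach above. The dataframe is rebuilt by hand from the table in the question, so the dtypes are an assumption; val has the same logic as the function above.

```python
import pandas as pd

# Exam rebuilt by hand from the table in the question (dtypes assumed).
Exam = pd.DataFrame({
    "Col A": [1, 1, 2, 2, 2, 3, 3],
    "Col B": [1, 2, 3, 4, 5, 6, 7],
    "Col C": ["Jan", "Jan", "Jan", "Feb", "Feb", "Mar", "Mar"],
    "Col D": [2.5, 2.4, 2.4, 2.3, 2.5, 2.4, 2.5],
    "Col E": [2.5, 2.5, 2.5, 2.4, 2.6, 2.6, 2.5],
    "Col F": ["Yes", "Yes", "Yes", "No", "No", "Yes", "Yes"],
})


def val(i, f):
    # i is the row position from enumerate, f is that row's Col F value.
    if i == 0:
        if f == "Yes":
            return "In"
        if f == "No":
            return "Out"
    else:
        if f == "Yes":
            return "In2"
        if f == "No":
            return "Out2"
    # Any value other than "Yes"/"No" falls through and yields None.


# Build the new column from (position, value) pairs.
Exam["Col G"] = [val(i, f) for i, f in enumerate(Exam["Col F"])]
print(Exam)
```

Because enumerate counts positions rather than index labels, this should still pick out the first row when the dataframe's index does not start at 0.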

If you need to use more than one column in your calculation, you can use zip:

def val(i, e, f): ...

Exam['Col G'] = [val(i, e, f) for i, (e, f) in enumerate(zip(Exam['Col E'], Exam['Col F']))]
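If only the labels differ between the first row and the rest, a loop-free variant may also work. This is a sketch rather than part of the answer above: it assumes Col F only ever holds "Yes" or "No", labels every row with the general rule via Series.map, and then overwrites the first row by position with iloc.

```python
import pandas as pd

# Only Col F is needed for this sketch; values copied from the question.
Exam = pd.DataFrame({"Col F": ["Yes", "Yes", "Yes", "No", "No", "Yes", "Yes"]})

# Step 1: apply the rule for "all other rows" to every row.
Exam["Col G"] = Exam["Col F"].map({"Yes": "In2", "No": "Out2"})

# Step 2: overwrite the first row (by position, not by index label)
# with the first-row rule.
first_f = Exam["Col F"].iloc[0]
Exam.iloc[0, Exam.columns.get_loc("Col G")] = {"Yes": "In", "No": "Out"}.get(first_f)

print(Exam)
```

This keeps the two rules as plain dictionaries, which can be easier to extend than nested if statements, at the cost of handling the first row as a separate step.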
