pandas DataFrame: creating a new column from the values of other columns

This mainly relies on the DataFrame.apply function.

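With axis=0 (the default), the function is called once per column of the DataFrame;

with axis=1, the function is called once per row of the DataFrame, which is what is needed to compute a new column from other columns of the same row.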
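
The difference between the two axis settings can be checked on a small table. The following is only a minimal sketch: the `scores` frame and the two lambdas are made up for illustration and are not part of the example further down.

```python
import pandas as pd

# Hypothetical three-row table, used only to illustrate the axis parameter
scores = pd.DataFrame({
    'math': [99, 65, 78],
    'English': [85, 74, 92],
})

# axis=0: the function receives one column at a time (a Series indexed by row label)
col_max = scores.apply(lambda col: col.max(), axis=0)

# axis=1: the function receives one row at a time (a Series indexed by column name)
row_sum = scores.apply(lambda row: row['math'] + row['English'], axis=1)

print(col_max)
print(row_sum)
```

Here `col_max` has one entry per column and `row_sum` has one entry per row, so only the axis=1 result has the right shape to be assigned as a new column.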

As the code shows, the new column test is set to 1 for students whose math and English scores are both above 75, and to 0 otherwise.

import pandas as pd

data = {
    'name': ["one", "two", "three", "four", "five", "six", "seven"],
    'math': [99, 65, 78, 43, 88, 75, 36],
    'English': [85, 74, 92, 76, 86, 36, 72]
}
frame = pd.DataFrame(data, columns=['name', 'math', 'English'])


# Find students whose math and English scores are both above 75
def function(a, b):
    if a > 75 and b > 75:
        return 1
    else:
        return 0


print(frame)
# Both forms work (attribute access or key access on the row)
# frame['test'] = frame.apply(lambda x: function(x.math, x.English), axis=1)
frame['test'] = frame.apply(lambda x: function(x["math"], x["English"]), axis=1)
print(frame)

The output is as follows:

    name  math  English
0    one    99       85
1    two    65       74
2  three    78       92
3   four    43       76
4   five    88       86
5    six    75       36
6  seven    36       72
    name  math  English  test
0    one    99       85     1
1    two    65       74     0
2  three    78       92     1
3   four    43       76     0
4   five    88       86     1
5    six    75       36     0
6  seven    36       72     0
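
For a simple condition like this one, the same column can probably also be produced without apply, by comparing whole columns at once. This is a sketch of an alternative technique, not the approach used above; the `both_above` name is introduced here for illustration only.

```python
import pandas as pd

frame = pd.DataFrame({
    'name': ["one", "two", "three", "four", "five", "six", "seven"],
    'math': [99, 65, 78, 43, 88, 75, 36],
    'English': [85, 74, 92, 76, 86, 36, 72],
})

# Each comparison yields a boolean Series; & combines them element by element.
# The parentheses are required because & binds more tightly than >.
both_above = (frame['math'] > 75) & (frame['English'] > 75)

# Cast True/False to 1/0 so the column matches the apply-based result
frame['test'] = both_above.astype(int)
print(frame)
```

On large tables this form usually runs faster than a row-wise apply, since it avoids calling a Python function for every row; apply remains the more flexible choice when the per-row logic cannot be expressed as column operations.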

The Series type also has an apply function, which calls the function once per element. Example usage:

import pandas as pd

data = {
    'name': ["one", "two", "three", "four", "five", "six", "seven"],
    'math': [99, 65, 78, 43, 88, 75, 36],
    'English': [85, 74, 92, 76, 86, 36, 72]
}
frame = pd.DataFrame(data, columns=['name', 'math', 'English'])

print(frame)
# Check whether the math score is a passing grade (60 or above)
frame['test'] = frame.math.apply(lambda x: 1 if x >= 60 else 0)
print(frame)

The output is as follows:

    name  math  English
0    one    99       85
1    two    65       74
2  three    78       92
3   four    43       76
4   five    88       86
5    six    75       36
6  seven    36       72
    name  math  English  test
0    one    99       85     1
1    two    65       74     1
2  three    78       92     1
3   four    43       76     0
4   five    88       86     1
5    six    75       36     1
6  seven    36       72     0
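
If the passing mark should be adjustable, Series.apply can forward extra keyword arguments to the function. The sketch below assumes a helper named `passed` with a `threshold` parameter; both names are hypothetical and chosen only for this illustration.

```python
import pandas as pd

math_scores = pd.Series([99, 65, 78, 43, 88, 75, 36], name='math')


def passed(score, threshold=60):
    """Return 1 if the score reaches the threshold, otherwise 0."""
    return 1 if score >= threshold else 0


# Keyword arguments given to Series.apply are passed on to the function
result = math_scores.apply(passed, threshold=60)
stricter = math_scores.apply(passed, threshold=80)

print(result.tolist())
print(stricter.tolist())
```

With a threshold of 60 this reproduces the test column shown above; raising it to 80 leaves only the scores 99 and 88 marked as 1.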