Pandas: How to Use Groupby with Multiple Aggregations

By Benjamin Anderson, July 16, 2023

You can use the following basic syntax to apply several aggregations in a single groupby call in pandas:

df.groupby('team').agg(
    mean_points=('points', 'mean'),
    sum_points=('points', 'sum'),
    std_points=('points', 'std'))

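This particular call groups the rows of the DataFrame by the column named team, then calculates several summary statistics for the column named points. Each keyword argument names an output column, and its value is a (column, function) pair stating which column to aggregate and how.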
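To make the role of each (column, function) pair concrete, here is a minimal sketch showing that the tuple form gives the same result as spelling the pair out with pd.NamedAgg. The variable names tuple_form and named_form are made up for illustration.

```python
import pandas as pd

# Small DataFrame with a grouping column and one numeric column
df = pd.DataFrame({'team': ['Mavs', 'Mavs', 'Mavs', 'Heat', 'Heat', 'Heat'],
                   'points': [18, 22, 19, 14, 14, 11]})

# Tuple form: output_name=(input_column, aggregation)
tuple_form = df.groupby('team').agg(mean_points=('points', 'mean'))

# Explicit form: the same pair written as a pd.NamedAgg
named_form = df.groupby('team').agg(
    mean_points=pd.NamedAgg(column='points', aggfunc='mean'))

# Both calls produce identical DataFrames
print(tuple_form.equals(named_form))
```

The tuple form is shorter, so it is usually the more convenient of the two; pd.NamedAgg mainly helps when you want the argument names visible in the code.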

The following example shows how to use this syntax in practice.

Example: Use Groupby with Multiple Aggregations in Pandas

Suppose we have the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['Mavs', 'Mavs', 'Mavs', 'Heat', 'Heat', 'Heat'],
                   'points': [18, 22, 19, 14, 14, 11],
                   'assists': [5, 7, 7, 9, 12, 9]})

#view DataFrame
print(df)

   team  points  assists
0  Mavs      18        5
1  Mavs      22        7
2  Mavs      19        7
3  Heat      14        9
4  Heat      14       12
5  Heat      11        9

We can use the following syntax to group the rows of the DataFrame by team, then calculate the mean, sum, and standard deviation of points for each team:

#group by team and calculate mean, sum, and standard deviation of points
#(string names select the built-in pandas aggregations)
df.groupby('team').agg(
    mean_points=('points', 'mean'),
    sum_points=('points', 'sum'),
    std_points=('points', 'std'))

      mean_points  sum_points  std_points
team
Heat    13.000000          39    1.732051
Mavs    19.666667          59    2.081666

The output shows the mean, sum, and standard deviation of the points variable for each team.
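One detail worth checking is which standard deviation is reported. The short sketch below recomputes the Heat value by hand and suggests that the figure in the output is the sample standard deviation (ddof=1), which is what pandas computes by default, rather than the population version that NumPy computes by default. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Points scored by the three Heat players in the example
heat_points = pd.Series([14, 14, 11])

# pandas divides by n - 1 by default (sample standard deviation)
sample_std = heat_points.std()

# NumPy divides by n by default (population standard deviation)
population_std = np.std(heat_points.to_numpy())

print(round(sample_std, 6))      # matches std_points for Heat
print(round(population_std, 6))  # smaller, since it divides by n
```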

You can use similar syntax to perform a groupby and calculate as many aggregations as you'd like.
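As one possible extension, the sketch below aggregates two different columns in the same call and adds a count. The output column names (max_points, total_assists, n_players) are arbitrary labels chosen for this illustration.

```python
import pandas as pd

# Same DataFrame as in the example above
df = pd.DataFrame({'team': ['Mavs', 'Mavs', 'Mavs', 'Heat', 'Heat', 'Heat'],
                   'points': [18, 22, 19, 14, 14, 11],
                   'assists': [5, 7, 7, 9, 12, 9]})

# Each keyword argument may point at a different input column
summary = df.groupby('team').agg(
    mean_points=('points', 'mean'),
    max_points=('points', 'max'),
    total_assists=('assists', 'sum'),
    n_players=('points', 'count'))

print(summary)
```

Because every output column is named explicitly, the result has flat column labels and not a MultiIndex, which tends to make later filtering and plotting simpler.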

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Count Unique Values Using Pandas GroupBy
How to Apply a Function to Pandas Groupby
How to Create a Bar Plot from Pandas GroupBy

