Pandas: How to Merge Columns That Share the Same Name

By Benjamin Anderson, July 14, 2023

You can use the following basic syntax to merge the columns of a pandas DataFrame that share the same column name:

#define function to merge columns with same names together
def same_merge(x): return ','.join(x[x.notnull()].astype(str))

#define new DataFrame that merges columns with same names together
df_new = df.groupby(level=0, axis=1).apply(lambda x: x.apply(same_merge, axis=1))

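The following example shows how to use this syntax in practice.

Example: Merge Columns That Share the Same Name in Pandas

Suppose we have the following pandas DataFrame: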

import pandas as pd
import numpy as np

#create DataFrame
df = pd.DataFrame({'A': [5, 6, 8, np.nan, 4, np.nan, np.nan],
                   'A1': [np.nan, 12, np.nan, 10, np.nan, 6, 4],
                   'B': [2, 7, np.nan, np.nan, 2, 4, np.nan],
                   'B1': [5, np.nan, 6, 15, 1, np.nan, 4]})

#rename columns so there are duplicate column names
df.columns = ['A', 'A', 'B', 'B']

#view DataFrame
print(df)

     A     A    B     B
0  5.0   NaN  2.0   5.0
1  6.0  12.0  7.0   NaN
2  8.0   NaN  NaN   6.0
3  NaN  10.0  NaN  15.0
4  4.0   NaN  2.0   1.0
5  NaN   6.0  4.0   NaN
6  NaN   4.0  NaN   4.0

Notice that two columns are named "A" and two columns are named "B".
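On a wider DataFrame, repeated names are harder to spot by eye. The following is a minimal sketch of one way to check for them programmatically using `Index.duplicated()`; the small two-row DataFrame and the variable names `is_repeat` and `repeated_names` are illustrative assumptions, not part of the original example.

```python
import numpy as np
import pandas as pd

# Small illustrative DataFrame with repeated column names
df = pd.DataFrame([[5, np.nan, 2, 5],
                   [6, 12, 7, np.nan]],
                  columns=['A', 'A', 'B', 'B'])

# True for each column whose name already appeared further to the left
is_repeat = df.columns.duplicated()

# Names that occur more than once, in order of appearance
repeated_names = df.columns[is_repeat].unique().tolist()
print(repeated_names)  # ['A', 'B']

# Quick yes/no check: are all column names distinct?
print(df.columns.is_unique)
```

If `repeated_names` comes back empty, there is nothing to merge.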

We can use the following code to merge the columns that share a name and join their values with a comma:

#define function to merge columns with same names together
def same_merge(x): return ','.join(x[x.notnull()].astype(str))

#define new DataFrame that merges columns with same names together
df_new = df.groupby(level=0, axis=1).apply(lambda x: x.apply(same_merge, axis=1))

#view new DataFrame
print(df_new)

          A        B
0       5.0  2.0,5.0
1  6.0,12.0      7.0
2       8.0      6.0
3      10.0     15.0
4       4.0  2.0,1.0
5       6.0      4.0
6       4.0      4.0

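The new DataFrame has merged the columns that share a name and joined their non-missing values with a comma. Note that the merged values are strings, not numbers.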
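To see why each merged cell looks the way it does, it can help to call `same_merge()` on a single row of values. The sketch below reuses the function from the example; the three hand-built Series and the variable names are assumptions chosen for illustration, and the all-missing case does not occur in the example data.

```python
import numpy as np
import pandas as pd

# Same helper as in the example: drop missing values, cast to str, join
def same_merge(x):
    return ','.join(x[x.notnull()].astype(str))

# One value present, one missing: only the present value is kept
one_value = same_merge(pd.Series([5.0, np.nan]))
print(one_value)   # 5.0

# Both values present: they are joined with the separator
two_values = same_merge(pd.Series([6.0, 12.0]))
print(two_values)  # 6.0,12.0

# Every value missing: the result is an empty string
no_values = same_merge(pd.Series([np.nan, np.nan]))
print(repr(no_values))  # ''
```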

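If you would like to use a different separator, simply replace the comma in the same_merge() function with another character.

For example, the following code shows how to use a semicolon as the separator: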

#define function to merge columns with same names together
def same_merge(x): return ';'.join(x[x.notnull()].astype(str))

#define new DataFrame that merges columns with same names together
df_new = df.groupby(level=0, axis=1).apply(lambda x: x.apply(same_merge, axis=1))

#view new DataFrame
print(df_new)

          A        B
0       5.0  2.0;5.0
1  6.0;12.0      7.0
2       8.0      6.0
3      10.0     15.0
4       4.0  2.0;1.0
5       6.0      4.0
6       4.0      4.0

The new DataFrame has merged the columns that share a name and joined their values with a semicolon.
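The `axis=1` argument of `groupby()` used above is deprecated as of pandas 2.1, so the snippets in this tutorial may emit a warning or stop working on newer versions. The following is a minimal sketch of one possible alternative that avoids it and takes the separator as a parameter. The function name `merge_same_named_columns` and its `sep` argument are hypothetical, introduced here only for illustration; this is a swapped-in loop over the unique column names, not the method shown above.

```python
import numpy as np
import pandas as pd

def merge_same_named_columns(df, sep=','):
    """Join the non-missing values of same-named columns, row by row."""
    merged = {}
    for name in df.columns.unique():
        # Boolean mask selects every column with this name (always a DataFrame)
        block = df.loc[:, df.columns == name]
        # For each row: drop missing values, cast to str, join with sep
        merged[name] = block.apply(
            lambda row: sep.join(row.dropna().astype(str)), axis=1
        )
    return pd.DataFrame(merged, index=df.index)

# Same data as the example above
df = pd.DataFrame({'A': [5, 6, 8, np.nan, 4, np.nan, np.nan],
                   'A1': [np.nan, 12, np.nan, 10, np.nan, 6, 4],
                   'B': [2, 7, np.nan, np.nan, 2, 4, np.nan],
                   'B1': [5, np.nan, 6, 15, 1, np.nan, 4]})
df.columns = ['A', 'A', 'B', 'B']

merged_semicolon = merge_same_named_columns(df, sep=';')
print(merged_semicolon)
```

Passing `sep=','` (the default) should give the comma-joined version instead, without redefining the helper.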

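Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Drop Duplicate Columns in Pandas
How to List All Column Names in Pandas
How to Sort Columns by Name in Pandas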

