How to Extract a String After a Specific Character in R

By Dr. Benjamin Anderson, July 13, 2023

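You can use the following methods to extract the string that follows a specific character or pattern in R:

Method 1: Extract the String After Specific Characters Using Base R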

sub('.*the', '', my_string)

Method 2: Extract the String After Specific Characters Using stringr

library(stringr)

str_replace(my_string, '(.*?)the(.*?)', '\\1')

Both examples extract the string that follows the pattern "the" in my_string.
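As a quick check of the two one-liners, the following minimal sketch runs both on a single string. The value assigned to my_string and the names after_base and after_stringr are assumptions made for illustration only.

```r
library(stringr)

# hypothetical input string, chosen only for illustration
my_string <- 'theLakers'

# Base R: delete everything up to and including "the"
after_base <- sub('.*the', '', my_string)

# stringr: replace the matched text with capture group 1,
# which is empty here because nothing precedes "the"
after_stringr <- str_replace(my_string, '(.*?)the(.*?)', '\\1')

print(after_base)
print(after_stringr)
```

Both calls should print "Lakers" for this input, because "the" sits at the very start of the string.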

The following examples show how to use each method in practice with this data frame:

#create data frame
df <- data.frame(team=c('theMavs', 'theHeat', 'theNets', 'theRockets'),
                 points=c(114, 135, 119, 140))

#view data frame
df

        team points
1    theMavs    114
2    theHeat    135
3    theNets    119
4 theRockets    140

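Example 1: Extract the String After Specific Characters Using Base R

The following code shows how to extract the string after "the" for each row in the team column of the data frame: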

#create new column that extracts string after "the" in team column
df$team_name <- sub('.*the', '', df$team)

#view updated data frame
df

        team points team_name
1    theMavs    114      Mavs
2    theHeat    135      Heat
3    theNets    119      Nets
4 theRockets    140   Rockets

Notice that the new column named team_name contains the string after "the" for each row in the team column of the data frame.
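One detail of the pattern is worth checking on your own data: `.*` is greedy, so `'.*the'` consumes everything through the last occurrence of "the", and a string that does not contain "the" is returned unchanged. The sketch below illustrates this with a hypothetical vector x (not part of the original data frame) and contrasts it with a lazy pattern under perl = TRUE.

```r
# hypothetical strings: "the" once, "the" twice, and "the" absent
x <- c('theMavs', 'theBay_theWarriors', 'Nets')

# greedy '.*the' removes everything through the LAST "the"
after_last <- sub('.*the', '', x)

# lazy '.*?the' removes everything through the FIRST "the"
after_first <- sub('.*?the', '', x, perl = TRUE)

print(after_last)   # "Mavs" "Warriors" "Nets"
print(after_first)  # "Mavs" "Bay_theWarriors" "Nets"
```

If unmatched rows should be flagged rather than passed through, one option is to test for the pattern first with grepl('the', x, fixed = TRUE).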

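Example 2: Extract the String After Specific Characters Using the stringr Package

The following code shows how to use the str_replace() function from the stringr package in R to extract the string after "the" for each row in the team column of the data frame: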

library(stringr)

#create new column that extracts string after "the" in team column
df$team_name <- str_replace(df$team, '(.*?)the(.*?)', '\\1')

#view updated data frame
df

        team points team_name
1    theMavs    114      Mavs
2    theHeat    135      Heat
3    theNets    119      Nets
4 theRockets    140   Rockets

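Notice that the new column named team_name contains the string after "the" for each row in the team column of the data frame.

This matches the result of using the sub() function in base R.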
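The two results agree here because "the" is at the start of every team name. When text precedes "the", the str_replace() pattern keeps that text (it is captured in group 1), so the result is no longer just the part after "the". The following sketch compares it with str_extract() and a lookbehind, which returns only what follows "the" and gives NA when there is no match. The vector x and the result names are hypothetical and used only for illustration.

```r
library(stringr)

# hypothetical strings: "the" at the start, in the middle, and absent
x <- c('theMavs', 'go_theHeat', 'Nets')

# pattern from the example: text before "the" is kept via group 1
kept_prefix <- str_replace(x, '(.*?)the(.*?)', '\\1')

# lookbehind: return only what follows the first "the", NA if absent
after_extract <- str_extract(x, '(?<=the).*')

print(kept_prefix)    # "Mavs" "go_Heat" "Nets"
print(after_extract)  # "Mavs" "Heat" NA
```

Which behavior is preferable depends on the data: str_replace() leaves unmatched strings as they are, while str_extract() makes missing matches explicit as NA.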

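Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Select Columns Containing a Specific String in R
How to Remove Characters from a String in R
How to Find the Location of a Character in a String in R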

