How to Fix in R: invalid factor level, NA generated

By Benjamin Anderson, July 22, 2023

One warning message you may encounter when using R is:

 Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "C"):
  invalid factor level, NA generated

This warning occurs when you attempt to add a value to a factor variable in R that does not already exist as one of its defined levels.
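As a minimal sketch of the same behavior outside a data frame, the snippet below assigns an undefined level to a small factor vector. The variable name `f` is only illustrative.

```r
# a factor with two defined levels
f <- factor(c("A", "B"))

# "C" is not one of the defined levels, so R issues the warning
# and stores NA in that position
f[2] <- "C"

levels(f)    # the levels are unchanged: "A" "B"
is.na(f[2])  # TRUE
```

The levels of a factor are fixed when it is created, so assignment never adds a new level on its own.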

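The following example shows how to address this warning in practice.

How to Reproduce the Warning

Suppose we have the following data frame in R: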

#create data frame
df <- data.frame(team=factor(c('A', 'A', 'B', 'B', 'B')),
                 points=c(99, 90, 86, 88, 95))

#view data frame
df

  team points
1    A     99
2    A     90
3    B     86
4    B     88
5    B     95

#view structure of data frame
str(df)

'data.frame':   5 obs. of  2 variables:
 $ team  : Factor w/ 2 levels "A","B": 1 1 2 2 2
 $ points: num  99 90 86 88 95

We can see that the team variable is a factor with two levels: "A" and "B".
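The levels can also be inspected directly, which gives a quick way to check whether a value is valid before assigning it. This is a small sketch that rebuilds the same data frame so it runs on its own.

```r
# same data frame as in the example
df <- data.frame(team=factor(c('A', 'A', 'B', 'B', 'B')),
                 points=c(99, 90, 86, 88, 95))

# list the defined levels of the factor
levels(df$team)           # "A" "B"

# count the levels
nlevels(df$team)          # 2

# check whether a candidate value is already a level
'C' %in% levels(df$team)  # FALSE
```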

Now suppose we try to add a new row to the end of the data frame, using the value "C" for team:

 #add new row to end of data frame
df[nrow(df) + 1,] = c('C', 100)

Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "C"):
  invalid factor level, NA generated

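We receive a warning message because the value "C" does not yet exist as a factor level of the team variable.

Note that this is only a warning, not an error: R still adds the new row to the end of the data frame, but it stores NA in place of "C":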

 #view updated data frame
df

  team points
1    A     99
2    A     90
3    B     86
4    B     88
5    B     95
6 <NA>    100
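Because a warning is easy to miss in a longer script, it can help to check the result explicitly. The sketch below reproduces the assignment and then counts the missing values in team.

```r
# same data frame as in the example
df <- data.frame(team=factor(c('A', 'A', 'B', 'B', 'B')),
                 points=c(99, 90, 86, 88, 95))

# this assignment triggers the invalid factor level warning
df[nrow(df) + 1,] = c('C', 100)

# one NA was stored in the team column
sum(is.na(df$team))  # 1

# the set of levels did not change
levels(df$team)      # "A" "B"
```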

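How to Avoid the Warning

To avoid the invalid factor level warning, we first convert the factor variable to a character variable, then convert it back to a factor after adding the new row: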

#convert team variable to character
df$team <- as.character(df$team)

#add new row to end of data frame
df[nrow(df) + 1,] = c('C', 100)

#convert team variable back to factor
df$team <- as.factor(df$team)

#view updated data frame
df

  team points
1    A     99
2    A     90
3    B     86
4    B     88
5    B     95
6    C    100

Notice that we successfully added the new row to the end of the data frame and avoided the warning message.
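An alternative, which is not the approach used in this example, is to build the new row as its own data frame and append it with rbind(). The data frame method of rbind() expands factor levels as necessary, so no conversion to character is needed. The names `new_row` and `df2` are illustrative.

```r
# same starting data frame as in the example
df <- data.frame(team=factor(c('A', 'A', 'B', 'B', 'B')),
                 points=c(99, 90, 86, 88, 95))

# build the new row as a one-row data frame
new_row <- data.frame(team='C', points=100)

# append it; the factor levels of team are expanded to include 'C'
df2 <- rbind(df, new_row)

levels(df2$team)  # "A" "B" "C"
```

This also keeps points numeric, because each column of the new row carries its own type.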

We can also verify that the value "C" has been added as a factor level of the team variable:

 #view structure of updated data frame
str(df)

'data.frame':   6 obs. of  2 variables:
 $ team  : Factor w/ 3 levels "A","B","C": 1 1 2 2 2 3
 $ points: chr  "99" "90" "86" "88" ...
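The structure above also shows that points is now a character column. This happens because c('C', 100) is a character vector, so the number 100 is stored as the string "100". A sketch of one way to keep points numeric, assuming we are free to change how the row is added, is to extend the levels first and then pass the new row as a list:

```r
# same starting data frame as in the example
df <- data.frame(team=factor(c('A', 'A', 'B', 'B', 'B')),
                 points=c(99, 90, 86, 88, 95))

# add 'C' to the defined levels of team
levels(df$team) <- c(levels(df$team), 'C')

# add the new row as a list so each column keeps its own type
df[nrow(df) + 1,] <- list('C', 100)

# team is still a factor and points is still numeric
str(df)
```

With this variant there is no warning, since "C" is already a defined level when the row is assigned.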

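Additional Resources

The following tutorials explain how to fix other common errors in R:

How to Fix in R: arguments imply differing number of rows
How to Fix in R: error in select unused arguments
How to Fix in R: replacement has length zero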

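About the Author

Benjamin Anderson

Hello, I'm Benjamin, a retired statistics professor turned dedicated Statorials teacher. With extensive experience and expertise in statistics, I am eager to share my knowledge and help students through Statorials.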
