How to Fix: Input contains NaN, infinity or a value too large for dtype('float64')

By Benjamin Anderson, July 16, 2023

One common error you may encounter when using Python is:

 ValueError: Input contains infinity or a value too large for dtype('float64').

This error usually occurs when you attempt to use a function from the scikit-learn module, but the DataFrame or matrix you pass as input contains NaN or infinite values.
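To make the terms in the error message concrete, here is a minimal sketch using only NumPy. The variable names (`values`, `finite_mask`, `overflowed`) are illustrative and do not come from scikit-learn. It shows that `np.isfinite` flags NaN and both infinities as non-finite, and that a result beyond the float64 range overflows to infinity.

```python
import numpy as np

# Sample values: one ordinary number, NaN, and both infinities.
values = np.array([1.0, np.nan, np.inf, -np.inf])

# np.isfinite is True only for ordinary (finite) numbers.
finite_mask = np.isfinite(values)
print(finite_mask)  # [ True False False False]

# Largest value that float64 can represent.
float64_max = np.finfo(np.float64).max

# Going beyond that limit overflows to inf; the overflow warning is
# silenced here only to keep the demonstration quiet.
with np.errstate(over="ignore"):
    overflowed = np.float64(float64_max) * 2
print(np.isinf(overflowed))  # True
```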

The following example shows how to resolve this error in practice.

How to Reproduce the Error

Suppose we have the following pandas DataFrame:

import pandas as pd
import numpy as np

#create DataFrame
df = pd.DataFrame({'x1': [1, 2, 2, 4, 2, 1, 5, 4, 2, 4, 4],
                   'x2': [1, 3, 3, 5, 2, 2, 1, np.inf, 0, 3, 4],
                   'y': [np.nan, 78, 85, 88, 72, 69, 94, 94, 88, 92, 90]})

#view DataFrame
print(df)

    x1   x2     y
0    1  1.0   NaN
1    2  3.0  78.0
2    2  3.0  85.0
3    4  5.0  88.0
4    2  2.0  72.0
5    1  2.0  69.0
6    5  1.0  94.0
7    4  inf  94.0
8    2  0.0  88.0
9    4  3.0  92.0
10   4  4.0  90.0

Now suppose we attempt to fit a multiple linear regression model using functions from scikit-learn:

from sklearn.linear_model import LinearRegression

#initiate linear regression model
model = LinearRegression()

#define predictor and response variables
x, y = df[['x1', 'x2']], df.y

#fit regression model
model.fit(x, y)

#print model intercept and coefficients
print(model.intercept_, model.coef_)

ValueError: Input contains infinity or a value too large for dtype('float64').

We receive an error because the DataFrame we are using contains both infinite and NaN values.
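Before changing anything, it can help to see exactly where the problem values are. The sketch below rebuilds the example DataFrame so that it runs on its own, then counts NaN and infinite values per column and lists the affected rows. The names `nan_counts`, `inf_counts` and `bad_rows` are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the example DataFrame in this tutorial.
df = pd.DataFrame({'x1': [1, 2, 2, 4, 2, 1, 5, 4, 2, 4, 4],
                   'x2': [1, 3, 3, 5, 2, 2, 1, np.inf, 0, 3, 4],
                   'y': [np.nan, 78, 85, 88, 72, 69, 94, 94, 88, 92, 90]})

# Number of NaN values in each column.
nan_counts = df.isna().sum()

# Number of +inf or -inf values in each column.
inf_counts = np.isinf(df).sum()

# Rows that contain at least one non-finite value (NaN or infinity).
bad_rows = df[~np.isfinite(df).all(axis=1)]

print(nan_counts)
print(inf_counts)
print(bad_rows)
```

For this data, the check should point at row 0 (NaN in y) and row 7 (inf in x2).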

How to Fix the Error

The way to resolve this error is to first remove every row of the DataFrame that contains an infinite or NaN value:

#remove rows with any values that are not finite
df_new = df[np.isfinite(df).all(axis=1)]

#view updated DataFrame
print(df_new)

    x1   x2     y
1    2  3.0  78.0
2    2  3.0  85.0
3    4  5.0  88.0
4    2  2.0  72.0
5    1  2.0  69.0
6    5  1.0  94.0
8    2  0.0  88.0
9    4  3.0  92.0
10   4  4.0  90.0

The two rows that contained infinite or NaN values have been removed.

We can now proceed to fit our linear regression model:
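An equivalent way to drop these rows, shown here as an alternative sketch rather than the method used in this tutorial, is to convert infinities to NaN and then call `dropna()`. The name `df_alt` is illustrative, and the DataFrame is rebuilt so the block runs on its own.

```python
import numpy as np
import pandas as pd

# Same data as the example DataFrame in this tutorial.
df = pd.DataFrame({'x1': [1, 2, 2, 4, 2, 1, 5, 4, 2, 4, 4],
                   'x2': [1, 3, 3, 5, 2, 2, 1, np.inf, 0, 3, 4],
                   'y': [np.nan, 78, 85, 88, 72, 69, 94, 94, 88, 92, 90]})

# Turn +inf and -inf into NaN, then drop every row that has a NaN.
df_alt = df.replace([np.inf, -np.inf], np.nan).dropna()

# Confirm that only finite values remain.
print(np.isfinite(df_alt.to_numpy()).all())
print(df_alt.index.tolist())
```

This form may be convenient when infinities and NaN values should be handled by the same missing-data step.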

from sklearn.linear_model import LinearRegression

#initiate linear regression model
model = LinearRegression()

#define predictor and response variables
x, y = df_new[['x1', 'x2']], df_new.y

#fit regression model
model.fit(x, y)

#print model intercept and coefficients
print(model.intercept_, model.coef_)

69.85144124168515 [5.72727273 -0.93791574]

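Notice that we do not receive an error this time, because we first removed the rows with infinite or NaN values from the DataFrame.

Additional Resources

The following tutorials explain how to fix other common errors in Python: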

How to Fix in Python: 'numpy.ndarray' object is not callable
How to Fix: TypeError: 'numpy.float64' object is not callable
How to Fix: TypeError: expected string or bytes-like object
