How to Calculate R-Squared in Python (With Example)

By Benjamin Anderson, July 19, 2023

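R-squared (often written R²) is the proportion of the variance in the response variable that can be explained by the predictor variables in a linear regression model.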

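The value of R-squared can range from 0 to 1, where:

0 indicates that the response variable cannot be explained by the predictor variables at all.
1 indicates that the response variable can be explained perfectly, without error, by the predictor variables.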
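
For reference, the proportion described above is commonly written as one minus the ratio of the residual sum of squares to the total sum of squares. The symbols here are introduced only for illustration and do not appear in the original text: y_i is an observed response, ŷ_i is the model's fitted value, ȳ is the mean of the observed responses, and n is the number of observations.

```latex
R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}
    = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

When the fitted values match the observations exactly, the numerator is 0 and R² is 1. When the model does no better than always predicting the mean, the ratio is 1 and R² is 0.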

The following example shows how to calculate R² for a regression model in Python.

Example: Calculate R-Squared in Python

Suppose we have the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'hours': [1, 2, 2, 4, 2, 1, 5, 4, 2, 4, 4, 3, 6],
                   'prep_exams': [1, 3, 3, 5, 2, 2, 1, 1, 0, 3, 4, 3, 2],
                   'score': [76, 78, 85, 88, 72, 69, 94, 94, 88, 92, 90, 75, 96]})

#view DataFrame
print(df)

    hours  prep_exams  score
0       1           1     76
1       2           3     78
2       2           3     85
3       4           5     88
4       2           2     72
5       1           2     69
6       5           1     94
7       4           1     94
8       2           0     88
9       4           3     92
10      4           4     90
11      3           3     75
12      6           2     96

We can use the LinearRegression() class from sklearn to fit a regression model and its score() method to calculate the R-squared value of the model:

from sklearn.linear_model import LinearRegression

#initiate linear regression model
model = LinearRegression()

#define predictor and response variables
x, y = df[["hours", "prep_exams"]], df["score"]

#fit regression model
model.fit(x, y)

#calculate R-squared of regression model
r_squared = model.score(x, y)

#view R-squared value
print(r_squared)

0.7175541714105901

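The R-squared of the model turns out to be 0.7176.

This means that 71.76% of the variation in exam scores can be explained by the number of hours studied and the number of prep exams taken.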
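
To see where this number comes from, R-squared can be recomputed by hand as one minus the residual sum of squares divided by the total sum of squares. The following is a minimal sketch rather than the article's own method: it assumes NumPy is installed, fits the same least-squares model with np.linalg.lstsq instead of sklearn, and uses illustrative variable names (X, fitted, ss_res, ss_tot, r_squared_manual) that do not appear in the original.

```python
import numpy as np

# Same data as in the DataFrame from the example
hours = np.array([1, 2, 2, 4, 2, 1, 5, 4, 2, 4, 4, 3, 6], dtype=float)
prep_exams = np.array([1, 3, 3, 5, 2, 2, 1, 1, 0, 3, 4, 3, 2], dtype=float)
score = np.array([76, 78, 85, 88, 72, 69, 94, 94, 88, 92, 90, 75, 96], dtype=float)

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones_like(hours), hours, prep_exams])

# Ordinary least squares fit (intercept plus two slopes)
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
fitted = X @ coef

# Residual and total sums of squares
ss_res = np.sum((score - fitted) ** 2)
ss_tot = np.sum((score - score.mean()) ** 2)

# R-squared = 1 - SS_res / SS_tot
r_squared_manual = 1 - ss_res / ss_tot

print(round(r_squared_manual, 4))  # 0.7176
```

The result agrees with the value returned by score() for the same data.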

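If we'd like, we can compare this R-squared value with that of another regression model that uses a different set of predictor variables.

In general, models with higher R-squared values are preferred because it means the set of predictor variables in the model explains the variation in the response variable well.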
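
As an illustration of such a comparison, the sketch below fits one model with hours as the only predictor and another with both hours and prep_exams, then reports R-squared for each. It assumes NumPy and scikit-learn are installed. Because R-squared on the training data cannot decrease when a predictor is added to a least-squares model, the sketch also computes adjusted R-squared, which penalizes extra predictors. The helper adjusted_r2 and the variable names are hypothetical, introduced here for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as in the example
hours = [1, 2, 2, 4, 2, 1, 5, 4, 2, 4, 4, 3, 6]
prep_exams = [1, 3, 3, 5, 2, 2, 1, 1, 0, 3, 4, 3, 2]
score = [76, 78, 85, 88, 72, 69, 94, 94, 88, 92, 90, 75, 96]

y = np.array(score)
X_hours = np.array(hours).reshape(-1, 1)        # one predictor
X_full = np.column_stack([hours, prep_exams])   # two predictors


def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors (hypothetical helper)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Fit each model and compute R-squared on the same data
r2_hours = LinearRegression().fit(X_hours, y).score(X_hours, y)
r2_full = LinearRegression().fit(X_full, y).score(X_full, y)

# Adjust for the number of predictors in each model
adj_hours = adjusted_r2(r2_hours, n=len(y), k=1)
adj_full = adjusted_r2(r2_full, n=len(y), k=2)

print(f"hours only:         R2 = {r2_hours:.4f}, adjusted R2 = {adj_hours:.4f}")
print(f"hours + prep_exams: R2 = {r2_full:.4f}, adjusted R2 = {adj_full:.4f}")
```

If the larger model has a higher R-squared but a lower adjusted R-squared, the added predictor may not be contributing much beyond what is expected by chance.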

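Related: What is a Good R-squared Value?

Additional Resources

The following tutorials explain how to perform other common operations in Python:

How to Perform Simple Linear Regression in Python
How to Perform Multiple Linear Regression in Python
How to Calculate AIC of Regression Models in Python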

