How to Create a Precision-Recall Curve in Python

By Benjamin Anderson, July 23, 2023

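When using classification models in machine learning, two metrics we often use to assess the quality of the model are precision and recall.

Precision: The number of correct positive predictions relative to the total number of positive predictions.

This is calculated as:

Precision = True Positives / (True Positives + False Positives)

Recall: The number of correct positive predictions relative to the total number of actual positives.

This is calculated as:

Recall = True Positives / (True Positives + False Negatives)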
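
As a quick worked example of the two formulas, the sketch below plugs in some confusion-matrix counts. The counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
# Hypothetical confusion-matrix counts, chosen only for illustration
true_positives = 40
false_positives = 10
false_negatives = 20

# Precision: share of positive predictions that were correct
precision = true_positives / (true_positives + false_positives)

# Recall: share of actual positives that the model found
recall = true_positives / (true_positives + false_negatives)

print(f"Precision: {precision:.3f}")  # 40 / 50
print(f"Recall:    {recall:.3f}")     # 40 / 60
```

With these counts the model is right about 80% of the time when it predicts the positive class, but it finds only about two thirds of the actual positives.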

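To visualize the precision and recall of a given model, we can create a precision-recall curve. This curve shows the tradeoff between precision and recall at different classification thresholds.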
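
To make the role of the threshold concrete, here is a minimal sketch that scores a handful of predictions at three thresholds. The labels, probabilities, and the helper `precision_recall_at` are all made up for illustration. Each threshold yields one (recall, precision) point, and a precision-recall curve is simply many such points joined together.

```python
# Toy labels and predicted probabilities, made up for illustration
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]

def precision_recall_at(threshold):
    """Predict positive when probability >= threshold, then score the result."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumed convention: precision is 1.0 when nothing is predicted positive
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn)
    return precision, recall

results = {t: precision_recall_at(t) for t in (0.2, 0.5, 0.85)}
for t, (p, r) in results.items():
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data, raising the threshold from 0.2 to 0.85 moves precision from about 0.57 up to 1.00 while recall falls from 1.00 to 0.25.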

[Figure: precision-recall curve in Python]

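The following step-by-step example shows how to create a precision-recall curve for a logistic regression model in Python.

Step 1: Import Packages

First, we'll import the necessary packages: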

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
import matplotlib.pyplot as plt

Step 2: Fit the Logistic Regression Model

Next, we'll create a dataset and fit a logistic regression model to it:

#create dataset with 4 predictor variables
X, y = datasets.make_classification(n_samples=1000,
                                    n_features=4,
                                    n_informative=3,
                                    n_redundant=1,
                                    random_state=0)

#split dataset into training and testing set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state=0)

#fit logistic regression model to training data
classifier = LogisticRegression()
classifier.fit(X_train, y_train)

#use logistic regression model to predict probability of the positive class
y_score = classifier.predict_proba(X_test)[:, 1]
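
The `[:, 1]` indexing above relies on the second column of `predict_proba` holding the probability of the positive class. The sketch below is one way to check that assumption rather than take it on faith. It refits the model on the full dataset purely for brevity, and the name `positive_col` is introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Same synthetic data as in the tutorial; fit on all of it for brevity
X, y = make_classification(n_samples=1000, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
classifier = LogisticRegression().fit(X, y)

# classes_ lists the labels in the same order as the predict_proba columns
print(classifier.classes_)

proba = classifier.predict_proba(X[:5])
print(proba.shape)  # one row per sample, one column per class

# Look up the column for label 1 instead of hard-coding the index
positive_col = list(classifier.classes_).index(1)
y_score = proba[:, positive_col]
print(np.round(y_score, 3))
```

For a binary problem with labels 0 and 1 the lookup should return column 1, which matches the indexing used in the tutorial.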

Step 3: Create the Precision-Recall Curve

Next, we'll calculate the precision and recall of the model and create a precision-recall curve:

#calculate precision and recall
precision, recall, thresholds = precision_recall_curve(y_test, y_score)

#create precision recall curve
fig, ax = plt.subplots()
ax.plot(recall, precision, color='purple')

#add title and axis labels to plot
ax.set_title('Precision-Recall Curve')
ax.set_ylabel('Precision')
ax.set_xlabel('Recall')

#display plot
plt.show()

[Figure: the resulting precision-recall curve]

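The x-axis shows recall and the y-axis shows precision at various thresholds.

Notice that as recall increases, precision decreases.

This illustrates the tradeoff between the two metrics: lowering the threshold to increase the model's recall generally reduces its precision, and vice versa.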
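
One practical use of this tradeoff is choosing an operating threshold. The sketch below rebuilds the same data and model, then picks the threshold with the highest precision among those that still reach a target recall. The 0.90 target and the names `target_recall`, `meets_target`, and `best` are assumptions made for illustration. It also computes average precision, a single-number summary of the curve.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import train_test_split

# Rebuild the same data and model as in the steps above
X, y = make_classification(n_samples=1000, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
classifier = LogisticRegression().fit(X_train, y_train)
y_score = classifier.predict_proba(X_test)[:, 1]

precision, recall, thresholds = precision_recall_curve(y_test, y_score)

# precision and recall have one more entry than thresholds: the final
# point (precision=1, recall=0) has no threshold, so drop it to align
target_recall = 0.90  # hypothetical requirement
meets_target = recall[:-1] >= target_recall

# Among thresholds that meet the target, take the highest precision
best = np.argmax(np.where(meets_target, precision[:-1], -1.0))
print(f"threshold={thresholds[best]:.3f}  "
      f"precision={precision[best]:.3f}  recall={recall[best]:.3f}")

# Average precision summarizes the whole curve as one number
ap = average_precision_score(y_test, y_score)
print(f"Average precision: {ap:.3f}")
```

A different requirement, such as a minimum precision, would only need the mask and the quantity being maximized swapped around.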

Additional Resources

How to Perform Logistic Regression in Python
How to Create a Confusion Matrix in Python
How to Interpret a ROC Curve (With Examples)
