Java class name: com.alibaba.alink.operator.stream.evaluation.EvalMultiClassStreamOp
Python class name: EvalMultiClassStreamOp
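Evaluates the prediction results of a multiclass classification algorithm.
In multiclass evaluation, each sample has a true label and a prediction generated by the model.
Unlike binary classification, a multiclass problem has more than two classes, so the classes cannot simply be called positive and negative.
When computing evaluation metrics, one class can be chosen as the positive class and all other values treated as negative, which yields per-class metrics.
Averaging the per-class metrics then gives overall metrics for the model.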
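There are three ways to take this average:
- Macro average: the arithmetic mean of the same metric over all classes.
- Weighted average: the mean of the same metric over all classes, weighted by each class's proportion of the samples.
- Micro average: sum TP, TN, FP and FN over all classes, then compute the metric from the totals.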
The supported per-class and averaged metrics are listed below:
$Precision = \frac{TP}{TP + FP}$
$Recall = \frac{TP}{TP + FN} = Sensitivity$
$F1=\frac{2TP}{2TP+FP+FN}=\frac{2\cdot Precision \cdot Recall}{Precision+Recall}$
$Accuracy=\frac{TP + TN}{TP + TN + FP + FN}$
$Specificity=\frac{TN}{FP+TN}$
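The formulas above can be made concrete with a small sketch in plain Python. It counts TP, FP, FN and TN for each class in a one-vs-rest fashion and then forms the Macro, Weighted and Micro variants that appear in the component's output. The function names (`per_class_counts`, `averages`, `safe_div`) are hypothetical helpers, not part of Alink, and the averaging follows the standard definitions, which this sketch assumes Alink also uses.

```python
def per_class_counts(y_true, y_pred, labels):
    """One-vs-rest TP/FP/FN/TN for every label."""
    n = len(y_true)
    counts = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        counts[c] = {"TP": tp, "FP": fp, "FN": fn, "TN": n - tp - fp - fn}
    return counts


def safe_div(a, b):
    # Returning 0.0 for an empty denominator is a choice made for this sketch.
    return a / b if b else 0.0


def precision(c):
    return safe_div(c["TP"], c["TP"] + c["FP"])


def recall(c):
    return safe_div(c["TP"], c["TP"] + c["FN"])


def averages(counts, metric):
    """Return (per-class values, macro, weighted, micro) for one metric."""
    labels = list(counts)
    per_class = {c: metric(counts[c]) for c in labels}
    # Macro: plain mean of the per-class values.
    macro = sum(per_class.values()) / len(labels)
    # Weighted: mean weighted by the number of true samples of each class.
    support = {c: counts[c]["TP"] + counts[c]["FN"] for c in labels}
    weighted = safe_div(sum(per_class[c] * support[c] for c in labels),
                        sum(support.values()))
    # Micro: pool the counts over all classes, then apply the formula once.
    pooled = {k: sum(counts[c][k] for c in labels)
              for k in ("TP", "FP", "FN", "TN")}
    micro = metric(pooled)
    return per_class, macro, weighted, micro


y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "c", "c"]
counts = per_class_counts(y_true, y_pred, ["a", "b", "c"])
recall_per_class, macro_recall, weighted_recall, micro_recall = averages(counts, recall)
print(recall_per_class, macro_recall, weighted_recall, micro_recall)
```

In this toy data the weighted and micro recall coincide (both equal the overall accuracy), while the macro recall differs because every class counts equally regardless of its size.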
$p_a =\frac{TP + TN}{TP + TN + FP + FN}$
$p_e = \frac{(TN + FP) \cdot (TN + FN) + (FN + TP) \cdot (FP + TP)}{(TP + TN + FP + FN)^2}$
$kappa = \frac{p_a - p_e}{1 - p_e}$
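As a worked example of the kappa formulas, the following sketch computes $p_a$, $p_e$ and kappa from the four one-vs-rest counts. The function name `cohen_kappa` is a hypothetical helper, and returning NaN when $p_e = 1$ is a choice made for this sketch, not a statement about how Alink handles that case.

```python
def cohen_kappa(tp, tn, fp, fn):
    """Kappa from one-vs-rest counts, following the p_a / p_e formulas."""
    n = tp + tn + fp + fn
    # Observed agreement: fraction of samples where prediction equals label.
    p_a = (tp + tn) / n
    # Expected agreement by chance, from the row and column totals.
    p_e = ((tn + fp) * (tn + fn) + (fn + tp) * (fp + tp)) / (n * n)
    if p_e == 1.0:
        # Kappa is undefined here; NaN is this sketch's convention.
        return float("nan")
    return (p_a - p_e) / (1 - p_e)


# 50 samples: p_a = 35/50 = 0.7, p_e = (20*25 + 30*25)/2500 = 0.5
kappa = cohen_kappa(tp=20, tn=15, fp=5, fn=10)
print(kappa)
```

With these counts kappa is (0.7 - 0.5) / (1 - 0.5) = 0.4, so the agreement is better than chance but well below perfect.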
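Besides the predicted label for each sample $i$, a multiclass model usually also outputs the probability $p_{i,j}$ that sample $i$ belongs to each class $j$.
Normally, the class with the highest probability is taken as the predicted label of that sample.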
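$LogLoss=- \frac{1}{n}\sum_{i} \sum_{j=1}^M y_{i,j}\log(p_{i,j})$
Here $n$ is the number of samples, $M$ is the number of classes, and $y_{i,j}$ is 1 if the true label of sample $i$ is class $j$ and 0 otherwise.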
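The sketch below illustrates both ideas on the five sample rows used in this page's example: it parses the JSON probability string of each row, takes the most probable class as the predicted label, and evaluates the LogLoss formula. Because the true label is one-hot, only the probability of the true class contributes for each sample. The helper names and the clipping constant `eps` are assumptions of this sketch, not Alink internals.

```python
import json
import math


def predicted_label(detail_json):
    """Pick the class with the highest probability."""
    probs = json.loads(detail_json)
    return max(probs, key=probs.get)


def log_loss(labels, details, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, detail in zip(labels, details):
        probs = json.loads(detail)
        # Clip to avoid log(0); eps is a hypothetical choice for this sketch.
        p = min(max(probs.get(label, 0.0), eps), 1.0)
        total += math.log(p)
    return -total / len(labels)


labels = ["prefix1", "prefix1", "prefix1", "prefix0", "prefix0"]
details = [
    '{"prefix1": 0.9, "prefix0": 0.1}',
    '{"prefix1": 0.8, "prefix0": 0.2}',
    '{"prefix1": 0.7, "prefix0": 0.3}',
    '{"prefix1": 0.75, "prefix0": 0.25}',
    '{"prefix1": 0.6, "prefix0": 0.4}',
]

predictions = [predicted_label(d) for d in details]
loss = log_loss(labels, details)
print(predictions, loss)
```

Over all five rows the loss is about 0.5976. A streaming evaluation reports values per window and cumulatively, so the figures it prints depend on which samples have been seen so far.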
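This component is usually connected to the output of a multiclass prediction algorithm.
To use it, specify the label column (the true labels) with the parameter labelCol, and the prediction result column and the prediction detail column (which contains the predicted probabilities) with the parameters predictionCol and predictionDetailCol.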
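Name | Display Name | Description | Type | Required? | Value Range | Default Value |
---|---|---|---|---|---|---|
labelCol | Label column name | Name of the label column in the input table | String | ✓ | | |
predictionCol | Prediction result column name | Name of the prediction result column | String | | | |
predictionDetailCol | Prediction detail column name | Name of the prediction detail column | String | | | |
timeInterval | Time interval | Time interval for statistics over streaming data | Double | | | 3.0 |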
```python
from pyalink.alink import *

import pandas as pd

useLocalEnv(1)

# Each row: true label, JSON string with the predicted probability per class
df = pd.DataFrame([
    ["prefix1", "{\"prefix1\": 0.9, \"prefix0\": 0.1}"],
    ["prefix1", "{\"prefix1\": 0.8, \"prefix0\": 0.2}"],
    ["prefix1", "{\"prefix1\": 0.7, \"prefix0\": 0.3}"],
    ["prefix0", "{\"prefix1\": 0.75, \"prefix0\": 0.25}"],
    ["prefix0", "{\"prefix1\": 0.6, \"prefix0\": 0.4}"]
])

inOp = StreamOperator.fromDataframe(df, schemaStr='label string, detailInput string')

# Evaluate on the stream and print the metrics for each time interval
EvalMultiClassStreamOp().setLabelCol("label").setPredictionDetailCol("detailInput").setTimeInterval(0.001).linkFrom(inOp).print()
StreamOperator.execute()
```
```java
import org.apache.flink.types.Row;

import com.alibaba.alink.operator.stream.StreamOperator;
import com.alibaba.alink.operator.stream.evaluation.EvalMultiClassStreamOp;
import com.alibaba.alink.operator.stream.source.MemSourceStreamOp;
import org.junit.Test;

import java.util.Arrays;
import java.util.List;

public class EvalMultiClassStreamOpTest {
	@Test
	public void testEvalMultiClassStreamOp() throws Exception {
		// Each row: true label, JSON string with the predicted probability per class
		List <Row> df = Arrays.asList(
			Row.of("prefix1", "{\"prefix1\": 0.9, \"prefix0\": 0.1}"),
			Row.of("prefix1", "{\"prefix1\": 0.8, \"prefix0\": 0.2}"),
			Row.of("prefix1", "{\"prefix1\": 0.7, \"prefix0\": 0.3}"),
			Row.of("prefix0", "{\"prefix1\": 0.75, \"prefix0\": 0.25}"),
			Row.of("prefix0", "{\"prefix1\": 0.6, \"prefix0\": 0.4}")
		);
		StreamOperator <?> inOp = new MemSourceStreamOp(df, "label string, detailInput string");
		// Evaluate on the stream and print the metrics for each time interval
		new EvalMultiClassStreamOp()
			.setLabelCol("label")
			.setPredictionDetailCol("detailInput")
			.setTimeInterval(0.001)
			.linkFrom(inOp)
			.print();
		StreamOperator.execute();
	}
}
```
Statistics | Data |
---|---|
all | {"SensitivityArray":"[1.0,0.0,1.0,0.5,1.0]","ConfusionMatrix":"[[1,0],[0,0]]","MacroRecall":"0.5","MacroSpecificity":"0.5","FalsePositiveRateArray":"[0.0,0.0,0.0,0.0,0.0]","TruePositiveRateArray":"[1.0,0.0,1.0,0.5,1.0]","MacroAccuracy":"1.0","RecallArray":"[1.0,0.0,1.0,0.5,1.0]","KappaArray":"[1.0,1.0,1.0,1.0,1.0]","MicroFalseNegativeRate":"0.0","WeightedRecall":"1.0","WeightedPrecision":"1.0","MacroPrecision":"1.0","ActualLabelFrequency":"[1,0]","PrecisionArray":"[1.0,1.0,1.0,1.0,1.0]","MicroTruePositiveRate":"1.0","MacroKappa":"1.0","MicroSpecificity":"1.0","PredictLabelProportion":"[1.0,0.0]","F1Array":"[1.0,0.0,1.0,0.5,1.0]","MacroFalsePositiveRate":"0.0","FalseNegativeRateArray":"[0.0,0.0,0.0,0.0,0.0]","TrueNegativeRateArray":"[0.0,1.0,0.0,0.5,1.0]","MacroF1":"0.5","LabelArray":"[\"prefix1\",\"prefix0\"]","WeightedTruePositiveRate":"1.0","WeightedKappa":"1.0","TotalSamples":"1","MicroTrueNegativeRate":"1.0","MacroTruePositiveRate":"0.5","MicroSensitivity":"1.0","WeightedAccuracy":"1.0","AccuracyArray":"[1.0,1.0,1.0,1.0,1.0]","Accuracy":"1.0","WeightedFalseNegativeRate":"0.0","MicroF1":"1.0","WeightedSpecificity":"0.0","WeightedF1":"1.0","MicroAccuracy":"1.0","WeightedTrueNegativeRate":"0.0","Kappa":"1.0","MacroSensitivity":"0.5","SpecificityArray":"[0.0,1.0,0.0,0.5,1.0]","ActualLabelProportion":"[1.0,0.0]","PredictLabelFrequency":"[1,0]","WeightedSensitivity":"1.0","MicroRecall":"1.0","MacroFalseNegativeRate":"0.0","LogLoss":"0.35667494393873245","MicroFalsePositiveRate":"0.0","WeightedFalsePositiveRate":"0.0","MacroTrueNegativeRate":"0.5","MicroPrecision":"1.0","MicroKappa":"1.0"} |
window | {"SensitivityArray":"[1.0,0.0,1.0,0.5,1.0]","ConfusionMatrix":"[[1,0],[0,0]]","MacroRecall":"0.5","MacroSpecificity":"0.5","FalsePositiveRateArray":"[0.0,0.0,0.0,0.0,0.0]","TruePositiveRateArray":"[1.0,0.0,1.0,0.5,1.0]","MacroAccuracy":"1.0","RecallArray":"[1.0,0.0,1.0,0.5,1.0]","KappaArray":"[1.0,1.0,1.0,1.0,1.0]","MicroFalseNegativeRate":"0.0","WeightedRecall":"1.0","WeightedPrecision":"1.0","MacroPrecision":"1.0","ActualLabelFrequency":"[1,0]","PrecisionArray":"[1.0,1.0,1.0,1.0,1.0]","MicroTruePositiveRate":"1.0","MacroKappa":"1.0","MicroSpecificity":"1.0","PredictLabelProportion":"[1.0,0.0]","F1Array":"[1.0,0.0,1.0,0.5,1.0]","MacroFalsePositiveRate":"0.0","FalseNegativeRateArray":"[0.0,0.0,0.0,0.0,0.0]","TrueNegativeRateArray":"[0.0,1.0,0.0,0.5,1.0]","MacroF1":"0.5","LabelArray":"[\"prefix1\",\"prefix0\"]","WeightedTruePositiveRate":"1.0","WeightedKappa":"1.0","TotalSamples":"1","MicroTrueNegativeRate":"1.0","MacroTruePositiveRate":"0.5","MicroSensitivity":"1.0","WeightedAccuracy":"1.0","AccuracyArray":"[1.0,1.0,1.0,1.0,1.0]","Accuracy":"1.0","WeightedFalseNegativeRate":"0.0","MicroF1":"1.0","WeightedSpecificity":"0.0","WeightedF1":"1.0","MicroAccuracy":"1.0","WeightedTrueNegativeRate":"0.0","Kappa":"1.0","MacroSensitivity":"0.5","SpecificityArray":"[0.0,1.0,0.0,0.5,1.0]","ActualLabelProportion":"[1.0,0.0]","PredictLabelFrequency":"[1,0]","WeightedSensitivity":"1.0","MicroRecall":"1.0","MacroFalseNegativeRate":"0.0","LogLoss":"0.35667494393873245","MicroFalsePositiveRate":"0.0","WeightedFalsePositiveRate":"0.0","MacroTrueNegativeRate":"0.5","MicroPrecision":"1.0","MicroKappa":"1.0"} |