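This section discusses how to use a model stream in streaming prediction and in embedded prediction with LocalPredictor. It can be done in the following three ways: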
// Approach 1: a streaming prediction operator that starts from the initial model
// and picks up updated models from the model stream directory.
BatchOperator<?> initModel = new AkSourceBatchOp()
    .setFilePath(DATA_DIR + INIT_NUMERIC_LR_MODEL_FILE);

StreamOperator<?> predResult = new CsvSourceStreamOp()
    .setFilePath("http://alink-release.oss-cn-beijing.aliyuncs.com/data-files/avazu-ctr-train-8M.csv")
    .setSchemaStr(SCHEMA_STRING)
    .setIgnoreFirstLine(true)
    .link(
        new LogisticRegressionPredictStreamOp(initModel)
            .setPredictionCol(PREDICTION_COL_NAME)
            .setReservedCols(new String[] {LABEL_COL_NAME})
            .setPredictionDetailCol(PRED_DETAIL_COL_NAME)
            // directory that the model stream is read from
            .setModelStreamFilePath(DATA_DIR + FTRL_MODEL_STREAM_DIR)
    );

// Print a small sample of the prediction results.
predResult
    .sample(0.0001)
    .select("'Pred Sample' AS out_type, *")
    .print();

// Evaluate the predictions and extract selected metrics from the JSON result.
predResult
    .link(
        new EvalBinaryClassStreamOp()
            .setLabelCol(LABEL_COL_NAME)
            .setPredictionDetailCol(PRED_DETAIL_COL_NAME)
            .setTimeInterval(10)
    )
    .link(
        new JsonValueStreamOp()
            .setSelectedCol("Data")
            .setReservedCols(new String[] {"Statistics"})
            .setOutputCols(new String[] {"Accuracy", "AUC", "ConfusionMatrix"})
            .setJsonPath(new String[] {"$.Accuracy", "$.AUC", "$.ConfusionMatrix"})
    )
    .select("'Eval Metric' AS out_type, *")
    .print();

StreamOperator.execute();

// Approach 2: a PipelineModel whose logistic regression stage is given
// the model stream directory; the pipeline model is also saved to a file.
BatchOperator<?> initModel = new AkSourceBatchOp()
    .setFilePath(DATA_DIR + INIT_NUMERIC_LR_MODEL_FILE);

PipelineModel pipelineModel = new PipelineModel(
    new LogisticRegressionModel()
        .setModelData(initModel)
        .setPredictionCol(PREDICTION_COL_NAME)
        .setReservedCols(new String[] {LABEL_COL_NAME})
        .setPredictionDetailCol(PRED_DETAIL_COL_NAME)
        // directory that the model stream is read from
        .setModelStreamFilePath(DATA_DIR + FTRL_MODEL_STREAM_DIR)
);

// Save the pipeline model (overwriting any existing file).
pipelineModel.save(DATA_DIR + LR_PIPELINEMODEL_FILE, true);
BatchOperator.execute();

StreamOperator<?> predResult = pipelineModel
    .transform(
        new CsvSourceStreamOp()
            .setFilePath(
                "http://alink-release.oss-cn-beijing.aliyuncs.com/data-files/avazu-ctr-train-8M.csv")
            .setSchemaStr(SCHEMA_STRING)
            .setIgnoreFirstLine(true)
    );

// Print a small sample of the prediction results.
predResult
    .sample(0.0001)
    .select("'Pred Sample' AS out_type, *")
    .print();

// Evaluate the predictions and extract selected metrics from the JSON result.
predResult
    .link(
        new EvalBinaryClassStreamOp()
            .setLabelCol(LABEL_COL_NAME)
            .setPredictionDetailCol(PRED_DETAIL_COL_NAME)
            .setTimeInterval(10)
    )
    .link(
        new JsonValueStreamOp()
            .setSelectedCol("Data")
            .setReservedCols(new String[] {"Statistics"})
            .setOutputCols(new String[] {"Accuracy", "AUC", "ConfusionMatrix"})
            .setJsonPath(new String[] {"$.Accuracy", "$.AUC", "$.ConfusionMatrix"})
    )
    .select("'Eval Metric' AS out_type, *")
    .print();

StreamOperator.execute();

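// Approach 3: embedded prediction with LocalPredictor, loading the
// pipeline model file written by pipelineModel.save(...).
// A single fixed input row, laid out according to SCHEMA_STRING.
Object[] input = new Object[] {
    "10000949271186029916", "1", "14102100", "1005", 0, "1fbe01fe", "f3845767", "28905ebd",
    "ecad2386", "7801e8d9", "07d7df22", "a99f214a", "37e8da74", "5db079b5", "1", "2",
    15707, 320, 50, 1722, 0, 35, -1, 79};

LocalPredictor localPredictor
    = new LocalPredictor(DATA_DIR + LR_PIPELINEMODEL_FILE, SCHEMA_STRING);

// Predict the same row repeatedly, pausing 2 seconds between predictions.
for (int i = 1; i <= 100; i++) {
    System.out.print(i + "\t");
    System.out.println(ArrayUtils.toString(localPredictor.predict(input)));
    Thread.sleep(2000);
}

localPredictor.close();
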
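The LocalPredictor code corresponds to the method Chap29Pred.c_3_3(). Run the method Chap29.c_2() at the same time as this method; this method then prints the output shown below. Because it predicts the same row every time, the prediction stays the same as long as the model does not change, so a change in the prediction means that the model has been updated. In the output below the prediction changes several times, which shows that the model stream is taking effect.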
0 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
1 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
2 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
3 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
4 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
5 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
6 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
7 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
8 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
9 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
10 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
11 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
12 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
13 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
14 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
15 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
16 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
17 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
18 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
19 1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
20 1,0,{"0":"0.8059059300425273","1":"0.1940940699574727"}
21 1,0,{"0":"0.8059059300425273","1":"0.1940940699574727"}
22 1,0,{"0":"0.8059059300425273","1":"0.1940940699574727"}
23 1,0,{"0":"0.8059059300425273","1":"0.1940940699574727"}
24 1,0,{"0":"0.8059059300425273","1":"0.1940940699574727"}
25 1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
26 1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
27 1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
28 1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
29 1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
30 1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
31 1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
32 1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
33 1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
34 1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
35 1,0,{"0":"0.8068633636443617","1":"0.19313663635563827"}
36 1,0,{"0":"0.8068633636443617","1":"0.19313663635563827"}
37 1,0,{"0":"0.8068633636443617","1":"0.19313663635563827"}
38 1,0,{"0":"0.8068633636443617","1":"0.19313663635563827"}
39 1,0,{"0":"0.8068633636443617","1":"0.19313663635563827"}
40 1,0,{"0":"0.8063115847658164","1":"0.19368841523418356"}
41 1,0,{"0":"0.8063115847658164","1":"0.19368841523418356"}
42 1,0,{"0":"0.8063115847658164","1":"0.19368841523418356"}
......
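
The reasoning above, that a change in the prediction for a fixed input signals a model update, can be checked mechanically. The following is only a minimal sketch in plain Java and is not part of Alink: the class name ModelChangeDetector and the method changePoints are hypothetical names introduced for illustration. The probability values and run lengths are copied from the rows printed above (rows 0 to 42).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an Alink class: finds the rows at which the
// printed prediction differs from the previous row.
public class ModelChangeDetector {

    /**
     * Returns the positions at which an output differs from the one before it.
     * Position 0 is never reported because it has no predecessor.
     */
    public static List<Integer> changePoints(List<String> outputs) {
        List<Integer> changes = new ArrayList<>();
        for (int i = 1; i < outputs.size(); i++) {
            if (!outputs.get(i).equals(outputs.get(i - 1))) {
                changes.add(i);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Probability of class "0" for each distinct run in the printed output.
        String[] values = {
            "0.8059634797835652", "0.8059059300425273", "0.8059057632191885",
            "0.8068633636443617", "0.8063115847658164"};
        // Number of consecutive rows showing each value (rows 0-19, 20-24, 25-34, 35-39, 40-42).
        int[] runLengths = {20, 5, 10, 5, 3};

        // Rebuild the sequence of outputs row by row.
        List<String> outputs = new ArrayList<>();
        for (int k = 0; k < values.length; k++) {
            for (int j = 0; j < runLengths[k]; j++) {
                outputs.add(values[k]);
            }
        }

        // Prints [20, 25, 35, 40]
        System.out.println(changePoints(outputs));
    }
}
```

For the rows shown, the prediction changes at rows 20, 25, 35 and 40. Since the loop sleeps 2 seconds between predictions, a gap of five rows between changes corresponds to roughly 10 seconds.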