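This section shows how to use a model stream in two scenarios: streaming prediction and embedded prediction with LocalPredictor. There are three ways to do this: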
// Stream prediction operator: start from the initial model and pick up
// newer models from the model stream directory.
BatchOperator initModel = new AkSourceBatchOp()
    .setFilePath(DATA_DIR + INIT_NUMERIC_LR_MODEL_FILE);

StreamOperator <?> predResult = new CsvSourceStreamOp()
    .setFilePath("http://alink-release.oss-cn-beijing.aliyuncs.com/data-files/avazu-ctr-train-8M.csv")
    .setSchemaStr(SCHEMA_STRING)
    .setIgnoreFirstLine(true)
    .link(
        new LogisticRegressionPredictStreamOp(initModel)
            .setPredictionCol(PREDICTION_COL_NAME)
            .setReservedCols(new String[] {LABEL_COL_NAME})
            .setPredictionDetailCol(PRED_DETAIL_COL_NAME)
            .setModelStreamFilePath(DATA_DIR + FTRL_MODEL_STREAM_DIR)
    );

// Print a small sample of the prediction results.
predResult
    .sample(0.0001)
    .select("'Pred Sample' AS out_type, *")
    .print();

// Evaluate the predictions in 10-second windows and print selected metrics.
predResult
    .link(
        new EvalBinaryClassStreamOp()
            .setLabelCol(LABEL_COL_NAME)
            .setPredictionDetailCol(PRED_DETAIL_COL_NAME)
            .setTimeInterval(10)
    )
    .link(
        new JsonValueStreamOp()
            .setSelectedCol("Data")
            .setReservedCols(new String[] {"Statistics"})
            .setOutputCols(new String[] {"Accuracy", "AUC", "ConfusionMatrix"})
            .setJsonPath(new String[] {"$.Accuracy", "$.AUC", "$.ConfusionMatrix"})
    )
    .select("'Eval Metric' AS out_type, *")
    .print();

StreamOperator.execute();
// PipelineModel: set the model stream directory on the
// LogisticRegressionModel stage of the pipeline.
BatchOperator initModel = new AkSourceBatchOp()
    .setFilePath(DATA_DIR + INIT_NUMERIC_LR_MODEL_FILE);

PipelineModel pipelineModel = new PipelineModel(
    new LogisticRegressionModel()
        .setModelData(initModel)
        .setPredictionCol(PREDICTION_COL_NAME)
        .setReservedCols(new String[] {LABEL_COL_NAME})
        .setPredictionDetailCol(PRED_DETAIL_COL_NAME)
        .setModelStreamFilePath(DATA_DIR + FTRL_MODEL_STREAM_DIR)
);

// Save the pipeline model, overwriting any existing file.
pipelineModel.save(DATA_DIR + LR_PIPELINEMODEL_FILE, true);
BatchOperator.execute();

StreamOperator <?> predResult = pipelineModel
    .transform(
        new CsvSourceStreamOp()
            .setFilePath(
                "http://alink-release.oss-cn-beijing.aliyuncs.com/data-files/avazu-ctr-train-8M.csv")
            .setSchemaStr(SCHEMA_STRING)
            .setIgnoreFirstLine(true)
    );

// Print a small sample of the prediction results.
predResult
    .sample(0.0001)
    .select("'Pred Sample' AS out_type, *")
    .print();

// Evaluate the predictions in 10-second windows and print selected metrics.
predResult
    .link(
        new EvalBinaryClassStreamOp()
            .setLabelCol(LABEL_COL_NAME)
            .setPredictionDetailCol(PRED_DETAIL_COL_NAME)
            .setTimeInterval(10)
    )
    .link(
        new JsonValueStreamOp()
            .setSelectedCol("Data")
            .setReservedCols(new String[] {"Statistics"})
            .setOutputCols(new String[] {"Accuracy", "AUC", "ConfusionMatrix"})
            .setJsonPath(new String[] {"$.Accuracy", "$.AUC", "$.ConfusionMatrix"})
    )
    .select("'Eval Metric' AS out_type, *")
    .print();

StreamOperator.execute();
// LocalPredictor: load the saved pipeline model and predict one fixed record repeatedly.
Object[] input = new Object[] {
    "10000949271186029916", "1", "14102100", "1005", 0, "1fbe01fe", "f3845767", "28905ebd",
    "ecad2386", "7801e8d9", "07d7df22", "a99f214a", "37e8da74", "5db079b5", "1", "2",
    15707, 320, 50, 1722, 0, 35, -1, 79};

LocalPredictor localPredictor
    = new LocalPredictor(DATA_DIR + LR_PIPELINEMODEL_FILE, SCHEMA_STRING);

// Predict the same record 100 times, pausing 2 seconds between predictions.
for (int i = 1; i <= 100; i++) {
    System.out.print(i + "\t");
    System.out.println(ArrayUtils.toString(localPredictor.predict(input)));
    Thread.sleep(2000);
}

localPredictor.close();
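This code corresponds to the method Chap29Pred.c_3_3(). Run it while Chap29.c_2() is also running; it then prints the output shown below. Because this method predicts the same record every time, the prediction stays the same as long as the model does not change, so a change in the prediction means that the model has been updated. In the output below the prediction changes several times, which shows that the model stream is taking effect.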
0	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
1	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
2	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
3	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
4	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
5	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
6	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
7	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
8	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
9	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
10	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
11	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
12	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
13	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
14	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
15	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
16	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
17	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
18	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
19	1,0,{"0":"0.8059634797835652","1":"0.1940365202164348"}
20	1,0,{"0":"0.8059059300425273","1":"0.1940940699574727"}
21	1,0,{"0":"0.8059059300425273","1":"0.1940940699574727"}
22	1,0,{"0":"0.8059059300425273","1":"0.1940940699574727"}
23	1,0,{"0":"0.8059059300425273","1":"0.1940940699574727"}
24	1,0,{"0":"0.8059059300425273","1":"0.1940940699574727"}
25	1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
26	1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
27	1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
28	1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
29	1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
30	1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
31	1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
32	1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
33	1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
34	1,0,{"0":"0.8059057632191885","1":"0.1940942367808115"}
35	1,0,{"0":"0.8068633636443617","1":"0.19313663635563827"}
36	1,0,{"0":"0.8068633636443617","1":"0.19313663635563827"}
37	1,0,{"0":"0.8068633636443617","1":"0.19313663635563827"}
38	1,0,{"0":"0.8068633636443617","1":"0.19313663635563827"}
39	1,0,{"0":"0.8068633636443617","1":"0.19313663635563827"}
40	1,0,{"0":"0.8063115847658164","1":"0.19368841523418356"}
41	1,0,{"0":"0.8063115847658164","1":"0.19368841523418356"}
42	1,0,{"0":"0.8063115847658164","1":"0.19368841523418356"}
......
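
The reasoning above, that a changed prediction for a fixed input signals a model update, can be checked mechanically instead of by eye. The following is only a minimal sketch in plain Java: the class ModelUpdateDetector and its method changePoints are hypothetical helpers for illustration and are not part of Alink. It assumes the prediction outputs have been collected as strings, one per iteration, and uses a shortened list of label-"0" probabilities taken from the output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Alink): finds the iterations at which the
// prediction for a fixed input differs from the previous iteration.
final class ModelUpdateDetector {

    // Returns every index i >= 1 where outputs.get(i) differs from outputs.get(i - 1).
    static List<Integer> changePoints(List<String> outputs) {
        List<Integer> changes = new ArrayList<>();
        for (int i = 1; i < outputs.size(); i++) {
            if (!outputs.get(i).equals(outputs.get(i - 1))) {
                changes.add(i);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Shortened sample: label-"0" probabilities copied from the output above,
        // with fewer repetitions of each value than in the real run.
        List<String> outputs = Arrays.asList(
            "0.8059634797835652", "0.8059634797835652",
            "0.8059059300425273", "0.8059059300425273",
            "0.8059057632191885",
            "0.8068633636443617");
        System.out.println(changePoints(outputs)); // prints [2, 4, 5]
    }
}
```

Applied to the 43 rows printed above (indexed from 0), the same comparison would report rows 20, 25, 35 and 40, that is, four model updates observed during that stretch of the run.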