AutoGarch (AutoGarchBatchOp)

Java class name: com.alibaba.alink.operator.batch.timeseries.AutoGarchBatchOp

Python class name: AutoGarchBatchOp

Overview

Given grouped data, forecast the time series of each group with AutoGarch.

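Algorithm

GARCH stands for Generalized AutoRegressive Conditional Heteroskedasticity, a model in which the variance of the current error term depends on past squared errors and past variances.

For a detailed introduction to GARCH, see https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity#GARCH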
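As a rough illustration of the model itself, the sketch below implements the textbook GARCH(1,1) variance recursion, sigma2[t] = omega + alpha * eps[t-1]^2 + beta * sigma2[t-1], and the corresponding multi-step variance forecast. It is plain Python and does not call Alink; the function names, the residual values, and the parameters omega, alpha, and beta are all made up for the example, and Alink's internal implementation may differ.

```python
def garch11_variances(residuals, omega, alpha, beta):
    """Return conditional variances under GARCH(1,1).

    The result has len(residuals) + 1 entries; the last entry is the
    one-step-ahead variance forecast.
    """
    # Assumption for this sketch: start from the unconditional variance,
    # which exists when alpha + beta < 1.
    sigma2 = [omega / (1.0 - alpha - beta)]
    for eps in residuals:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2


def forecast_variance(next_sigma2, omega, alpha, beta, steps):
    """Return `steps` variance forecasts, starting from the one-step-ahead value.

    Future squared shocks are replaced by their expectation, so each
    further step applies sigma2 -> omega + (alpha + beta) * sigma2.
    """
    forecasts = [next_sigma2]
    for _ in range(steps - 1):
        forecasts.append(omega + (alpha + beta) * forecasts[-1])
    return forecasts


residuals = [0.5, -1.2, 0.3, 2.0, -0.7]  # illustrative values only
omega, alpha, beta = 0.1, 0.2, 0.7       # illustrative parameters only

sigma2 = garch11_variances(residuals, omega, alpha, beta)
forecasts = forecast_variance(sigma2[-1], omega, alpha, beta, steps=3)
print(forecasts)
```

With alpha + beta < 1, the forecasts move geometrically toward the unconditional variance omega / (1 - alpha - beta), which is 1.0 for the values used here.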

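AutoGarch only requires maxOrder; the model orders p and q do not need to be specified. For each group, it computes the optimal parameters and outputs the prediction results.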
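The idea of automatic order selection can be sketched as a search over candidate (p, q) pairs bounded by maxOrder, keeping the pair with the lowest criterion value. The code below is only a sketch under assumptions: the function `select_order`, the search range starting at 1, and the exhaustive grid are choices made for illustration, and `toy_score` is a hypothetical stand-in for fitting a GARCH(p, q) model and evaluating its criterion. It does not describe how Alink actually searches.

```python
from itertools import product


def select_order(max_order, score):
    """Return the (p, q) pair with the lowest score, with 1 <= p, q <= max_order."""
    candidates = product(range(1, max_order + 1), repeat=2)
    return min(candidates, key=lambda pq: score(*pq))


def toy_score(p, q):
    """Hypothetical criterion value; a real search would fit GARCH(p, q) here."""
    return (p - 2) ** 2 + (q - 1) ** 2 + 0.1 * (p + q)


best = select_order(max_order=4, score=toy_score)
print(best)  # prints (2, 1)
```

Passing the scoring function as an argument keeps the search independent of the criterion, so the same loop would work for any of the icType choices.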

Usage

Reference documentation: https://www.yuque.com/pinshu/alink_guide/xbp5ky

Parameters

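Name	Label	Description	Type	Required?	Allowed values	Default
predictionCol	Prediction column	Name of the prediction result column	String	✓
valueCol	Value column	Value column; its type is MTable	String	✓	Selected column type must be one of [M_TABLE, STRING]
icType	Evaluation criterion	Information criterion used for evaluation	String		"AIC", "BIC", "HQIC"	"AIC"
ifGARCH11	Use GARCH(1,1)	Whether to use GARCH(1,1)	Boolean			true
maxOrder	Maximum order	Upper bound on the model orders (p, q)	Integer			10
minusMean	Subtract mean	Whether to subtract the mean	Boolean			true
predictNum	Number of predictions	Number of values to predict	Integer			1
predictionDetailCol	Prediction detail column	Name of the prediction detail column	String
reservedCols	Reserved columns	Columns retained by the algorithm	String[]			null
numThreads	Number of threads	Number of threads used by the operator	Integer			1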
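The three icType values name standard information criteria. As a reference point, the sketch below computes them from a model's log-likelihood using the usual textbook definitions: AIC = 2k - 2 ln L, BIC = k ln n - 2 ln L, and HQIC = 2k ln(ln n) - 2 ln L, where k is the number of parameters and n the number of samples. The function name and the example numbers are assumptions for illustration; the exact formulas used inside Alink are not stated in this document.

```python
import math


def information_criterion(ic_type, log_likelihood, num_params, num_samples):
    """Compute AIC, BIC, or HQIC from the textbook definitions (lower is better)."""
    k, n = num_params, num_samples
    if ic_type == "AIC":
        penalty = 2 * k
    elif ic_type == "BIC":
        penalty = k * math.log(n)
    elif ic_type == "HQIC":
        penalty = 2 * k * math.log(math.log(n))
    else:
        raise ValueError("unknown ic_type: " + ic_type)
    return penalty - 2 * log_likelihood


# Illustrative numbers only: a model with 3 parameters fitted on 100 samples.
for name in ("AIC", "BIC", "HQIC"):
    print(name, information_criterion(name, log_likelihood=-50.0,
                                      num_params=3, num_samples=100))
```

The criteria differ only in how strongly they penalize extra parameters; for n = 100 the penalty per parameter is smallest for AIC and largest for BIC.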

Code examples

Python code

from pyalink.alink import *

import datetime
import pandas as pd

useLocalEnv(1)

data = pd.DataFrame([
    [1, datetime.datetime.fromtimestamp(1), 10.0],
    [1, datetime.datetime.fromtimestamp(2), 11.0],
    [1, datetime.datetime.fromtimestamp(3), 12.0],
    [1, datetime.datetime.fromtimestamp(4), 13.0],
    [1, datetime.datetime.fromtimestamp(5), 14.0],
    [1, datetime.datetime.fromtimestamp(6), 15.0],
    [1, datetime.datetime.fromtimestamp(7), 16.0],
    [1, datetime.datetime.fromtimestamp(8), 17.0],
    [1, datetime.datetime.fromtimestamp(9), 18.0],
    [1, datetime.datetime.fromtimestamp(10), 19.0]
])

source = dataframeToOperator(data, schemaStr='id int, ts timestamp, val double', op_type='batch')

# Aggregate each id's (ts, val) rows into one MTable column named "data".
source.link(
    GroupByBatchOp()
        .setGroupByPredicate("id")
        .setSelectClause("id, mtable_agg(ts, val) as data")
).link(
    # Forecast 10 values per group and write them to the "pred" column.
    AutoGarchBatchOp()
        .setValueCol("data")
        .setIcType("AIC")
        .setPredictNum(10)
        .setMaxOrder(4)
        .setIfGARCH11(True)
        .setMinusMean(False)
        .setPredictionCol("pred")
).print()
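To make the grouping step in the example above more concrete, here is a plain-Python sketch of its effect: rows are bucketed by id, and each bucket holds that group's (ts, val) pairs, which is the per-group series the forecasting operator then works on. This does not use Alink; `group_series` and the sample rows are hypothetical, integer timestamps are used for brevity, and sorting each group by timestamp is a choice made for this sketch rather than documented behavior of mtable_agg.

```python
from collections import defaultdict


def group_series(rows):
    """Collect (ts, val) pairs per group id, each sorted by timestamp."""
    groups = defaultdict(list)
    for group_id, ts, val in rows:
        groups[group_id].append((ts, val))
    # Sorting is an assumption of this sketch: a series should be chronological.
    return {gid: sorted(series) for gid, series in groups.items()}


# Illustrative rows (id, ts, val) with two interleaved groups.
rows = [
    (1, 1, 10.0),
    (1, 2, 11.0),
    (2, 1, 5.0),
    (1, 3, 12.0),
    (2, 2, 6.0),
]

grouped = group_series(rows)
for gid, series in grouped.items():
    print(gid, series)
```

Each value in `grouped` plays the role of one MTable cell in the "data" column, so a dataset with several ids would yield one forecast per id.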

Java code

package com.alibaba.alink.operator.batch.timeseries;

import org.apache.flink.types.Row;

import com.alibaba.alink.operator.batch.source.MemSourceBatchOp;
import com.alibaba.alink.operator.batch.sql.GroupByBatchOp;
import com.alibaba.alink.testutil.AlinkTestBase;
import org.junit.Test;

import java.sql.Timestamp;
import java.util.Arrays;
import java.util.List;

public class AutoGarchBatchOpTest extends AlinkTestBase {

	@Test
	public void test() throws Exception {
		List <Row> mTableData = Arrays.asList(
			Row.of(1, new Timestamp(1), 10.0),
			Row.of(1, new Timestamp(2), 11.0),
			Row.of(1, new Timestamp(3), 12.0),
			Row.of(1, new Timestamp(4), 13.0),
			Row.of(1, new Timestamp(5), 14.0),
			Row.of(1, new Timestamp(6), 15.0),
			Row.of(1, new Timestamp(7), 16.0),
			Row.of(1, new Timestamp(8), 17.0),
			Row.of(1, new Timestamp(9), 18.0),
			Row.of(1, new Timestamp(10), 19.0)
		);

		MemSourceBatchOp source = new MemSourceBatchOp(mTableData, new String[] {"id", "ts", "val"});

		source.link(
			new GroupByBatchOp()
				.setGroupByPredicate("id")
				.setSelectClause("mtable_agg(ts, val) as data")
		).link(
			new AutoGarchBatchOp()
				.setValueCol("data")
				.setIcType("AIC")
				.setPredictNum(10)
				.setMaxOrder(4)
				.setIfGARCH11(true)
				.setMinusMean(false)
				.setPredictionCol("pred")
		).print();

	}

}

Output

data	pred
MTable(10,2)(ts,val)	MTable(10,2)(ts,val)
1970-01-01 00:00:00.001 10.0000	1970-01-01 00:00:00.011 16.6379
1970-01-01 00:00:00.002 11.0000	1970-01-01 00:00:00.012 13.7129
1970-01-01 00:00:00.003 12.0000	1970-01-01 00:00:00.013 11.6890
1970-01-01 00:00:00.004 13.0000	1970-01-01 00:00:00.014 13.0115
1970-01-01 00:00:00.005 14.0000	1970-01-01 00:00:00.015 20.8482
