Java class name: com.alibaba.alink.operator.batch.associationrule.ApplyAssociationRuleBatchOp
Python class name: ApplyAssociationRuleBatchOp
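| Name | Display name | Description | Type | Required? | Value range | Default |
|---|---|---|---|---|---|---|
| selectedCol | Selected column | Name of the column to compute on | String | ✓ | Selected column must be of type [STRING] | |
| modelFilePath | Model file path | File path of the model | String | | | null |
| outputCol | Output result column | Name of the output result column; optional, defaults to null | String | | | null |
| reservedCols | Reserved columns | Columns reserved by the algorithm | String[] | | | null |
| numThreads | Number of threads | Number of threads used by the component | Integer | | | 1 |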
from pyalink.alink import *
import pandas as pd
useLocalEnv(1)
df = pd.DataFrame([
    ["A,B,C,D"],
    ["B,C,E"],
    ["A,B,C,E"],
    ["B,D,E"],
    ["A,B,C,D"],
])
data = BatchOperator.fromDataframe(df, schemaStr='items string')
fpGrowth = FpGrowthBatchOp() \
    .setItemsCol("items") \
    .setMinSupportPercent(0.4) \
    .setMinConfidence(0.6)
fpGrowth.linkFrom(data)
fpGrowth.print()  # frequent itemsets
fpGrowth.getSideOutput(0).print()  # association rules (side output 0)
ApplyAssociationRuleBatchOp() \
    .setSelectedCol("items") \
    .setOutputCol("result") \
    .linkFrom(fpGrowth.getSideOutput(0), data).print()
import org.apache.flink.types.Row;
import com.alibaba.alink.operator.batch.BatchOperator;
import com.alibaba.alink.operator.batch.associationrule.ApplyAssociationRuleBatchOp;
import com.alibaba.alink.operator.batch.associationrule.FpGrowthBatchOp;
import com.alibaba.alink.operator.batch.source.MemSourceBatchOp;
import org.junit.Test;
import java.util.Arrays;
import java.util.List;
public class ApplyAssociationRuleBatchOpTest {
	@Test
	public void testFpGrowth() throws Exception {
		List <Row> rows = Arrays.asList(
			Row.of("A,B,C,D"),
			Row.of("B,C,E"),
			Row.of("A,B,C,E"),
			Row.of("B,D,E"),
			Row.of("A,B,C,D")
			);
		BatchOperator <?> data = new MemSourceBatchOp(rows, "items string");
		// Mine frequent itemsets and association rules from the transactions.
		FpGrowthBatchOp fpGrowth = new FpGrowthBatchOp()
			.setItemsCol("items")
			.setMinSupportPercent(0.4)
			.setMinConfidence(0.6);
		fpGrowth.linkFrom(data);
		// Main output: frequent itemsets. Side output: association rules.
		fpGrowth.print();
		fpGrowth.getSideOutputAssociationRules().print();
		// Apply the mined rules to the same data: first input is the rules, second is the data.
		ApplyAssociationRuleBatchOp op = new ApplyAssociationRuleBatchOp()
			.setSelectedCol("items")
			.setOutputCol("result")
			.linkFrom(fpGrowth.getSideOutputAssociationRules(), data);
		op.print();
	}
}
Frequent itemsets output:
| itemset | supportcount | itemcount | 
|---|---|---|
| E | 3 | 1 | 
| B,E | 3 | 2 | 
| C,E | 2 | 2 | 
| B,C,E | 2 | 3 | 
| D | 3 | 1 | 
| B,D | 3 | 2 | 
| C,D | 2 | 2 | 
| B,C,D | 2 | 3 | 
| A,D | 2 | 2 | 
| B,A,D | 2 | 3 | 
| C,A,D | 2 | 3 | 
| B,C,A,D | 2 | 4 | 
| A | 3 | 1 | 
| B,A | 3 | 2 | 
| C,A | 3 | 2 | 
| B,C,A | 3 | 3 | 
| C | 4 | 1 | 
| B,C | 4 | 2 | 
| B | 5 | 1 | 
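
As a cross-check on the table above, the following plain-Python sketch counts itemset support by brute-force enumeration. It is not FP-Growth and does not use Alink. It assumes that `setMinSupportPercent(0.4)` keeps every itemset contained in at least 40% of the transactions, which is consistent with the 19 rows shown. The function name `frequent_itemsets` is hypothetical.

```python
from itertools import combinations

# The five transactions from the example data.
transactions = [
    {"A", "B", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "D", "E"},
    {"A", "B", "C", "D"},
]


def frequent_itemsets(transactions, min_support_percent):
    """Brute-force enumeration (assumption: not how Alink computes it)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support count: number of transactions containing every item.
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support_percent:
                result[candidate] = count
    return result


itemsets = frequent_itemsets(transactions, 0.4)
for itemset, count in itemsets.items():
    # Columns: itemset, supportcount, itemcount
    print(",".join(itemset), count, len(itemset))
```

The enumeration is exponential in the number of distinct items, so it is only useful for checking small examples like this one. Item order within an itemset differs from the Alink output, which does not sort items alphabetically.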
Association rules output:
| rule | itemcount | lift | support_percent | confidence_percent | transaction_count | 
|---|---|---|---|---|---|
| D=>B | 2 | 1.0000 | 0.6000 | 1.0000 | 3 | 
| D=>A | 2 | 1.1111 | 0.4000 | 0.6667 | 2 | 
| C,D=>B | 3 | 1.0000 | 0.4000 | 1.0000 | 2 | 
| A,D=>B | 3 | 1.0000 | 0.4000 | 1.0000 | 2 | 
| B,D=>A | 3 | 1.1111 | 0.4000 | 0.6667 | 2 | 
| A,D=>C | 3 | 1.2500 | 0.4000 | 1.0000 | 2 | 
| C,D=>A | 3 | 1.6667 | 0.4000 | 1.0000 | 2 | 
| C,A,D=>B | 4 | 1.0000 | 0.4000 | 1.0000 | 2 | 
| B,A,D=>C | 4 | 1.2500 | 0.4000 | 1.0000 | 2 | 
| B,C,D=>A | 4 | 1.6667 | 0.4000 | 1.0000 | 2 | 
| C=>A | 2 | 1.2500 | 0.6000 | 0.7500 | 3 | 
| C=>B | 2 | 1.0000 | 0.8000 | 1.0000 | 4 | 
| B,C=>A | 3 | 1.2500 | 0.6000 | 0.7500 | 3 | 
| E=>B | 2 | 1.0000 | 0.6000 | 1.0000 | 3 | 
| C,E=>B | 3 | 1.0000 | 0.4000 | 1.0000 | 2 | 
| B=>E | 2 | 1.0000 | 0.6000 | 0.6000 | 3 | 
| B=>D | 2 | 1.0000 | 0.6000 | 0.6000 | 3 | 
| B=>A | 2 | 1.0000 | 0.6000 | 0.6000 | 3 | 
| B=>C | 2 | 1.0000 | 0.8000 | 0.8000 | 4 | 
| A=>D | 2 | 1.1111 | 0.4000 | 0.6667 | 2 | 
| A=>B | 2 | 1.0000 | 0.6000 | 1.0000 | 3 | 
| A=>C | 2 | 1.2500 | 0.6000 | 1.0000 | 3 | 
| B,A=>D | 3 | 1.1111 | 0.4000 | 0.6667 | 2 | 
| C,A=>D | 3 | 1.1111 | 0.4000 | 0.6667 | 2 | 
| C,A=>B | 3 | 1.0000 | 0.6000 | 1.0000 | 3 | 
| B,A=>C | 3 | 1.2500 | 0.6000 | 1.0000 | 3 | 
| B,C,A=>D | 4 | 1.1111 | 0.4000 | 0.6667 | 2 | 
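
The metric columns above can be reproduced by hand from the five transactions. The sketch below uses the standard definitions of support, confidence, and lift, which agree with the values in the table; whether Alink computes them exactly this way is an assumption. Note that `support_percent` and `confidence_percent` are fractions in [0, 1], not percentages. The helper name `rule_metrics` is hypothetical.

```python
# The five transactions from the example data.
transactions = [
    {"A", "B", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "D", "E"},
    {"A", "B", "C", "D"},
]


def rule_metrics(antecedent, consequent, transactions):
    """Metrics for the rule antecedent => consequent (standard definitions)."""
    n = len(transactions)
    both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    ante = sum(1 for t in transactions if antecedent <= t)
    cons = sum(1 for t in transactions if consequent <= t)
    confidence = both / ante
    return {
        "transaction_count": both,
        "support_percent": round(both / n, 4),
        "confidence_percent": round(confidence, 4),
        # Lift compares the rule's confidence with the consequent's base rate.
        "lift": round(confidence / (cons / n), 4),
    }


d_to_a = rule_metrics({"D"}, {"A"}, transactions)
cd_to_a = rule_metrics({"C", "D"}, {"A"}, transactions)
print("D=>A", d_to_a)
print("C,D=>A", cd_to_a)
```

For `D=>A`, two of the three transactions containing D also contain A, so confidence is 2/3, and dividing by the base rate of A (3/5) gives the lift of 1.1111 shown in the table.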
Association rule prediction output:
| items | result | 
|---|---|
| A,B,C,D | E | 
| B,C,E | A,D | 
| A,B,C,E | D | 
| B,D,E | A,C | 
| A,B,C,D | E |
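
One reading of the `result` column is that it collects the consequents of every rule whose antecedent is fully contained in the row, minus items the row already has. The sketch below implements that reading in plain Python. It reproduces the five rows above, but it is an assumption about the operator's behavior, not a description of its implementation; the names `parse_rule` and `apply_rules` are hypothetical, and the output is sorted alphabetically for determinism.

```python
# Association rules copied from the rules table, as "antecedent=>consequent".
RULES = [
    "D=>B", "D=>A", "C,D=>B", "A,D=>B", "B,D=>A", "A,D=>C", "C,D=>A",
    "C,A,D=>B", "B,A,D=>C", "B,C,D=>A", "C=>A", "C=>B", "B,C=>A", "E=>B",
    "C,E=>B", "B=>E", "B=>D", "B=>A", "B=>C", "A=>D", "A=>B", "A=>C",
    "B,A=>D", "C,A=>D", "C,A=>B", "B,A=>C", "B,C,A=>D",
]


def parse_rule(rule):
    """Split 'X,Y=>Z' into ({'X', 'Y'}, {'Z'})."""
    antecedent, consequent = rule.split("=>")
    return set(antecedent.split(",")), set(consequent.split(","))


def apply_rules(items, rules):
    """Suggest items implied by matching rules that the row lacks."""
    present = set(items.split(","))
    suggested = set()
    for antecedent, consequent in map(parse_rule, rules):
        # A rule matches when all of its antecedent items are in the row.
        if antecedent <= present:
            suggested |= consequent - present
    return ",".join(sorted(suggested))


rows = ["A,B,C,D", "B,C,E", "A,B,C,E", "B,D,E", "A,B,C,D"]
predictions = [apply_rules(row, RULES) for row in rows]
for row, result in zip(rows, predictions):
    print(row, "->", result)
```

For example, the row `B,C,E` matches `B=>D`, `B=>A`, and `C=>A` among others, so the items it is missing from those consequents are `A` and `D`.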