Java 类名:com.alibaba.alink.operator.batch.graph.ModularityCalBatchOp
Python 类名:ModularityCalBatchOp
模块度是一种评价图社群划分好坏的指标,来评估网络结构中划分出来社区的紧密程度,取值范围为-0.5到1之间。通常认为大于0.3的图是划分出较为明显社群的。
要计算一个网络的模块度,需要构造一个具有相同节点度分布的随机网络作为参照。通俗地来说,模块度的物理含义是:在社团内,实际的边数与随机情况下的边数的差距。如果差距比较大,说明社团内部密集程度显著高于随机情况,社团划分的质量较好。模块度取值范围在[-0.5,1]之间。如果节点组中的连边数量超过了随机分配时所得到的期望连边数量,模块度为正数。没有超过,则为负数。
| 名称 | 中文名称 | 描述 | 类型 | 是否必须? | 取值范围 | 默认值 |
|---|---|---|---|---|---|---|
| edgeSourceCol | 边表中起点所在列 | 边表中起点所在列 | String | ✓ | ||
| edgeTargetCol | 边表中终点所在列 | 边表中终点所在列 | String | ✓ | ||
| vertexCol | 点列 | 输入点表中点信息所在列 | String | ✓ | ||
| vertexCommunityCol | 社群信息列 | 输入点表中点的社群信息所在列 | String | ✓ | 所选列类型为 [INTEGER, LONG] | |
| asUndirectedGraph | 是否为无向图 | 是否为无向图 | Boolean | true | ||
| edgeWeightCol | 边权重列 | 表示边权重的列 | String | null |
from pyalink.alink import *
import pandas as pd
useLocalEnv(1)
df = pd.DataFrame([[3, 1],\
[3, 0],\
[0, 1],\
[0, 2],\
[2, 1],\
[2, 4],\
[5, 4],\
[7, 4],\
[5, 6],\
[5, 8],\
[5, 7],\
[7, 8],\
[6, 8],\
[12, 10],\
[12, 11],\
[12, 13],\
[12, 9],\
[10, 9],\
[8, 9],\
[13, 9],\
[10, 7],\
[10, 11],\
[11, 13]])
edges = BatchOperator.fromDataframe(df, schemaStr="source int, target int")
df2 = pd.DataFrame([[2, 0],\
[4, 1],\
[7, 1],\
[8, 1],\
[9, 2],\
[10, 2]])
vertices = BatchOperator.fromDataframe(df2, schemaStr="vertex int, label bigint")
communityDetectionClassify = CommunityDetectionClassifyBatchOp()\
.setEdgeSourceCol("source")\
.setEdgeTargetCol("target")\
.setVertexCol("vertex")\
.setVertexLabelCol("label")
community = communityDetectionClassify.linkFrom(edges, vertices)
modularityCal = ModularityCalBatchOp()\
.setEdgeSourceCol("source")\
.setEdgeTargetCol("target")\
.setVertexCol("vertex")\
.setVertexCommunityCol("label")
modularityCal.linkFrom(edges, community).print()
import org.apache.flink.types.Row;
import com.alibaba.alink.operator.batch.BatchOperator;
import com.alibaba.alink.operator.batch.source.MemSourceBatchOp;
import com.alibaba.alink.testutil.AlinkTestBase;
import org.junit.Test;
import java.util.Arrays;
import java.util.List;
public class ModularityCalBatchOpTest extends AlinkTestBase {
@Test
public void testModularityCal() throws Exception {
List <Row> edgesList = Arrays.asList(
Row.of(3, 1),
Row.of(3, 0),
Row.of(0, 1),
Row.of(0, 2),
Row.of(2, 1),
Row.of(2, 4),
Row.of(5, 4),
Row.of(7, 4),
Row.of(5, 6),
Row.of(5, 8),
Row.of(5, 7),
Row.of(7, 8),
Row.of(6, 8),
Row.of(12, 10),
Row.of(12, 11),
Row.of(12, 13),
Row.of(12, 9),
Row.of(10, 9),
Row.of(8, 9),
Row.of(13, 9),
Row.of(10, 7),
Row.of(10, 11),
Row.of(11, 13));
BatchOperator edges = new MemSourceBatchOp(edgesList, "source int, target int");
List <Row> nodesList = Arrays.asList(Row.of(2, 0),
Row.of(4, 1),
Row.of(7, 1),
Row.of(8, 1),
Row.of(9, 2),
Row.of(10, 2));
BatchOperator nodes = new MemSourceBatchOp(nodesList, "vertex int, label int");
CommunityDetectionClassifyBatchOp communityDetectionClassify = new CommunityDetectionClassifyBatchOp()
.setEdgeSourceCol("source")
.setEdgeTargetCol("target")
.setVertexCol("vertex")
.setVertexLabelCol("label")
.linkFrom(edges, nodes);
ModularityCalBatchOp modularityCal = new ModularityCalBatchOp()
.setSourceCol("source")
.setTargetCol("target")
.setVertexCol("vertex")
.setVertexCommunityCol("label");
modularityCal.linkFrom(edges, communityDetectionClassify).print();
}
}
| modularity |
|---|
| 0.522684 |