The data is structured as follows. Directories follow the format

```
networks/<clustering>/<network_id>/<run_id>/
```

where:

- `<clustering>`: The clustering applied to the network, one of:
  - `leiden-cpm-0.1`: Leiden clustering optimizing the CPM with resolution 0.1, i.e., Leiden-CPM(0.1)
  - `leiden-cpm-0.01`: Leiden clustering optimizing the CPM with resolution 0.01, i.e., Leiden-CPM(0.01)
  - `leiden-cpm-0.001`: Leiden clustering optimizing the CPM with resolution 0.001, i.e., Leiden-CPM(0.001)
  - `leiden-mod`: Leiden clustering optimizing modularity, i.e., Leiden-Mod
  - `sbm+cc`: flat SBM computed using graph-tool with the lowest description length, followed by CC, i.e., SBM+CC
  - `leiden-cpm-0.1+cm`: Leiden clustering optimizing the CPM with resolution 0.1, followed by CM, i.e., Leiden-CPM(0.1)+CM
  - `leiden-cpm-0.01+cm`: Leiden clustering optimizing the CPM with resolution 0.01, followed by CM, i.e., Leiden-CPM(0.01)+CM
  - `leiden-cpm-0.001+cm`: Leiden clustering optimizing the CPM with resolution 0.001, followed by CM, i.e., Leiden-CPM(0.001)+CM
  - `leiden-mod+cm`: Leiden clustering optimizing modularity, followed by CM, i.e., Leiden-Mod+CM
  - `sbm+wcc`: flat SBM computed using graph-tool with the lowest description length, followed by WCC, i.e., SBM+WCC
- `<network_id>`: The identifier of the network
  - e.g., `dnc`, `academia_edu`, `hyves`
- `<run_id>`: The identifier of the run (only 1 per clustered network in this dataset)
  - `0`: the only run in this dataset

Each directory contains the following files:

- `edge.tsv`: The edge list of the network, with two tab-separated values (`node1`, `node2`) per line indicating the two nodes connected by the edge. The network is undirected, so each edge is expected to appear only once.
- `com.tsv`: The community assignment of the nodes in the network, with two tab-separated values (`node`, `community`) per line indicating the community to which each node belongs.
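
As a minimal sketch of how the two files in one run directory might be read, the Python snippet below loads `edge.tsv` into a list of node pairs and `com.tsv` into a mapping from community to member nodes. It assumes that neither file has a header row and that node and community identifiers can be treated as plain strings; neither point is stated above, so adjust as needed. The helper names `load_edges` and `load_communities` are hypothetical, the example path `networks/leiden-mod/dnc/0` assumes the path components appear in the order clustering, network, run, and the tiny files written to the temporary directory are made up purely for illustration.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path


def load_edges(run_dir):
    """Read edge.tsv as a list of (node1, node2) string pairs."""
    edges = []
    with open(Path(run_dir) / "edge.tsv", newline="") as fh:
        for row in csv.reader(fh, delimiter="\t"):
            if len(row) >= 2:  # skip blank or malformed lines
                edges.append((row[0], row[1]))
    return edges


def load_communities(run_dir):
    """Read com.tsv as a mapping from community id to its set of nodes."""
    members = defaultdict(set)
    with open(Path(run_dir) / "com.tsv", newline="") as fh:
        for row in csv.reader(fh, delimiter="\t"):
            if len(row) >= 2:  # skip blank or malformed lines
                members[row[1]].add(row[0])
    return dict(members)


# Made-up toy data (not from the dataset), written to a temporary directory
# laid out in the assumed order: networks/<clustering>/<network>/<run>/
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "networks" / "leiden-mod" / "dnc" / "0"
    run_dir.mkdir(parents=True)
    (run_dir / "edge.tsv").write_text("1\t2\n2\t3\n4\t5\n")
    (run_dir / "com.tsv").write_text("1\ta\n2\ta\n3\ta\n4\tb\n5\tb\n")

    edges = load_edges(run_dir)
    communities = load_communities(run_dir)

print(len(edges), "edges;", len(communities), "communities")
```

Because the network is undirected and each edge is expected to appear only once, code that needs adjacency in both directions would have to add the reverse of each pair itself.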