Create a data.frame suitable for alluvial graph projection
Source: R/networks_to_alluv.R
This function creates, from a list of networks, a data.frame that can easily be plotted with ggalluvial.
Usage
networks_to_alluv(
  graphs,
  intertemporal_cluster_column,
  node_id,
  summary_cluster_stats = TRUE,
  keep_color = TRUE,
  color_column = "color",
  keep_cluster_label = TRUE,
  cluster_label_column = "cluster_label"
)
Arguments
- graphs
A tibble graph from tidygraph or a list of tibble graphs.
- intertemporal_cluster_column
The column with the identifier of the inter-temporal cluster. If you have used add_clusters() and merge_dynamic_clusters(), it is of the form dynamic_cluster_{clustering_method}.
- node_id
The column with the unique identifier of each node.
- summary_cluster_stats
If set to TRUE, the data.frame will contain a set of variables that summarize cluster statistics of the alluvial. These variables can be particularly useful to filter out smaller communities when plotting, according to different variables:
share_cluster_alluv is the percentage share of a given cluster across all time windows;
share_cluster_window is the percentage share of a given cluster in a given time window;
share_cluster_max is the highest value of share_cluster_window for a given cluster across all individual time windows;
length_cluster is the number of time windows in which a cluster exists.
- keep_color
Set to TRUE (the default) if you want to keep the column with the color associated with the different categories of intertemporal_cluster_column. Such a column exists in your list of tibble graphs if you have used color_networks().
- color_column
The name of the column with the colors of the categories in intertemporal_cluster_column. By default, "color", as it is the column name resulting from the use of color_networks().
- keep_cluster_label
Set to TRUE if you want to keep the column with a name/label associated with the different categories of intertemporal_cluster_column. Such a column exists in your list of tibble graphs if you have used name_clusters().
- cluster_label_column
The name of the column with the name/label associated with the categories in intertemporal_cluster_column. By default, "cluster_label", as it is the column name resulting from the use of name_clusters().
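The four statistics listed under summary_cluster_stats can be illustrated on a small table. The sketch below is not the package's implementation: it recomputes the statistics in base R from their descriptions above, on a hypothetical data.frame (toy, with invented window, node and cluster values), and assumes that shares are counted in nodes and rounded to two decimals, as the printed output in the Examples suggests.

```r
# Toy alluvial data: one row per node per time window (hypothetical values).
toy <- data.frame(
  window  = c("w1", "w1", "w1", "w1", "w2", "w2", "w2", "w2"),
  node    = c("a", "b", "c", "d", "a", "b", "c", "e"),
  cluster = c("cl_1", "cl_1", "cl_1", "cl_2", "cl_1", "cl_3", "cl_3", "cl_3"),
  stringsAsFactors = FALSE
)

ones <- rep(1, nrow(toy))

# Node counts per cluster, per window, and per cluster within a window.
n_cluster        <- ave(ones, toy$cluster, FUN = sum)
n_window         <- ave(ones, toy$window, FUN = sum)
n_cluster_window <- ave(ones, toy$cluster, toy$window, FUN = sum)

# Percentage share of the cluster across all time windows.
toy$share_cluster_alluv <- round(100 * n_cluster / nrow(toy), 2)

# Percentage share of the cluster within its own time window.
toy$share_cluster_window <- round(100 * n_cluster_window / n_window, 2)

# Highest window share reached by the cluster.
toy$share_cluster_max <- ave(toy$share_cluster_window, toy$cluster, FUN = max)

# Number of distinct time windows in which the cluster appears.
toy$length_cluster <- ave(
  seq_len(nrow(toy)), toy$cluster,
  FUN = function(i) length(unique(toy$window[i]))
)

toy
```

In this toy table, cl_1 holds 4 of the 8 rows overall and 3 of the 4 nodes of window w1, and it is the only cluster present in both windows.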
Examples
library(networkflow)
# Keep the stagflation articles and harmonise the name of the ID column
nodes <- Nodes_stagflation |>
  dplyr::rename(ID_Art = ItemID_Ref) |>
  dplyr::filter(Type == "Stagflation")

references <- Ref_stagflation |>
  dplyr::rename(ID_Art = Citing_ItemID_Ref)

# Build one network per overlapping 20-year window
temporal_networks <- build_dynamic_networks(nodes = nodes,
                                            directed_edges = references,
                                            source_id = "ID_Art",
                                            target_id = "ItemID_Ref",
                                            time_variable = "Year",
                                            cooccurrence_method = "coupling_similarity",
                                            time_window = 20,
                                            edges_threshold = 1,
                                            overlapping_window = TRUE,
                                            filter_components = TRUE,
                                            verbose = FALSE)

# Detect clusters in each network with the Leiden algorithm
temporal_networks <- add_clusters(temporal_networks,
                                  objective_function = "modularity",
                                  clustering_method = "leiden",
                                  verbose = FALSE)

# Merge similar clusters across time windows into inter-temporal clusters
temporal_networks <- merge_dynamic_clusters(temporal_networks,
                                            cluster_id = "cluster_leiden",
                                            node_id = "ID_Art",
                                            threshold_similarity = 0.51,
                                            similarity_type = "partial")

# Label the inter-temporal clusters from the article titles
temporal_networks <- name_clusters(graphs = temporal_networks,
                                   method = "tf-idf",
                                   name_merged_clusters = TRUE,
                                   cluster_id = "dynamic_cluster_leiden",
                                   text_columns = "Title",
                                   nb_terms_label = 5,
                                   clean_word_method = "lemmatise")
#> Warning: A shallow copy of this data.table was taken so that := can add or remove 2 columns by reference. At an earlier point, this data.table was copied by R (or was created manually using structure() or similar). Avoid names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. It's also not unusual for data.table-agnostic packages to produce tables affected by this issue. If this message doesn't help, please report your use case to the data.table issue tracker so the root cause can be fixed or this message improved.
temporal_networks <- color_networks(graphs = temporal_networks,
                                    column_to_color = "dynamic_cluster_leiden",
                                    color = NULL)
#> ℹ unique_color_across_list has been set to FALSE. There are 14 different categories to color.
#> ℹ color is neither a vector of color characters, nor a data.frame. We will proceed with base R colors.
#> ℹ We draw 7 colors from the ggplot2 palette and 7 colors from the Okabe-Ito palette.
alluv_dt <- networks_to_alluv(temporal_networks,
                              intertemporal_cluster_column = "dynamic_cluster_leiden",
                              node_id = "ID_Art")
alluv_dt[1:5]
#>    dynamic_cluster_leiden    window     ID_Art   color
#>                    <char>    <char>     <char>  <char>
#> 1:                   cl_1 1975-1994   16182155 #F564E3
#> 2:                   cl_1 1975-1994   26283591 #F564E3
#> 3:                   cl_1 1975-1994   31895842 #F564E3
#> 4:                   cl_1 1975-1994 1111111131 #F564E3
#> 5:                   cl_1 1975-1994 1111111150 #F564E3
#>                                              cluster_label share_cluster_alluv
#>                                                     <char>               <num>
#> 1: controls, controls program, program, price level, level                6.22
#> 2: controls, controls program, program, price level, level                6.22
#> 3: controls, controls program, program, price level, level                6.22
#> 4: controls, controls program, program, price level, level                6.22
#> 5: controls, controls program, program, price level, level                6.22
#>    share_cluster_window share_cluster_max length_cluster    y_alluv
#>                   <num>             <num>          <int>      <num>
#> 1:                17.57             18.31              7 0.01351351
#> 2:                17.57             18.31              7 0.01351351
#> 3:                17.57             18.31              7 0.01351351
#> 4:                17.57             18.31              7 0.01351351
#> 5:                17.57             18.31              7 0.01351351
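As a possible next step, the summary columns can be used to drop small or short-lived clusters before plotting. The sketch below is only an illustration under stated assumptions: alluv_toy is a hypothetical stand-in that mimics the columns shown above (it is not produced by networks_to_alluv()), the thresholds min_share and min_length are arbitrary, and the ggalluvial layers are one plausible way to draw the result, built only if ggplot2 and ggalluvial are installed.

```r
# Hypothetical stand-in for the output of networks_to_alluv().
alluv_toy <- data.frame(
  dynamic_cluster_leiden = c("cl_1", "cl_1", "cl_2", "cl_1", "cl_1", "cl_3"),
  window = c("1975-1994", "1975-1994", "1975-1994",
             "1976-1995", "1976-1995", "1976-1995"),
  ID_Art = c("a", "b", "c", "a", "b", "d"),
  color = c("#F564E3", "#F564E3", "#00BA38", "#F564E3", "#F564E3", "#619CFF"),
  share_cluster_max = c(66.67, 66.67, 33.33, 66.67, 66.67, 33.33),
  length_cluster = c(2L, 2L, 1L, 2L, 2L, 1L),
  y_alluv = rep(1 / 3, 6),
  stringsAsFactors = FALSE
)

# Arbitrary thresholds chosen for this illustration.
min_share  <- 50  # keep clusters reaching at least 50% of some window
min_length <- 2   # keep clusters present in at least 2 windows

keep <- alluv_toy$share_cluster_max >= min_share &
  alluv_toy$length_cluster >= min_length
alluv_kept <- alluv_toy[keep, ]

# Optional plot: each node (ID_Art) is an alluvium, each cluster a stratum.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggalluvial", quietly = TRUE)) {
  library(ggplot2)
  library(ggalluvial)
  p <- ggplot(alluv_kept,
              aes(x = window, y = y_alluv,
                  stratum = dynamic_cluster_leiden,
                  alluvium = ID_Art,
                  fill = color)) +
    geom_flow() +
    geom_stratum() +
    scale_fill_identity()  # use the hex codes stored in the color column
}
```

Only cl_1 passes both thresholds in this toy table. The same logical filter should apply to alluv_dt, since its columns carry the same names.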