Create a data.frame suitable for alluvial graph projection
Source: R/networks_to_alluv.R
This function takes a list of networks and creates a data.frame that can be easily plotted with ggalluvial.
Usage
networks_to_alluv(
list_graph = NA,
intertemporal_cluster_column = "intertemporal_name",
node_id = NA,
summary_cl_stats = TRUE,
keep_color = TRUE,
color_column = "color",
keep_cluster_label = TRUE,
cluster_label_column = "cluster_label"
)
Arguments
- list_graph
Your list with all networks
- intertemporal_cluster_column
The column with the identifier of the inter-temporal cluster. If you have used add_clusters() and merge_dynamic_clusters(), it is of the form dynamic_cluster_{clustering_method}.
- node_id
The column with the unique identifier of each node.
- summary_cl_stats
If set to TRUE, the data.frame will contain a set of variables that summarize cluster statistics of the alluvial. These variables can be particularly useful to filter out smaller communities when plotting, according to different variables: share_cluster_alluv is the percentage share of a given cluster across all time windows; share_cluster_window is the percentage share of a given cluster in a given time window; share_cluster_max is the highest value of share_cluster_window for a given cluster across all individual time windows; length_cluster is the number of time windows in which a cluster exists.
- keep_color
Set to TRUE (by default) if you want to keep the column with the color associated with the different categories of intertemporal_cluster_column. Such a column exists in your list of tibble graphs if you have used color_networks().
- color_column
The name of the column with the colors of the categories in intertemporal_cluster_column. By default, "color", as it is the column name resulting from the use of color_networks().
- keep_cluster_label
Set to TRUE (by default) if you want to keep the column with a name/label associated with the different categories of intertemporal_cluster_column. Such a column exists in your list of tibble graphs if you have used name_clusters().
- cluster_label_column
The name of the column with the name/label associated with the categories in intertemporal_cluster_column. By default, "cluster_label", as it is the column name resulting from the use of name_clusters().
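The summary variables described under summary_cl_stats can be illustrated on a toy table. The following is a minimal base R sketch of the definitions given above, not the package's actual implementation; the data.frame `toy` and its columns `window`, `node` and `cluster` are hypothetical, and shares are assumed to be computed on node counts.

```r
# Hypothetical toy data: one row per node per time window.
toy <- data.frame(
  window  = c(rep("w1", 4), rep("w2", 4)),
  node    = c("a", "b", "c", "d", "a", "b", "c", "e"),
  cluster = c("cl_1", "cl_1", "cl_1", "cl_2", "cl_1", "cl_1", "cl_3", "cl_3")
)

n_total <- nrow(toy)
ones <- rep(1, n_total)

# Percentage share of each cluster across all time windows
toy$share_cluster_alluv <- 100 * ave(ones, toy$cluster, FUN = sum) / n_total

# Percentage share of each cluster within its time window
n_window <- ave(ones, toy$window, FUN = sum)
n_cluster_window <- ave(ones, toy$cluster, toy$window, FUN = sum)
toy$share_cluster_window <- 100 * n_cluster_window / n_window

# Highest window share reached by each cluster
toy$share_cluster_max <- ave(toy$share_cluster_window, toy$cluster, FUN = max)

# Number of time windows in which each cluster exists
toy$length_cluster <- ave(
  as.integer(factor(toy$window)), toy$cluster,
  FUN = function(w) length(unique(w))
)

# Example of filtering: keep clusters reaching at least 50% of some window
big_clusters <- toy[toy$share_cluster_max >= 50, ]
```

In this toy case, cl_2 never exceeds 25% of a window and is dropped by the filter, while cl_1 and cl_3 are kept.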
Examples
library(networkflow)
nodes <- Nodes_stagflation |>
  dplyr::rename(ID_Art = ItemID_Ref) |>
  dplyr::filter(Type == "Stagflation")
references <- Ref_stagflation |>
  dplyr::rename(ID_Art = Citing_ItemID_Ref)
temporal_networks <- build_dynamic_networks(nodes = nodes,
                                            directed_edges = references,
                                            source_id = "ID_Art",
                                            target_id = "ItemID_Ref",
                                            time_variable = "Year",
                                            cooccurrence_method = "coupling_similarity",
                                            time_window = 20,
                                            edges_threshold = 1,
                                            overlapping_window = TRUE,
                                            filter_components = TRUE,
                                            verbose = FALSE)
temporal_networks <- add_clusters(temporal_networks,
                                  objective_function = "modularity",
                                  clustering_method = "leiden",
                                  verbose = FALSE)
temporal_networks <- merge_dynamic_clusters(temporal_networks,
                                            cluster_id = "cluster_leiden",
                                            node_id = "ID_Art",
                                            threshold_similarity = 0.51,
                                            similarity_type = "partial")
temporal_networks <- name_clusters(graphs = temporal_networks,
                                   method = "tf-idf",
                                   name_merged_clusters = TRUE,
                                   cluster_id = "dynamic_cluster_leiden",
                                   text_columns = "Title",
                                   nb_terms_label = 5,
                                   clean_word_method = "lemmatise")
#> Warning: Invalid .internal.selfref detected and fixed by taking a (shallow) copy of the data.table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or was created manually using structure() or similar). Avoid names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. If this message doesn't help, please report your use case to the data.table issue tracker so the root cause can be fixed or this message improved.
#> Error in mutate(d_tmp, ...): ℹ In argument: `cluster_label = ifelse(is.na(eval(ensym(label_name))),
#> "no_name", eval(ensym(label_name)))`.
#> Caused by error in `ensym()`:
#> ! could not find function "ensym"
temporal_networks <- color_networks(graphs = temporal_networks,
                                    column_to_color = "dynamic_cluster_leiden",
                                    color = NULL)
#> ℹ unique_color_across_list has been set to FALSE. There are 16 different categories to color.
#> ℹ color is neither a vector of color characters, nor a data.frame. We will proceed with base R colors.
#> ℹ We draw 7 colors from the ggplot2 palette and 7 from the Okabe-Ito palette. As more than 14 colors are needed, the colors will be recycled.
alluv_dt <- networks_to_alluv(temporal_networks,
                              intertemporal_cluster_column = "dynamic_cluster_leiden",
                              node_id = "ID_Art")
#> ℹ The column "cluster_label" does not exist in the list of graphs provided. No name kept for clusters.
alluv_dt[1:5]
#>    dynamic_cluster_leiden    window     ID_Art   color share_cluster_alluv
#>                    <char>    <char>     <char>  <char>               <num>
#> 1:                   cl_1 1975-1994   16182155 #56B4E9                6.14
#> 2:                   cl_1 1975-1994   26283591 #56B4E9                6.14
#> 3:                   cl_1 1975-1994   31895842 #56B4E9                6.14
#> 4:                   cl_1 1975-1994 1111111131 #56B4E9                6.14
#> 5:                   cl_1 1975-1994 1111111150 #56B4E9                6.14
#>    share_cluster_window share_cluster_max length_cluster    y_alluv
#>                   <num>             <num>          <int>      <num>
#> 1:                17.57             18.31              7 0.01351351
#> 2:                17.57             18.31              7 0.01351351
#> 3:                17.57             18.31              7 0.01351351
#> 4:                17.57             18.31              7 0.01351351
#> 5:                17.57             18.31              7 0.01351351
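A table of this shape could then be passed to ggalluvial. The following is a minimal sketch, assuming the ggplot2 and ggalluvial packages are installed. The data.frame `toy_alluv` is a hypothetical stand-in that only mimics the column names shown above, and its `y_alluv` values (one over the number of nodes in a window) are an assumption made for illustration.

```r
library(ggplot2)
library(ggalluvial)

# Hypothetical stand-in for alluv_dt: three nodes followed over two windows.
toy_alluv <- data.frame(
  dynamic_cluster_leiden = c("cl_1", "cl_1", "cl_2", "cl_1", "cl_2", "cl_2"),
  window = rep(c("1975-1994", "1976-1995"), each = 3),
  ID_Art = c("a", "b", "c", "a", "b", "c"),
  color = c("#56B4E9", "#56B4E9", "#E69F00", "#56B4E9", "#E69F00", "#E69F00"),
  y_alluv = 1 / 3  # assumed: each node weighs 1 / (nodes in the window)
)

# Each node is an alluvium, each cluster a stratum, each window an axis.
p <- ggplot(toy_alluv,
            aes(x = window,
                y = y_alluv,
                stratum = dynamic_cluster_leiden,
                alluvium = ID_Art,
                fill = color)) +
  geom_flow() +
  geom_stratum() +
  scale_fill_identity() +  # use the hex codes in `color` as they are
  labs(x = "Time window", y = "Share of nodes")
```

Printing `p` draws the plot. With the real output, one would replace `toy_alluv` by `alluv_dt` and possibly filter on share_cluster_max or length_cluster first to drop small or short-lived clusters.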