This function calculates the number of references that different articles share, as well as the coupling angle value of edges in a bibliographic coupling network (Sen and Gan 1983), from a direct citation data frame. This is a standard way to build a bibliographic coupling network using Salton's cosine measure: it divides the number of references that two articles share by the square root of the product of the two articles' bibliography lengths. This normalization avoids giving too much importance to articles with a large bibliography.
biblio_coupling(dt, source, ref, normalized_weight_only = TRUE, weight_threshold = 1, output_in_character = TRUE)
Argument | Description |
---|---|
dt | For bibliographic coupling (or co-citation), the data frame with citing and cited documents. It can also be used for other kinds of networks, such as title co-occurrence or co-authorship networks. |
source | The column name of the source identifiers, that is, the documents that are citing. In a coupling network, these documents are the nodes of the network. |
ref | The column name of the cited references identifiers. |
normalized_weight_only | If set to FALSE, the function returns the weights normalized by the cosine measure, as well as the number of shared references. |
weight_threshold | Corresponds to the value of the non-normalized weights of edges. The function keeps only the edges that have a non-normalized weight superior to this threshold. |
output_in_character | If TRUE, the function ends by transforming the from and to columns into character vectors. |
A data.table with the articles (or authors) identifiers in the from and to columns, with one or two additional columns (the coupling angle measure and the number of shared references). It also keeps a copy of from and to in the Source and Target columns. This is useful if you use the tidygraph package afterwards, since the from and to values are modified when creating a graph.
This function implements the following weight measure:
$$\frac{R(A) \bullet R(B)}{\sqrt{L(A) \cdot L(B)}}$$
with \(R(A)\) and \(R(B)\) the references of document A and document B, \(R(A) \bullet R(B)\) being the number of shared references by A and B, and \(L(A)\) and \(L(B)\) the length of the bibliographies of document A and document B.
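As a quick worked example with made-up numbers (not taken from the package data): suppose document A cites 4 references, document B cites 9, and they have 3 references in common. The measure then gives

$$\frac{R(A) \bullet R(B)}{\sqrt{L(A) \cdot L(B)}} = \frac{3}{\sqrt{4 \times 9}} = \frac{3}{6} = 0.5$$

Two documents with identical bibliographies get a value of 1, and two documents with no reference in common get 0.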
This function uses the data.table package and is thus very fast. It allows the user to compute the coupling angle on a very large data frame quickly.
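To make the measure concrete, here is a minimal base R sketch of the same computation on a toy citation table. It is only an illustration of the formula above, not the package's data.table implementation: the helper name `coupling_angle_sketch`, the output column `nb_shared_references`, and the toy data are all assumptions made for this example.

```r
# Illustrative base R sketch of the coupling angle; NOT the package's data.table code.
# `coupling_angle_sketch` and the toy data below are hypothetical.
coupling_angle_sketch <- function(df, source, ref) {
  # Keep one row per (citing document, reference) pair
  pairs <- unique(as.data.frame(df)[, c(source, ref)])
  # Incidence matrix: one row per reference, one column per citing document
  inc <- unclass(table(pairs[[ref]], pairs[[source]]))
  # shared[i, j] = number of references that documents i and j have in common
  shared <- crossprod(inc)
  # The diagonal holds each document's bibliography length
  len <- diag(shared)
  weight <- shared / sqrt(outer(len, len))
  # Keep each unordered pair once, and only pairs sharing at least one reference
  idx <- which(upper.tri(shared) & shared > 0, arr.ind = TRUE)
  data.frame(
    from = rownames(shared)[idx[, 1]],
    to = colnames(shared)[idx[, 2]],
    nb_shared_references = shared[idx],
    weight = weight[idx]
  )
}

# Toy data: A cites r1..r4, B cites r1 only, C cites r2..r5
toy <- data.frame(
  citing = c("A", "A", "A", "A", "B", "C", "C", "C", "C"),
  cited  = c("r1", "r2", "r3", "r4", "r1", "r2", "r3", "r4", "r5")
)
res <- coupling_angle_sketch(toy, source = "citing", ref = "cited")
print(res)
```

In this toy data, A and B share one reference out of bibliographies of length 4 and 1, while A and C share three references out of bibliographies of length 4 each. The dense matrix used here is convenient for a small example but would not scale to a large corpus, which is where a data.table approach is the better fit.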
This function is a relatively general function that can also be used:
- for co-citation networks (just by inverting the source and ref columns). If you want to avoid confusion, use the biblio_cocitation() function instead;
- for title co-occurrence networks (taking care of the length of the title thanks to the coupling angle measure);
- for co-authorship networks (taking care of the number of co-authors an author has collaborated with over a period). For co-authorship, use the coauth_network() function instead.
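The idea that swapping the two columns turns coupling into co-citation can be sketched in base R as follows. This is a hedged illustration on toy data, not the package's code: the helper `shared_count` and the data frame `citations` are hypothetical, and the sketch assumes each citing/cited pair appears only once.

```r
# Hypothetical toy citation table: one row per (citing document, cited reference)
citations <- data.frame(
  citing = c("A", "A", "B", "B", "C"),
  cited  = c("r1", "r2", "r1", "r2", "r2")
)

# Hypothetical helper: for every pair of `node` values, count how many
# `by` values they have in common (assumes no duplicated rows in `df`)
shared_count <- function(df, node, by) {
  inc <- unclass(table(df[[by]], df[[node]]))  # rows = `by`, columns = `node`
  crossprod(inc)                               # node x node matrix of shared counts
}

# Bibliographic coupling: nodes are citing documents, linked by shared references
coupling <- shared_count(citations, node = "citing", by = "cited")

# Co-citation: same computation with the two columns swapped, so nodes are
# cited references, linked by the documents that cite them together
cocitation <- shared_count(citations, node = "cited", by = "citing")

coupling["A", "B"]      # number of references shared by A and B
cocitation["r1", "r2"]  # number of documents citing both r1 and r2
```

The same swap applies to the other uses listed above: only the choice of which column plays the role of the nodes and which one plays the role of the shared items changes.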
Sen SK, Gan SK (1983). “A Mathematical Extension of the Idea of Bibliographic Coupling and Its Applications.” Annals of library science and documentation, 30(2). http://nopr.niscair.res.in/bitstream/123456789/28008/1/ALIS%2030(2)%2078-82.pdf.
library(biblionetwork)
biblio_coupling(Ref_stagflation, source = "Citing_ItemID_Ref", ref = "ItemID_Ref", weight_threshold = 3)
#>            from         to     weight     Source     Target
#>   1:     214927    2207578 0.14605935     214927    2207578
#>   2:     214927    8456979 0.09733285     214927    8456979
#>   3:     214927   10729971 0.29848100     214927   10729971
#>   4:     214927   19627977 0.11202241     214927   19627977
#>   5:    1021902   12824456 0.06537205    1021902   12824456
#>  ---
#> 958: 1111111147 1111111156 0.17325923 1111111147 1111111156
#> 959: 1111111147 1111111161 0.13333938 1111111147 1111111161
#> 960: 1111111156 1111111161 0.08580846 1111111156 1111111161
#> 961: 1111111159 1111111171 0.24333213 1111111159 1111111171
#> 962: 1111111182 1111111183 0.27060404 1111111182 1111111183