A data frame of references to be cleaned, as produced by find_ref_to_df() or parse_ref_to_df().


A cleaned and reorganized data frame with one row per reference and additional details.


This function tidies the data frame of references returned by find_ref_to_df() or parse_ref_to_df(), cleaning and reorganizing it into a more structured and readable format.

The function performs the following steps:

  • Collapses non-essential columns (i.e. other than date and title), combining multiple entries into a single text string.

  • Reorganizes author information, creating a full name from the given, particle, and family components, and adds an author_order column. The author information remains in a list column.

  • Separates and cleans dates and titles, organizing them into primary and additional information. The first date and first title are saved in date and title; any remaining dates and titles are saved in other_date and other_title.

  • Extracts the year from the date and places it in a separate column.

  • Removes extraneous punctuation and trims whitespace from character columns.

  • Relocates key columns to a more standardized order for easier analysis and readability.
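
Several of the steps above can be sketched in base R on a single toy reference. This is only an illustration of the kind of transformation described, not the package's actual implementation: the input values, the clean_text() helper, the "; " separator, and the regular expressions are all assumptions made for this example.

```r
# Toy input for one parsed reference (all values are made up for illustration).
given    <- "Ludwig"
particle <- "von"
family   <- "Mises"
dates    <- c("1949.", "1966")
title    <- "  Human Action: A Treatise on Economics. "

# Build a full name from the given, particle, and family components,
# skipping any component that is missing or empty.
parts <- c(given, particle, family)
full_name <- paste(parts[!is.na(parts) & nzchar(parts)], collapse = " ")

# Keep the first date as the primary one; collapse the rest into one string.
date       <- dates[1]
other_date <- paste(dates[-1], collapse = "; ")

# Extract the first four-digit run from the primary date as the year.
year <- as.integer(regmatches(date, regexpr("[0-9]{4}", date)))

# Hypothetical helper: trim whitespace, then drop trailing punctuation.
clean_text <- function(x) {
  x <- trimws(x)
  sub("[[:punct:][:space:]]+$", "", x)
}

clean_title <- clean_text(title)
```

Here full_name is "Ludwig von Mises", year is 1949, other_date is "1966", and clean_title is "Human Action: A Treatise on Economics". Note that this simple trailing-punctuation rule would also strip a closing parenthesis or quotation mark, so real reference data likely needs a more careful pattern.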

This function can be used directly, or indirectly through find_ref_to_df() or parse_ref_to_df() by setting the clean_ref parameter to TRUE.