barRoso::barroso_cat()
barroso_cat
Description
Merges herbarium records from two or more biodiversity data sources into a single harmonized data frame. Optionally prioritizes a specific source when duplicates are detected across herbaria, retaining records based on a flexible exclusion strategy. By default, the function keeps records from non-Brazilian herbaria, on the assumption that global repositories hold more complete records.
Details
This function aligns column structures, removes redundant records from overlapping herbaria, and merges all sources into a single output. Duplicate filtering is based on matching collectionCode across sources. Users can specify a preferred source (keep_source) when duplicates exist.
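The steps described above (align columns, drop overlapping records, bind everything together) can be sketched in base R. This is only an illustration under stated assumptions, not the barRoso implementation: the helper name merge_sources_sketch, the added source_name column, and the rule "drop records from other sources whose collectionCode also appears in the preferred source" are all hypothetical.

```r
# Hypothetical sketch of the merge logic; NOT the actual barroso_cat() code.
merge_sources_sketch <- function(list_sources, keep_source = NULL) {
  stopifnot(is.list(list_sources), !is.null(names(list_sources)))

  # Step 1: align columns. Every source gets the union of all column names,
  # with columns it lacks filled with NA.
  all_cols <- unique(unlist(lapply(list_sources, names)))
  aligned <- lapply(names(list_sources), function(src) {
    df <- list_sources[[src]]
    for (col in setdiff(all_cols, names(df))) {
      df[[col]] <- rep(NA, nrow(df))
    }
    df <- df[, all_cols, drop = FALSE]
    df$source_name <- rep(src, nrow(df))  # assumed tracking column
    df
  })
  names(aligned) <- names(list_sources)

  # Step 2: if a preferred source is given, drop records from the other
  # sources whose collectionCode also occurs in the preferred source.
  if (!is.null(keep_source)) {
    stopifnot(keep_source %in% names(aligned))
    preferred_codes <- unique(aligned[[keep_source]]$collectionCode)
    preferred_codes <- preferred_codes[!is.na(preferred_codes)]
    for (src in setdiff(names(aligned), keep_source)) {
      df <- aligned[[src]]
      keep_rows <- !(df$collectionCode %in% preferred_codes)
      aligned[[src]] <- df[keep_rows, , drop = FALSE]
    }
  }

  # Step 3: bind all sources into one data frame.
  merged <- do.call(rbind, aligned)
  rownames(merged) <- NULL
  merged
}

# Toy inputs (invented for illustration only).
gbif_toy  <- data.frame(collectionCode = c("RB", "NY"), species = c("a", "b"))
jabot_toy <- data.frame(collectionCode = c("RB", "SP"), collector = c("x", "y"))

merged_toy <- merge_sources_sketch(
  list(GBIF = gbif_toy, JABOT = jabot_toy),
  keep_source = "GBIF"
)
```

In this toy case the JABOT record with collectionCode "RB" is dropped because GBIF also holds "RB" records, while "SP" is kept. With keep_source = NULL the sketch retains every record, matching the documented behavior of that argument.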
Arguments
Argument | Description |
---|---|
list_sources | A named list of data frames. Each element represents a herbarium data source. The names of the list are used to track the source origin for internal filtering. |
keep_source | Optional character string specifying the preferred data source (e.g., "GBIF") for resolving duplicate collectionCode conflicts. If NULL, all records are retained. |
Value
A harmonized data frame combining all provided herbarium sources, with columns aligned and optionally filtered to resolve duplicate collections.
Examples
combined_df <- barroso_cat(list_sources = list(GBIF = gbif_data,
                                               speciesLink = splink_data,
                                               JABOT = jabot_data),
                           keep_source = "GBIF")