barRoso::barroso_cat()
Description
Merges herbarium records from two or more biodiversity data sources into a single harmonized data frame. Optionally prioritizes a specific source when duplicates are detected across herbaria, retaining records based on a flexible exclusion strategy. By default, the function keeps records from non-Brazilian herbaria, on the assumption that global repositories are more complete.
Details
This function aligns column structures, removes redundant records from overlapping herbaria, and merges all sources into a single output. Duplicate filtering is based on matching collectionCode across sources. Users can specify a preferred source (keep_source) when duplicates exist.
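The steps described above can be illustrated with a minimal base R sketch. It is not the barRoso implementation: the helper name merge_sources_sketch, the added source column, and the toy data are hypothetical. It also assumes one reading of the duplicate rule, namely that a record from a non-preferred source is dropped when its collectionCode already occurs in the preferred source.

```r
# Hypothetical illustration only; barroso_cat() may work differently internally.
# Assumes every data frame has a `collectionCode` column.
merge_sources_sketch <- function(list_sources, keep_source = NULL) {
  stopifnot(is.list(list_sources), !is.null(names(list_sources)))

  # Align column structures: add missing columns as NA and use one shared order.
  all_cols <- unique(unlist(lapply(list_sources, names)))
  aligned <- Map(function(df, src) {
    for (col in setdiff(all_cols, names(df))) df[[col]] <- rep(NA, nrow(df))
    df <- df[all_cols]
    df$source <- rep(src, nrow(df))  # hypothetical column tracking the origin
    df
  }, list_sources, names(list_sources))

  # Assumed duplicate rule: herbaria already covered by the preferred source
  # are removed from every other source.
  if (!is.null(keep_source)) {
    stopifnot(keep_source %in% names(aligned))
    kept_codes <- unique(aligned[[keep_source]]$collectionCode)
    aligned <- Map(function(df, src) {
      if (src == keep_source) {
        df
      } else {
        df[!df$collectionCode %in% kept_codes, , drop = FALSE]
      }
    }, aligned, names(aligned))
  }

  # Stack all sources into a single data frame.
  out <- do.call(rbind, unname(aligned))
  rownames(out) <- NULL
  out
}

# Toy inputs (invented for this sketch).
gbif <- data.frame(collectionCode = c("RB", "NY"),
                   species = c("A", "B"),
                   stringsAsFactors = FALSE)
jabot <- data.frame(collectionCode = c("RB", "SP"),
                    recordedBy = c("X", "Y"),
                    stringsAsFactors = FALSE)

res <- merge_sources_sketch(list(GBIF = gbif, JABOT = jabot),
                            keep_source = "GBIF")
```

With these toy inputs, the JABOT record whose collectionCode is "RB" is dropped because GBIF already covers that herbarium. With keep_source = NULL, all four records are kept.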
Arguments
| Argument | Description |
|---|---|
| list_sources | A named list of data frames. Each element represents a herbarium data source. The names of the list are used to track the source origin for internal filtering. |
| keep_source | Optional character string specifying the preferred data source (e.g., "GBIF") for resolving duplicate collectionCode conflicts. If NULL, all records are retained. |
Value
A harmonized data frame combining all provided herbarium sources, with columns aligned and optionally filtered to resolve duplicate collections.
Examples
# gbif_data, splink_data and jabot_data are data frames assumed to be loaded already
combined_df <- barroso_cat(list_sources = list(GBIF = gbif_data,
                                               speciesLink = splink_data,
                                               JABOT = jabot_data),
                           keep_source = "GBIF")