barRoso::std_taxa()
std_taxa
Description
Cleans and standardizes taxonomic fields in a biodiversity collection dataset. It targets the family, genus, and specificEpithet columns: it corrects legacy naming (e.g. Leguminosae → Fabaceae), removes ambiguous entries, and formats genus and species names consistently.
Details
This function is part of the barRoso package and is designed to improve the quality of taxon names for reconciliation, querying, and label generation. It removes common taxonomic noise such as uncertain identifiers (e.g. “cf.”, “aff.”, “indet.”), numeric placeholders, and genus-only labels mistakenly stored in the species field. Genus names are capitalized, and legacy family names (like Leguminosae) are standardized to their accepted equivalents (e.g. Fabaceae).
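The cleaning rules described above can be illustrated with a minimal base R sketch. This is not the barRoso implementation: the helper name clean_taxa_sketch(), the two-entry legacy lookup, the regular expressions, and the choice to set noisy epithets to NA and lowercase the rest are all assumptions made for illustration. The sketch also assumes the default column names family, genus, and specificEpithet.

```r
# Hypothetical helper, NOT part of barRoso: a rough illustration of the
# kinds of rules described in Details.
clean_taxa_sketch <- function(df) {
  # Legacy family names mapped to accepted equivalents (illustrative subset)
  legacy <- c(Leguminosae = "Fabaceae", Compositae = "Asteraceae")
  fam <- trimws(df$family)
  hit <- fam %in% names(legacy)
  fam[hit] <- unname(legacy[fam[hit]])
  df$family <- fam

  # Capitalize genus: first letter upper case, remainder lower case
  gen <- trimws(df$genus)
  cap <- paste0(toupper(substr(gen, 1, 1)), tolower(substring(gen, 2)))
  cap[is.na(gen)] <- NA_character_
  df$genus <- cap

  # Assumption: epithets are lowercased, and noisy ones are set to NA.
  # Noise here means an uncertainty marker (cf., aff., indet.), a purely
  # numeric placeholder, or a repeat of the genus name.
  epi <- tolower(trimws(df$specificEpithet))
  noisy <- grepl("^(cf|aff|indet)\\.?( |$)", epi) |
    grepl("^[0-9]+$", epi) |
    epi == tolower(df$genus)
  epi[which(noisy)] <- NA_character_
  df$specificEpithet <- epi

  df
}

# Toy records, invented for this example
records <- data.frame(
  family = c("Leguminosae", "Fabaceae", "Myrtaceae", "Rubiaceae"),
  genus = c("inga", "MIMOSA", "Eugenia", "Psychotria"),
  specificEpithet = c("edulis", "cf. pudica", "12", "Psychotria"),
  stringsAsFactors = FALSE
)

cleaned <- clean_taxa_sketch(records)
print(cleaned)
```

On these toy records only the first row keeps its epithet; the other three are flagged as noise for three different reasons. Whether std_taxa() drops such values, strips only the qualifier, or handles them some other way is not stated here, so check its output on your own data.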
Arguments
Argument | Description |
---|---|
df | A data frame with biodiversity collection records. |
colname_family | Name of the column containing plant family names (default: "family"). |
colname_genus | Name of the column containing genus names (default: "genus"). |
colname_specificEpithet | Name of the column containing the specific epithet of each species name (default: "specificEpithet"). |
rm_original_column | Logical; if TRUE, the original columns are removed after cleaning (default: TRUE). |
Value
A data frame with cleaned and standardized family, genus, and specificEpithet columns. If rm_original_column = FALSE, original values are retained with a *Original suffix.
Examples
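library(barRoso)

# Read the raw collection records
df <- read.csv("taxa.csv")

# Standardize taxa stored in custom-named columns, keeping the original values
df_clean <- std_taxa(df,
                     colname_family = "familia",
                     colname_genus = "genero",
                     colname_specificEpithet = "especie",
                     rm_original_column = FALSE)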