std_place

Standardize Place-Related Columns in Biodiversity Data
barRoso::std_place()

Description

Cleans and standardizes the geographic columns of a biodiversity collection dataset. This includes unifying column names and harmonizing values for continent, country, stateProvince, county, municipality, and locality. The function handles translations, synonyms, upper-case anomalies, ISO country codes, and common geographic aliases.
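As a rough illustration of the ISO country code handling, the sketch below maps two-letter codes to English country names with countrycode::countrycode(). The helper name iso2_to_name is hypothetical and not part of barRoso; this page does not show how std_place() does it internally, so treat this as one possible approach that assumes the countrycode package is installed.

```r
# Hypothetical helper (not part of barRoso): convert ISO 3166-1 alpha-2
# codes to English country names using the countrycode package.
iso2_to_name <- function(codes) {
  stopifnot(requireNamespace("countrycode", quietly = TRUE))
  countrycode::countrycode(
    toupper(trimws(codes)),      # tolerate lower-case codes and stray spaces
    origin = "iso2c",
    destination = "country.name",
    warn = FALSE                 # unmatched codes come back as NA
  )
}

iso2_to_name(c("BR", "us"))
```

Codes that countrycode does not recognize are returned as NA, so a caller would likely want to fall back to the original value in that case.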

Details

This function is used internally by the barRoso package to support record reconciliation, duplicate detection, and label generation across different biodiversity databases. It ensures consistency of location fields by correcting common mistakes and variations. Country names are translated to English and harmonized using countrycode. Brazilian and U.S. state abbreviations are expanded to full names.
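The state-abbreviation step can be sketched with a simple lookup. The example below is a minimal base R illustration, not the package's actual code: expand_state is a hypothetical helper, the Brazilian table is a deliberately small, assumed subset, and the U.S. table comes from R's built-in state.abb and state.name vectors.

```r
# Hypothetical helper (not part of barRoso): expand state abbreviations
# to full names, using the country to choose the lookup table.
expand_state <- function(x, country) {
  # U.S. lookup built from base R's built-in datasets
  us <- setNames(state.name, state.abb)
  # Illustrative, incomplete subset of Brazilian state codes (assumption)
  br <- c(AL = "Alagoas", BA = "Bahia", MG = "Minas Gerais", SC = "Santa Catarina")

  key <- toupper(trimws(x))   # normalize case and surrounding whitespace
  out <- x                    # values without a match are left unchanged

  # %in% never yields NA, so these masks are safe for subassignment
  is_us <- country %in% "United States" & key %in% names(us)
  is_br <- country %in% "Brazil" & key %in% names(br)

  out[is_us] <- unname(us[key[is_us]])
  out[is_br] <- unname(br[key[is_br]])
  out
}

expand_state(c("al", "AL", "Bahia", " mg "),
             c("Brazil", "United States", "Brazil", "Brazil"))
```

Keying the lookup on the country matters because several codes are shared: AL is Alagoas in Brazil and Alabama in the U.S., and SC is Santa Catarina and South Carolina.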

Arguments

Argument                 Description
df                       A data frame containing biodiversity records.
colname_continent        Column name for continent (default: "continent").
colname_country          Column name for country (default: "country").
colname_stateProvince    Column name for state or province (default: "stateProvince").
colname_county           Column name for county (default: "county").
colname_municipality     Column name for municipality (default: "municipality").
colname_locality         Column name for locality (default: "locality").
rm_original_column       Logical; if TRUE, original columns are removed after cleaning (default: TRUE).

Value

A data frame with standardized geographic information. If rm_original_column = FALSE, the original columns are retained with *Original suffixes.

Examples

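library(barRoso)

# Input whose country and state columns use non-default names
df <- read.csv("herbarium_records.csv")
df_clean <- std_place(df,
                      colname_country = "pais",
                      colname_stateProvince = "estado",
                      rm_original_column = FALSE)  # keep the original columns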