barRoso::std_place()
std_place
Description
Cleans and standardizes the geographic columns of a biodiversity collection dataset. This includes unifying column names and harmonizing values for continent, country, stateProvince, county, municipality, and locality. The function handles translations, synonyms, upper-case anomalies, ISO country codes, and common geographic aliases.
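As a rough illustration of how ISO codes and translated country names can be mapped to one English name, the sketch below uses the countrycode package. The helper `harmonize_country()` is hypothetical and is not barRoso's internal code; it only shows one plausible approach, treating two- and three-letter values as ISO codes and everything else as a country name in some language.

```r
library(countrycode)

# Hypothetical helper for illustration only; not part of barRoso.
harmonize_country <- function(x) {
  x <- trimws(x)
  out <- rep(NA_character_, length(x))

  # Assumption: any two- or three-letter value is an ISO 3166 code.
  is_iso2 <- grepl("^[A-Za-z]{2}$", x)
  is_iso3 <- grepl("^[A-Za-z]{3}$", x)

  if (any(is_iso2)) {
    out[is_iso2] <- countrycode(toupper(x[is_iso2]), origin = "iso2c",
                                destination = "country.name", warn = FALSE)
  }
  if (any(is_iso3)) {
    out[is_iso3] <- countrycode(toupper(x[is_iso3]), origin = "iso3c",
                                destination = "country.name", warn = FALSE)
  }

  # Remaining values are treated as country names, possibly non-English.
  is_name <- !(is_iso2 | is_iso3)
  if (any(is_name)) {
    out[is_name] <- countryname(x[is_name], destination = "country.name.en",
                                warn = FALSE)
  }
  out
}

harmonize_country(c("Brasil", "BR", "bra", "Alemania"))
```

Values that match neither an ISO code nor a known name come back as `NA`, so they can be reviewed by hand rather than silently changed.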
Details
This function is used internally by the barRoso package to support record reconciliation, duplicate detection, and label generation across different biodiversity databases. It ensures consistency of location fields by correcting common mistakes and variations. Country names are translated to English and harmonized using countrycode. Brazilian and U.S. state abbreviations are expanded to full names.
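The abbreviation expansion described above can be sketched with a small lookup table in base R. The table `state_lookup` and the helper `expand_state()` are hypothetical and hold only a few entries; they are not barRoso's internal data. The sketch keys the lookup on country as well as abbreviation, because some codes (for example SC and AL) exist in both Brazil and the United States.

```r
# Hypothetical lookup table for illustration; not barRoso's internal data.
state_lookup <- data.frame(
  country = c("Brazil", "Brazil", "United States", "United States"),
  abbr    = c("SC", "AL", "SC", "AL"),
  name    = c("Santa Catarina", "Alagoas", "South Carolina", "Alabama"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: expand an abbreviation within its country.
expand_state <- function(country, state) {
  # Normalize case and whitespace before matching.
  key <- paste(country, toupper(trimws(state)), sep = "|")
  lookup_key <- paste(state_lookup$country, state_lookup$abbr, sep = "|")
  hit <- match(key, lookup_key)
  # Keep the original value when no abbreviation matches.
  ifelse(is.na(hit), state, state_lookup$name[hit])
}

records <- data.frame(
  country       = c("Brazil", "United States", "Brazil"),
  stateProvince = c("sc", "SC ", "Bahia"),
  stringsAsFactors = FALSE
)
records$stateProvince <- expand_state(records$country, records$stateProvince)
```

Because the lookup depends on the country value, the country column would need to be harmonized before states are expanded.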
Arguments
Argument | Description |
---|---|
df | A data frame containing biodiversity records. |
colname_continent | Column name for continent (default: "continent"). |
colname_country | Column name for country (default: "country"). |
colname_stateProvince | Column name for state or province (default: "stateProvince"). |
colname_county | Column name for county (default: "county"). |
colname_municipality | Column name for municipality (default: "municipality"). |
colname_locality | Column name for locality (default: "locality"). |
rm_original_column | Logical; if TRUE, original columns are removed after cleaning (default: TRUE). |
Value
A data frame with standardized geographic information. If rm_original_column = FALSE, the original columns are retained with *Original suffixes.
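To make the returned shape more concrete, the following base R sketch renames one user column to a standard name and keeps the raw values alongside it. The helper `unify_column()` is hypothetical, and the naming of the retained column (standard name plus "Original", for example `countryOriginal`) is an assumption about how the suffix is applied, not confirmed behavior of std_place.

```r
# Hypothetical helper for illustration only; not part of barRoso.
unify_column <- function(df, from, to, keep_original = TRUE) {
  stopifnot(from %in% names(df))
  values <- df[[from]]
  if (keep_original) {
    # Assumed naming scheme: standard name followed by "Original".
    df[[paste0(to, "Original")]] <- values
  }
  # Drop the user-named column once its values are copied.
  if (from != to) df[[from]] <- NULL
  # Store the cleaned values under the standard name.
  df[[to]] <- trimws(values)
  df
}

df <- data.frame(pais = c(" Brasil", "Peru "), stringsAsFactors = FALSE)
out <- unify_column(df, from = "pais", to = "country")
names(out)
```

Keeping the raw values next to the cleaned ones makes it possible to audit what the standardization changed.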
Examples
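# Read raw records whose geographic columns use Portuguese names
df <- read.csv("herbarium_records.csv")
# Standardize the columns and keep the original values for comparison
df_clean <- std_place(df,
                      colname_country = "pais",
                      colname_stateProvince = "estado",
                      rm_original_column = FALSE)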