Get Started
Welcome new users! Start learning how to install barRoso
Articles
“How-to” articles to help you learn how to use barRoso
barRoso
is is an R package for standardizing and reconciling plant specimen data from herbarium collections and biodiversity repositories. It was originally developed for cleaning records from GBIF but is fully compatible with other sources, including SEINet, JABOT, REFLORA, and speciesLink. The name is both a tribute to Brazilian botanist Graziela Maciel Barroso and an acronym for Biodiversity Analysis and Record Reconciliation for Organizing Specimen Observations.
The package focuses on harmonizing collector names, standardizing taxonomic and geographic fields, and identifying duplicates across distributed herbarium records. It supports Darwin Core standards, integrates seamlessly with tidyverse workflows, and enables reproducible biodiversity data pipelines.
Unlike many packages that rely on static dictionaries to clean collector names, barRoso
uses a robust set of regular expressions (regex) to dynamically identify and standardize collector patterns across datasets. This approach allows barRoso
to generalize better across sources and spelling variations, even when names are inconsistently formatted. Combined with harmonization of geographic and taxonomic fields, this makes the package especially powerful for detecting duplicates and preparing data for downstream analysis.
While many data tools prioritize aggressive cleaning, often at the cost of discarding valuable records, barRoso
takes a different approach. Its philosophy centers on standardization rather than removal. All herbarium specimens carry potential scientific value, even when incomplete or inconsistently entered. Instead of omitting such records, barRoso
focuses on harmonizing fields to enhance comparability across collections. By standardizing collector names, geographic fields, and taxonomic labels, barRoso
allows users to flag rather than erase inconsistencies, enabling more transparent workflows and tracing potential misidentifications, especially across distributed duplicates. This inclusive approach honors the archival role of herbaria while facilitating reproducible biodiversity research.
Graziela Maciel Barroso (1912–2003) was one of Brazil’s most influential botanists, with a 58-year career at the Rio de Janeiro Botanical Garden. She specialized in plant taxonomy and morphology, shaping how generations of scientists understand and classify Brazil’s flora. Her work was instrumental in developing identification keys and guides used in herbaria across Brazil and abroad.
She authored the landmark three-volume Sistemática de Angiospermas do Brasil (1978–1986), still a cornerstone of botanical education, as well as Frutos e Sementes – Morfologia Aplicada à Sistemática de Dicotiledôneas (1999). Beyond her scientific contributions, she was a committed educator and mentor, and a courageous advocate for science and democracy during Brazil’s military dictatorship.
Barroso remained professionally active until her passing at age 92. A month before her death, she was elected to the Brazilian Academy of Sciences, a distinction awarded posthumously.
Welcome new users! Start learning how to install barRoso
“How-to” articles to help you learn how to use barRoso
:::
---
format:
html:
toc: true
page-layout: custom
resources:
- figures/
---
<style>
.floating-logos {
position: fixed;
top: 60px;
right: 450px;
width: 165px;
opacity: 0.85;
z-index: 1;
}
.floating-logos img {
display: block;
width: 100%;
margin-bottom: 12px;
border-radius: 4px;
}
</style>
<div class="floating-logos">
<img src='/figures/barroso_hex_sticker.png' alt='barRoso hex sticker'>
<img src='/figures/jbrj_marca.jpg' alt='JBRJ logo'>
</div>
::: whitebox
::: {style="padding-left: 100px; padding-right: 50px; display: inline-block;"}
::: {layout-ncol="2"}
::: {style="text-align: left;"}
\
`barRoso` is is an R package for standardizing and reconciling plant specimen data from herbarium collections and biodiversity repositories. It was originally developed for cleaning records from [GBIF](https://www.gbif.org) but is fully compatible with other sources, including [SEINet](https://swbiodiversity.org/seinet), [JABOT](https://jabot.jbrj.gov.br/v3/consulta.php), [REFLORA](https://floradobrasil.jbrj.gov.br/reflora), and [speciesLink](https://specieslink.net). The name is both a tribute to Brazilian botanist [Graziela Maciel Barroso](https://www.gov.br/jbrj/pt-br/assuntos/colecoes/arquivistica/graziela-maciel-barroso) and an acronym for **Biodiversity Analysis and Record Reconciliation for Organizing Specimen Observations**.\
\
The package focuses on harmonizing collector names, standardizing taxonomic and geographic fields, and identifying duplicates across distributed herbarium records. It supports Darwin Core standards, integrates seamlessly with tidyverse workflows, and enables reproducible biodiversity data pipelines.\
\
Unlike many packages that rely on static dictionaries to clean collector names, `barRoso` uses a robust set of regular expressions (regex) to dynamically identify and standardize collector patterns across datasets. This approach allows `barRoso` to generalize better across sources and spelling variations, even when names are inconsistently formatted. Combined with harmonization of geographic and taxonomic fields, this makes the package especially powerful for detecting duplicates and preparing data for downstream analysis.\
\
While many data tools prioritize aggressive cleaning, often at the cost of discarding valuable records, `barRoso` takes a different approach. Its philosophy centers on standardization rather than removal. All herbarium specimens carry potential scientific value, even when incomplete or inconsistently entered. Instead of omitting such records, `barRoso` focuses on harmonizing fields to enhance comparability across collections. By standardizing collector names, geographic fields, and taxonomic labels, `barRoso` allows users to flag rather than erase inconsistencies, enabling more transparent workflows and tracing potential misidentifications, especially across distributed duplicates. This inclusive approach honors the archival role of herbaria while facilitating reproducible biodiversity research.\
\
\
## Graziela Maciel Barroso
Graziela Maciel Barroso (1912–2003) was one of Brazil’s most influential botanists, with a 58-year career at the Rio de Janeiro Botanical Garden. She specialized in plant taxonomy and morphology, shaping how generations of scientists understand and classify Brazil’s flora. Her work was instrumental in developing identification keys and guides used in herbaria across Brazil and abroad.
She authored the landmark three-volume *Sistemática de Angiospermas do Brasil* (1978–1986), still a cornerstone of botanical education, as well as *Frutos e Sementes – Morfologia Aplicada à Sistemática de Dicotiledôneas* (1999). Beyond her scientific contributions, she was a committed educator and mentor, and a courageous advocate for science and democracy during Brazil’s military dictatorship.
Barroso remained professionally active until her passing at age 92. A month before her death, she was elected to the [Brazilian Academy of Sciences](https://www.abc.org.br/membro/graziela-maciel-barroso/), a distinction awarded posthumously.\
\
:::
::: {style="display: flex; gap: 16px; justify-content: center; align-items: center;"}
:::
:::
:::
:::
::: mainbox
::: {style="padding-left: 100px; padding-right: 100px; display: inline-block;"}
::: {layout-ncol="2"}
::: {style="text-align: center;"}
### [Get Started](/get-started/index.qmd)
Welcome new users! Start learning how to install `barRoso`
:::
::: {style="text-align: center;"}
### [Articles](/articles/index.qmd)
"How-to" articles to help you learn how to use `barRoso`
:::
:::
:::
:::
:::