Produce BayesTraits phylogenetic regression input files with the mean and linked samples of trait data
Source: R/phyreg_inputs.R
This function produces BayesTraits input data comprising samples
of trait data for phylogenetic regression analyses with Meade & Pagel's (2022) BayesTraits
program. It generates two text files: one with the linked samples of
trait data and one with the mean values of those samples. The function
also includes an argument to calculate the net node count for each species or
terminal taxon given a phylogenetic tree. Net nodes can serve as an explanatory
variable to explore the effect of speciation rates on the response variable
(e.g., O’Donovan et al. 2018).
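As a rough illustration of the node-count idea, the sketch below counts the internal nodes between each tip and the root of a small toy tree. It uses only base R and an edge matrix laid out like the `edge` element of an ape "phylo" object (parent in column 1, child in column 2). The tree, the tip labels, and the helper `count_nodes_to_root` are hypothetical, and whether phyreg.inputs counts the root or applies other corrections is an assumption here, not something this page states.

```r
# Toy tree ((A,B),(C,(D,E))): tips are numbered 1-5, the root is node 6.
# Same (parent, child) layout as the `edge` matrix of an ape "phylo" object.
edge <- rbind(c(6, 7), c(7, 1), c(7, 2), c(6, 8),
              c(8, 3), c(8, 9), c(9, 4), c(9, 5))
tip_labels <- c("A", "B", "C", "D", "E")

# Hypothetical helper: walk from a tip up to the root and count the
# internal nodes passed on the way (the root is counted in this sketch).
count_nodes_to_root <- function(tip, edge) {
  n <- 0L
  node <- tip
  repeat {
    parent <- edge[edge[, 2] == node, 1]
    if (length(parent) == 0) break  # no parent: we are at the root
    n <- n + 1L
    node <- parent
  }
  n
}

node_count <- vapply(seq_along(tip_labels), count_nodes_to_root,
                     integer(1), edge = edge)
names(node_count) <- tip_labels
node_count
```

Tips that sit deeper in the tree (D and E here) get a higher count than tips attached closer to the root, which is the property that makes the count usable as a proxy for the number of speciation events along a lineage.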
See the BayesTraits V4.0.0 Manual for further details on the specific format of the input files
required by BayesTraits. See also a more complete article on how to automatically
create input files with phyreg.inputs for the phylogenetic regression analyses.
Usage
phyreg.inputs(tree,
data,
tipscol = NULL,
NodeCount = FALSE,
logtransf = NULL,
sqrtransf = NULL,
traitcols = NULL,
addtraits = NULL,
ordtraits = NULL,
dir_create = "results_BayesTraits_phyreg_input",
fileDistData = "BayesTraits_linked_input.txt",
fileMeanData = "BayesTraits_mean_input.txt",
fileOrigData = NULL)
Arguments
- tree
The input tree. This can be either a Newick-formatted tree imported with ape's function read.tree or a treedata-formatted phylogeny as generated by BEAST and read with treeio's function read.beast.
- data
The input sample of trait data, as a data frame. It must include a column that links the multiple trait values to each terminal branch label of the input phylogeny (i.e., each species or accession).
- tipscol
The name of the column with the terminal (tip) labels to be included in the input files containing the mean and linked samples of trait data. Terminal labels on the phylogenetic tree must match the row labels in the data files containing the mean and linked samples of trait data.
- NodeCount
Logical; the default is FALSE. If TRUE, the function creates a new column NodeCount (net nodes) for each terminal label. The node count is reported in the mean of the samples of trait data and is replicated for every accession of each terminal label in the linked samples of trait data.
- logtransf
A vector with the name of the column(s) in the original trait data to be log10-transformed. For example, if you report the columns c("DBH", "NodeCount") to be log10-transformed, the function will generate log10DBH and log10NodeCount for the BayesTraits input data files.
- sqrtransf
A vector with the name of the column(s) in the original trait data to be log10-transformed squared. For example, if you define NodeCount to be log10-transformed squared because speciation rates may have recently slowed, the function will generate sqrNodeCount for the BayesTraits input data containing samples of traits. If sqrNodeCount is to be included in an analysis, list it in the addtraits argument (see below).
- traitcols
A vector with the name of the column(s) in the original trait data that will serve as the response or dependent variable (first column name) and the explanatory or independent variable(s) (second and possibly a third and fourth column) for the phylogenetic regression analysis.
- addtraits
A vector with the name of additional column(s) that are not originally in the input trait data but for which you want to generate inputs of linked samples of trait data and mean data. For example, if you have chosen NodeCount = TRUE and defined any new columns to be log10-transformed or log10-transformed squared, they should be listed here as additional trait data to be included in the linked sample of traits.
- ordtraits
A vector listing the order of trait names to be used in the BayesTraits analysis. This ensures that the first trait listed will serve as the response variable, and that subsequently listed traits serve as explanatory variables. Maintaining a consistent order of the names of the explanatory variables among competing nested and non-nested models will facilitate the interpretation of coefficient estimates reported for Beta1, Beta2, etc. in the BayesTraits log files.
- dir_create
Path to directory where results will be saved. The default creates a directory named results_BayesTraits_phyreg_input and the results will be saved within a subfolder of that directory named with the current date.
- fileDistData
Name of the resulting text file containing linked samples of trait data. If no name is given, the default setting creates a file named BayesTraits_linked_input.txt.
- fileMeanData
Name of the resulting text file containing mean trait data. If no name is given, the default setting creates a file named BayesTraits_mean_input.txt.
- fileOrigData
If you define any name here, the originally imported file of trait data will be saved as a CSV file, including any newly generated data columns. This new file of trait data will be written to the directory set in the argument dir_create, but outside the current-date subfolder.
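To make the relationship between the linked samples, the mean data, and the log10-transformed columns more concrete, here is a minimal base R sketch. The data frame `traits` and its values are invented for illustration. Taking the mean of the already transformed values is an assumption of this sketch; the page does not say whether phyreg.inputs transforms before or after averaging.

```r
# Hypothetical trait samples: several accessions per terminal taxon
traits <- data.frame(
  terminal = c("sp_A", "sp_A", "sp_B", "sp_B", "sp_B"),
  DBH      = c(10, 1000, 100, 100, 100),
  bio12    = c(1200, 1400, 800, 900, 1000)
)

# Add a log10-transformed column, named with a "log10" prefix as in the
# logtransf description above
traits$log10DBH <- log10(traits$DBH)

# "Linked" data keep one row per accession; "mean" data have one row per
# terminal taxon, with each trait averaged over that taxon's accessions
linked_data <- traits[, c("terminal", "log10DBH", "bio12")]
mean_data <- aggregate(cbind(log10DBH, bio12) ~ terminal,
                       data = traits, FUN = mean)
mean_data
```

Putting the response variable first in the column selection mirrors what the ordtraits argument is described as doing for the files written by the function.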
Examples
if (FALSE) {
library(InNOutBT)
## Loading an example data file
data(vatsData)
## Loading an example tree file
data(vatsTree)
phyreg.inputs(tree = vatsTree,
data = vatsData,
tipscol = "terminal",
NodeCount = TRUE,
logtransf = c("DBH", "NodeCount"),
traitcols = c("bio12", "bio15"),
addtraits = c("log10NodeCount", "log10DBH"),
ordtraits = c("log10DBH", "bio12", "bio15", "log10NodeCount"),
dir_create = "results_BayesTraits_phyreg_input",
fileDistData = "BayesTraits_linked_data_bio12_bio15_nnodes.txt",
fileMeanData = "BayesTraits_mean_data_bio12_bio15_nnodes.txt",
fileOrigData = "vataireoids_1610_25May2022_BayesTraits_netnodes_logtransf.csv")
}
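Because the terminal labels on the tree have to match the labels in the trait data, it can help to compare the two before calling phyreg.inputs. The sketch below does this with base R on invented toy objects. For an ape "phylo" object the tip labels are stored in its `tip.label` element, and the column name `terminal` is simply the one used in the example above.

```r
# Hypothetical tip labels (for a "phylo" object these would be tree$tip.label)
tip_labels <- c("sp_A", "sp_B", "sp_C")

# Hypothetical trait data with a tip-label column named "terminal"
traits <- data.frame(
  terminal = c("sp_A", "sp_A", "sp_B", "sp_D"),
  DBH      = c(12, 15, 40, 22)
)

# Tips on the tree that have no trait samples
missing_in_data <- setdiff(tip_labels, traits$terminal)

# Labels in the trait data that do not occur on the tree
missing_in_tree <- setdiff(unique(traits$terminal), tip_labels)

if (length(missing_in_data) > 0 || length(missing_in_tree) > 0) {
  message("Label mismatch: ",
          paste(c(missing_in_data, missing_in_tree), collapse = ", "))
}
```

How phyreg.inputs itself reacts to mismatched labels is not described on this page, so a check like this is only a precaution.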