A step-by-step tutorial

Published

August 4, 2025

This online tutorial belongs to the paper “Promoting the use of phylogenetic multinomial generalised mixed-effects model to understand the evolution of discrete traits.” We are still updating the tutorial more reader-friendly - keep checking for the latest version!

Introduction

This tutorial provides a step-by-step guide to model different types of variables using MCMCglmm and brms. We show and explain how to implement 1. Gaussian, 2. binary, 3. ordinal, and 4. nominal models and interpret obtain results. We will also provide extensions: models with multiple data points per species and models with simulated data (simulation is powerful tool to understand the behaviour of the model and the effect of the prior distribution on the results!)

Why use phylogenetic comparative methods?

Phylogenetic comparative methods are used to consider the non-independence of species relatedness due to shared evolutionary history. If we ignore the phylogenetic relationship among species, we may violate the assumption of independence and underestimate the standard errors of the estimated parameters. These methods are widely used in evolutionary biology and ecology to test hypotheses about the evolution of traits and behaviours, reconstruct common ancestral traits, and investigate how different traits have co-evolved across species.

Why use MCMCglmm and brms in this tutorial?

MCMCglmm and brms allow us to make hierarchical and non-linear models. We can create more appropriate models that accurately reflect the underlying relationships within the data. Additionally, both packages can handle various types of response variables, including continuous, Poisson, binary, ordinal, and nominal data. Using Bayesian methods for parameter estimation, MCMCglmm and brms can incorporate prior distributions, which is particularly beneficial when working with limited data.

Intended audience

Our tutorial is designed for students and researchers who are familiar with using R language (at least know what is R!) and have an interest in using phylogenetic comparative methods. Although the examples in our tutorial focus on questions and data from Behavioural Ecology and Evolutionary Biology, we provide detailed explanations. Therefore, we believe these techniques are versatile and can be applied in other scientific disciplines, such as environment science, psychology, and beyond.

Setup R in your computer

Please install the following packages before running the code: MCMCglmm, brms, ape, phytools, here, coda, MASS, tidyverse, arm. If this is the first time using brms, you need to install Rstan before installing brms. Please follow the instruction on the official website.

pacman::p_load("ape", "coda", "tidyverse", "here", 
              "MCMCglmm", "brms", "MASS", 
              "phytools",  "arm", "posterior")

Note 1: some models (especially nominal models) will take a long time to run. Thus, we recommend when you want to compare the results, you can save the model output as .rds file using saveRDS() and read it by readRDS() when you want to compare the results.

Note 2: Because of the stochastic nature of MCMC, every time you (re)run a model, you will get a slightly different output even the model was well-mixing and converge. So even if you run the same model using the same computer, it would always be different to whatever our output here.

Note 3: Ideally, the effective sample size of a model should be close to 1000. However, We consider a model’s results reliable if the effective sample size (ESS) exceeds 400, as suggested by Vehtari et al. (2021). According to their recommendation, an ESS greater than 400 is typically sufficient to ensure stable estimates of variances, autocorrelations, and Monte Carlo standard errors based on their practical experience and simulations (more detailed information; Vehtari et al. 2021). Here, qe evaluate comprehensively in line with other convergence diagnostic indicators, such as R hat and autocorrelation plots.

1. Gaussian models

Gaussian models are used when the response variable is continuous (numeric) and normally distributed. For example, age, body size (e.g. weight, length, or height of an organism), temperature, and distance.

Explanation of dataset

We use the rodent dataset and phylogenetic tree provided by Sheard et al. (2024) to explain how to model Gaussian models. The data is about the rodent suborder Sciuromorpha (223 species), which includes squirrels, chipmunks, dormice, and the mountain beaver. The dataset contains several information including the tail length, body mass, mean annual temperature, and the presence of a tail contrasting tip. The tail contrasting tip is a white or black tip at the end of the tail. In this section, we test the evolutionary relationships between tail length & body mass (response variables) and mean annual temperature and the presence of tail contrasting tip (predictor variables) in rodents.

First of all, we modify the original dataset.

# 1. Read dataset
dt <- read.csv(here("data", "potential", "Rodent_tail", "RodentData.csv")) #  Replace with your own folder path to where the data is stored

str(dt) # check dataset - replace table format!!

dt <- dt %>% rename(Phylo = UphamTreeName.full) # Rename the 'UphamTreeName.full' column to 'Phylo'
dt <- dt %>% filter(Suborder == "Sciuromorpha") # Filter the rodent suborder
dt <- subset(dt, !is.na(Mass)) # Remove the species do not have body mass data

# Tail length and body mass: log-transform to reduce skewness, then standardise (centre and scale) to have a mean of 0 and a standard deviation of 1, making it easier to compare with other variables.
# Temperature, shade score and litter size: standardise
dt$zLength <- scale(log(dt$Tail_length)) 
dt$zMass <- scale(log(dt$Mass))
dt$zShade<-scale(dt$shade_score)
dt$zTemp <- scale(dt$Mean_Annual_Temp)
dt$zLitter_size <- scale(dt$Litter_size)

# Visualise zLength distribution
# hist(dt$zLength)

# Also, convert some columns to factors for analysis just in case
dt <- dt %>%
  mutate(across(c(White.tips, Black.tips, Tufts, Naked, Fluffy, Contrasting, Noc, Di, Crep, Autotomy), as.factor))

# 2. Read phylogenetic tree
trees <- read.tree(here("data", "potential", "Rodent_tail", "RodentTrees.tre")) # 100 trees
tree <- trees[[1]] # In this time, we use only one tree

#Trim out everything from the tree that's not in the modified dataset
trees <- lapply(trees, drop.tip,tip=setdiff(tree$tip.label, dt$Phylo))

# Select one tree for trimming purposes
tree <- trees[[1]]
tree <- force.ultrametric(tree, method = c("extend")) # Convert non-ultrametric tree to ultrametric tree

How to model? How to interpret the output?

Let’s start by running the model. The first model we will fit is a simple animal model with no fixed effects, and only a random effect relating phylogeny. From here on, we will always begin with this simplest model to assess the phylogenetic signal of the response variable before adding more complexity by including one continuous and one categorical explanatory variable.

Making model is basically 3-4 steps:

  1. Get phylogenetic variance-covariance (MCMCglmm) / correlation (brms) matrix

  2. Set prior based on your model and run model with default iteration, thinning, and burn-in

  3. Check mixing whether the samples are adequately exploring the parameter space and convergence, posterior distribution of random and fixed effects

  4. If model do not show good mixing or converge…

    • change iteration, thinning, and burn-in
    • change prior setting

Univariate model

Intercept-only momdel

MCMCglmm

To get the variance-covariance matrix, we use the inverseA() function in MCMCglmm:

inv_phylo <- inverseA(tree, nodes = "ALL", scale = TRUE) # Calculate the inverse of the phylogenetic relatedness matrix for all nodes, scaling the results

Then, we will define the priors for the phylogenetic effect and the residual variance using the following code:

prior1 <- list(R = list(V = 1, nu = 0.002),  # Prior for residuals: weak inverse-Wishart prior for residual variance
               G = list(G1 = list(V = 1, nu = 0.002)))  # Prior for random effect: weak inverse-Wishart for random effect

we set the prior for the residual variance to be a scaled inverse-Wishart distribution with a scale parameter of 1 and degrees of freedom of 0.002. The prior for the phylogenetic effect is set to be a scaled inverse-Wishart distribution with a scale parameter of 1 and degrees of freedom of 0.002. This prior minimizes the influence of prior information, allowing the data to have a stronger impact on the model’s results. However, if you know your dataset characteristics, you can set a stronger (informative) prior.

In MCMCglmm, we can set four elements in prior: R (R-structure for residual), G (G-structure for random effect), B (fixed effects), and S (theta_scale parameter). Here, we explain B, R, G.

Meaning of parameters in MCMCglmm prior setting
Parameter Which elements Meaning Effect of increasing the value Effect of decreasing the value
V B, R, G Variance parameter for the prior distribution Increases variability in the prior, allowing more flexibility in parameter estimation. The model becomes more influenced by the data (= wider prior distribution) Decreases variability in the prior, leading to more constrained estimates. The model becomes more influenced by the prior (= narrower prior distribution)
mu B Mean of the prior distribution for fixed effects Shifts the centre of the fixed effect estimates towards a larger value Shifts the centre of the fixed effect estimates towards a smaller value
nu R, G Degrees of freedom for the inverse Wishart distribution. Controls the “certainty” of the prior. Larger values result in smaller variance of the prior (higher certainty) The prior becomes narrower, tending to decrease the variability of the variance-covariance matrix estimates. Stronger prior information is given The prior becomes wider, tending to increase the variability of the variance-covariance matrix estimates. Weaker prior information is given. nu = 0.002 is often used
fix R, G If fix = 1, the variance is fixed, and the specified prior value (V) is used. No estimation of variance is performed NA NA
alpha.mu R, G Prior mean for the variance-covariance matrix. Specifies the central matrix for the variance-covariance matrix estimates. The variance-covariance matrix estimates tend to approach the matrix specified by alpha.mu. The variance-covariance matrix estimates tend to approach the matrix specified by alpha.mu (influences each element in the decreasing direction).
alpha.V R, G Prior variance for the variance-covariance matrix. Controls the variability of the variance-covariance matrix estimates. The variability of the variance-covariance matrix estimates tends to increase. The variability of the variance-covariance matrix estimates tends to decrease.

Note that the G element is for random effects variance (G1, G2, G3…). That means if you have 3 random effects, you need to define 3 priors in G.

In MCMCglmm, the prior setting allows the model to account for different variance structures by dividing the variance components into two distinct matrices: G (phylogeny) and R (non-phylogeny, i.e., residual variance). From this setup, we can estimate the phylogenetic heritability, also known as the phylogenetic signal (Pagel’s \(\lambda\)), using the following simple equation:

\[ H^2 = \frac{\sigma_{a}^2}{\sigma_{a}^2 + \sigma_{e}^2} \]

When we do not include any explanatory variables (= intercept-only model), the model is added ~ 1. We set a bit large nitt, thin, and burnin to improve mixing and effective sample sizes than default settings (nitt = 13000, thin = 10, burnin = 3000).

# You can check model running time by system.time() function. Knowing the running time is important when you try and error to find the best model setting!

system.time(
  mcmcglmm_mg1 <- MCMCglmm(zLength ~ 1,             # Response variable zLength with an intercept-only model
                           random = ~ Phylo,        # Random effect for Phylo
                           ginverse = list(Phylo = inv_phylo$Ainv),  # Specifies the inverse phylogenetic covariance matrix Ainv for Phylo, which accounts for phylogenetic relationships.
                           prior = prior1,          # Prior distributions for the model parameters
                           family = "gaussian",     # Family of the response variable (Gaussian for continuous data)
                           data = dt,               # Dataset used for fitting the model
                           nitt = 13000*20,       # Total number of iterations (multiplied by 20 to match scale)
                           thin = 10*20,          # Thinning rate (multiplied by 20)
                           burnin = 3000*20       # Number of burn-in iterations (multiplied by 20)
                           ) 
            )
summary(mcmcglmm_mg1)

To begin with, we should check mixing and model convergence. We can check them via trace plots and effective sample size. The plot should look like a fuzzy caterpillar to make sure your model has converged.

plot(mcmcglmm_mg1$VCV) # Visualise variance component 

plot(mcmcglmm_mg1$Sol) # Visualise location effects

autocorr.plot(mcmcglmm_mg1$VCV) # Check chain mixing

autocorr.plot(mcmcglmm_mg1$Sol) # Check chain mixing

summary(mcmcglmm_mg1) # See the result 95%HPD

 Iterations = 60001:259801
 Thinning interval  = 200
 Sample size  = 1000 

 DIC: -26.20546 

 G-structure:  ~Phylo

      post.mean l-95% CI u-95% CI eff.samp
Phylo     1.514    1.118    1.968     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units   0.02785  0.00892  0.04964     1000

 Location effects: zLength ~ 1 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)   -0.1778  -1.3683   1.0638     1000 0.822
summary(mcmcglmm_mg1$VCV) # See the result of 95% Credible Interval of the variance component

Iterations = 60001:259801
Thinning interval = 200 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

         Mean      SD Naive SE Time-series SE
Phylo 1.51394 0.22740 0.007191       0.007191
units 0.02785 0.01066 0.000337       0.000337

2. Quantiles for each variable:

         2.5%     25%     50%     75%  97.5%
Phylo 1.11828 1.34700 1.49706 1.66954 1.9651
units 0.01027 0.01996 0.02735 0.03436 0.0519
summary(mcmcglmm_mg1$Sol) # See the result of 95% Credible Interval of the location effects

Iterations = 60001:259801
Thinning interval = 200 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
      -0.17781        0.63444        0.02006        0.02006 

2. Quantiles for each variable:

   2.5%     25%     50%     75%   97.5% 
-1.4874 -0.6218 -0.1435  0.2346  0.9961 

It looks like both estimates mixed well. And the effective sample size (eff.samp) is large enough. The summary output shows the posterior mean, standard deviation, and 95% credible interval of the fixed (Location effects) and random effects (G-structure) and also residuals (R-structure).

brms

The syntax is similar to the packages lme4 and lmer in R. One of the advantages of brms is that it can fit complex models with multiple random effects and non-linear terms. The model fitting process is similar to MCMCglmm but with a different syntax. The output also differs between the two packages.

In brms, We can use the default_prior(or get_prior) function to set prior for the model. This function automatically make a weakly informative prior, which is suitable for most models (more detailed information is here[url]). Of course, you can also set your own prior if you are familiar with your datasets and want to set more informative one (see the section 4).

Things to keep in your mind are to use vcv.phylo(phylogenetic_tree, corr = TRUE) function to get the phylogenetic correlation matrix. brms vignettes use vcv.phylo(phylogenetic_tree), but it looks like incorrect. You need to specify corr = TRUE. Because the default is corr = FALSE, which means the function returns the phylogenetic variance-covariance matrix, not the correlation matrix.

We set large iter and warmup to improve mixing and effective sample sizes than default settings (iter = 2000, warmup = 1000). Also, we set control = list(adapt_delta = 0.95) to control parameter to avoid divergent transitions. Setting adapt_delta closer to 1 increases the step size during sampling to improve model stability.

By default, Stan runs four independent Markov chains (chain = 4), each exploring the posterior distribution. But four chains are rarely required; two are enough (chain = 2).

A <- ape::vcv.phylo(tree, corr = TRUE) # Get the phylogenetic correlation matrix

In brms, we can also check the Rhat value to see if the model has converged. The Rhat value should be close to 1.0.

plot(brms_mg1) # Visualise effects

summary(brms_mg1) # See the result
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: zLength ~ 1 + (1 | gr(Phylo, cov = A)) 
   Data: dt (Number of observations: 300) 
  Draws: 2 chains, each with iter = 10000; warmup = 5000; thin = 1;
         total post-warmup draws = 10000

Multilevel Hyperparameters:
~Phylo (Number of levels: 300) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.09      0.07     0.95     1.24 1.00     1156     2572

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -0.10      0.54    -1.16     0.97 1.00     2228     3739

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.22      0.02     0.18     0.26 1.00     1091     2257

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

To check the consistency between the results obtained from MCMCglmm and brms, we compare the mean of posterior distribution of the fixed and random effect estimates. You should note that the output of random effect estimates and residuals are presented in slightly different formats between MCMCglmm and brms. In MCMCglmm, the outputs (G-structure / R-structure) are provided as variance, while in brms, the outputs (Multilevel Hyperparameters / Further Distributional Parameters) are given as standard deviation. Therefore, when comparing these estimates between the two packages, you need to square the values obtained from brms.

As you can see below, the MCMCglmm and brms estimates do not exactly match. However, the values are close, and the overall trend is consistent. This suggests that the outcomes obtained from either approach remain largely the same. Thus, even with some differences in the estimates, the broader results and patterns can be interpreted similarly. *The difference comes from factors such as each package’s sampling algorithms and the prior setting.

In this model, the phylogenetic signal is quite high - over 0.95. This means that 95% of the variance in tail length is explained by the phylogenetic relationship among species. This is a strong phylogenetic signal, suggesting that closely related species tend to have similar tail lengths!

Intercept
summary(mcmcglmm_mg1$Sol) # MCMCglmm

Iterations = 60001:259801
Thinning interval = 200 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
      -0.17781        0.63444        0.02006        0.02006 

2. Quantiles for each variable:

   2.5%     25%     50%     75%   97.5% 
-1.4874 -0.6218 -0.1435  0.2346  0.9961 
draws_df <- as_draws_df(brms_mg1) # Convert brms object to data frame
summary(draws_df$b_Intercept) # brms
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-2.21282 -0.46570 -0.09844 -0.10152  0.26306  1.84947 
Random effect and phylogenetic signals (and also residuals)
# random effect
mean(mcmcglmm_mg1$VCV[, "Phylo"]) # MCMCglmm(g-structure)
[1] 1.513938
mean(draws_df$sd_Phylo__Intercept)^2 # brms(Multilevel Hyperparameters)
[1] 1.179172
# residuals
mean(mcmcglmm_mg1$VCV[, "units"]) # MCMCglmm(r-structure)
[1] 0.02785073
mean(draws_df$sigma)^2 # brms(Further Distributional Parameters)
[1] 0.04712953
# phylogenetic signals
phylo_signal_mcmcglmm <- mean(mcmcglmm_mg1$VCV[, "Phylo"]) / (mean(mcmcglmm_mg1$VCV[, "Phylo"]) + mean(mcmcglmm_mg1$VCV[, "units"]))

phylo_signal_brms <- mean(draws_df$sd_Phylo__Intercept)^2 / (mean(draws_df$sd_Phylo__Intercept)^2 + mean(draws_df$sigma)^2)

phylo_signal_mcmcglmm
[1] 0.9819361
phylo_signal_brms
[1] 0.9615677

One continuous explanatory variable model

Note

We know that we should control for the relationships between body size and tail length, but we will ignore it for now…

Then, we will add one continuous explanatory variable, zTemp, to the model. In this model, we test whether we can obtain results consistent with previous findings that species with longer tails tend to live in warmer environments.

MCMCglmm
system.time(
  mcmcglmm_mg2 <- MCMCglmm(zLength ~ zTemp, 
                           random = ~ Phylo, 
                           ginverse = list(Phylo = inv_phylo$Ainv), 
                           prior = prior1, 
                           family = "gaussian", 
                           data = dt,
                           nitt = 13000*25,
                           thin = 10*25,
                           burnin = 3000*25
                           ) 
            )
summary(mcmcglmm_mg2)

 Iterations = 75001:324751
 Thinning interval  = 250
 Sample size  = 1000 

 DIC: -5.777706 

 G-structure:  ~Phylo

      post.mean l-95% CI u-95% CI eff.samp
Phylo     1.464    1.065    1.912     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units   0.03053 0.009984  0.05334     1000

 Location effects: zLength ~ zTemp 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)  -0.17061 -1.48930  0.99052     1000 0.768
zTemp         0.03310 -0.05767  0.11829     1000 0.482
posterior_summary(mcmcglmm_mg2$VCV)
        Estimate Est.Error       Q2.5    Q97.5
Phylo 1.46421444 0.2224519 1.06723218 1.912723
units 0.03052545 0.0113458 0.01195931 0.057253
posterior_summary(mcmcglmm_mg2$Sol)
              Estimate  Est.Error       Q2.5     Q97.5
(Intercept) -0.1706142 0.63514628 -1.4145991 1.1523617
zTemp        0.0331045 0.04637504 -0.0522391 0.1252606

The fixed effects estimate for the relationship between tail length and standardized temperature is 0.0331 (posterior mean), with a 95% HPD interval of [-0.05, 0.11]. The pMCMC value is 0.482, indicating that the effect of standardised temperature is not statistically supported under this model. The intercept, representing the expected tail length when standardised temperature is zero, has a posterior mean of -0.171, with a 95% credible interval of [-1.42, 1.15], also suggesting no strong evidence for deviation from zero.

The 95% HPD (Highest Posterior Density interval) from MCMCglmm and the 95% CI ( 95% equal-tailed credible interval) from brms serve a similar purpose, but they are constructed differently and can yield different intervals, especially in asymmetric posterior distributions.

Highest Posterior Density (HPD) Interval:

The HPD interval is the narrowest interval that contains a specified proportion (e.g., 95%) of the posterior distribution. It includes the most probable values of the parameter, ensuring that every point inside the interval has a higher posterior density than any point outside.

Equal-Tailed Credible Interval (CI):

An equal-tailed credible interval includes the central 95% of the posterior distribution, leaving equal probabilities (2.5%) in each tail. This means that there’s a 2.5% chance that the parameter is below the interval and a 2.5% chance that it’s above.

https://search.r-project.org/CRAN/refmans/HDInterval/html/hdi.html

brms
default_priors2 <- default_prior(
                         zLength ~ zTemp + (1|gr(Phylo, cov = A)), 
                         data = dt, 
                         data2 = list(A = A),
                         family = gaussian()
                         )

system.time(
  brms_mg2 <- brm(zLength ~ zTemp + (1|gr(Phylo, cov = A)),
                  data = dt, 
                  data2 = list(A = A),
                  family = gaussian(), 
                  prior = default_priors2,
                  iter = 10000, 
                  warmup = 5000, 
                  thin = 1,
                  chain = 2,
                  control = list(adapt_delta = 0.95),
                  core = 2,
                  threads = threading(5)
                )
             )
summary(brms_mg2)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: zLength ~ zTemp + (1 | gr(Phylo, cov = A)) 
   Data: dt (Number of observations: 223) 
  Draws: 2 chains, each with iter = 10000; warmup = 5000; thin = 1;
         total post-warmup draws = 10000

Multilevel Hyperparameters:
~Phylo (Number of levels: 223) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.20      0.09     1.03     1.38 1.00      864     1701

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -0.15      0.60    -1.33     1.03 1.00     1900     3128
zTemp         0.04      0.05    -0.05     0.13 1.00     1961     4758

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.18      0.03     0.12     0.24 1.00      537      923

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

The posterior estimate for the effect of zTemp on zLength is 0.04, with a 95% credible interval of [-0.05, 0.13]. The uncertainty surrounding this effect suggests that the relationship between standardised temperature and tail length is weak and not statistically significant. The intercept has a posterior estimate of -0.15 with a 95% credible interval of [-1.33, 1.03], representing the average tail length when zTemp = 0, although this is also uncertain.

The results of MCMCglmm and brms are generally consistent, with a few differences. The estimates and credible intervals for the fixed effect (zTemp) are almost identical, both indicating a weak relationship. These differences mainly arise from the computational methods and output formats of the models but do not affect the conclusions.

One continuous and one categorical explanatory variable model

Finally, we add categorical explanatory variable, Contrasting, to the above model. In this model, we test whether species with contrasting tail tips have longer tails than those without contrasting tail tips, after controlling for the effect of temperature.

MCMCglmm
system.time(
  mcmcglmm_mg3 <- MCMCglmm(zLength ~ zTemp + Contrasting, 
                           random = ~ Phylo, 
                           ginverse = list(Phylo = inv_phylo$Ainv), 
                           prior = prior1, 
                           family = "gaussian", 
                           data = dt,
                           nitt=13000*30,
                           thin=10*30,
                           burnin=3000*30
                           ) 
            )
summary(mcmcglmm_mg3)

 Iterations = 90001:389701
 Thinning interval  = 300
 Sample size  = 1000 

 DIC: 87.21227 

 G-structure:  ~Phylo

      post.mean l-95% CI u-95% CI eff.samp
Phylo      1.19   0.9022      1.5     1113

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units   0.04284  0.02576  0.05855     1168

 Location effects: zLength ~ zTemp + Contrasting 

             post.mean  l-95% CI  u-95% CI eff.samp pMCMC   
(Intercept)  -0.184003 -1.362926  0.869770     1000 0.740   
zTemp         0.082215 -0.006456  0.155338     1000 0.050 . 
Contrasting1  0.163943  0.059351  0.290087     1000 0.004 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
posterior_summary(mcmcglmm_mg3$VCV)
        Estimate  Est.Error       Q2.5      Q97.5
Phylo 1.19045414 0.16451779 0.90569889 1.51171543
units 0.04283667 0.00864498 0.02680094 0.06011227
posterior_summary(mcmcglmm_mg3$Sol)
                Estimate Est.Error         Q2.5     Q97.5
(Intercept)  -0.18400256 0.5840963 -1.262384395 1.0278077
zTemp         0.08221542 0.0420212  0.001294616 0.1679503
Contrasting1  0.16394271 0.0581162  0.036145589 0.2736073

Both temperature (\(zTemp\)) and contrasting patterns appear to influence tail length. Contrasting has a stronger effect than temperature

brms
default_priors3 <- default_prior(
                         zLength ~ zTemp + Contrasting + (1|gr(Phylo, cov = A)), 
                         data = dt, 
                         data2 = list(A = A),
                         family = gaussian()
                         )

system.time(
  brms_mg3 <- brm(zLength ~ zTemp + Contrasting + (1|gr(Phylo, cov = A)),
                  data = dt, 
                  data2 = list(A = A),
                  family = gaussian(), 
                  prior = default_priors3,
                  iter = 10000, 
                  warmup = 5000, 
                  thin = 1,
                  chain = 2,
                  control = list(adapt_delta = 0.95),
                  core = 2,
                  threads = threading(5)
                )
             )
summary(brms_mg3)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: zLength ~ zTemp + Contrasting + (1 | gr(Phylo, cov = A)) 
   Data: dt (Number of observations: 223) 
  Draws: 2 chains, each with iter = 10000; warmup = 5000; thin = 1;
         total post-warmup draws = 10000

Multilevel Hyperparameters:
~Phylo (Number of levels: 223) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.23      0.09     1.06     1.42 1.01      488      945

Regression Coefficients:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept       -0.16      0.62    -1.38     1.04 1.00     1728     3132
zTemp            0.03      0.05    -0.05     0.12 1.00     1447     2230
Contrasting1     0.15      0.06     0.02     0.27 1.00     2826     4736

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.16      0.03     0.09     0.22 1.01      295      286

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

In brms, \(Contrasting\) has a positive effect on tail length. The posterior estimate is 0.15, with a 95% credible interval of [0.03, 0.27]. This suggests that species with contrasting tail tips have longer tails than those without contrasting tail tips, after controlling for the effect of temperature. The effect of temperature on tail length is similar to the previous model, with a posterior estimate of 0.03 and a 95% credible interval of [-0.05, 0.13].

Although there is some overlap in the estimates of temperature’s effect between the two models, the key difference is that MCMCglmm provides a positive effect with stronger evidence (CI not including zero), whereas brms shows more uncertainty (CI including zero). This suggests that while both models suggest a potential effect, brms does not provide as clear evidence for it being significant, and the effect size might be smaller or potentially null.

Bivariate model

Here, we provide more advanced models - bivariate models. Because the bivariate model is similar to the nominal model, it allows for examining relationships between two response variables while taking into account multiple explanatory variables. In this model, we will include two response variables, zLength and zMass, and test the relationship between these two variables and the explanatory variables, zTemp and Contrasting. This model can examine the evolutionary correlation between tail length and body mass in rodents, considering the phylogenetic relationships.

In MCMCglmm, we use cbind(traitA, traitB) to set the response variables (this allows us to estimate the covariance between the two response variables). In brms, we use mvbind(traitA, traitB) to set the response variable. We also need to specify family for both response variables (unless you use a custom prior, you need not define what distribution fit to the random effect in brms- default prior automatically set them).

The important point is, MCMCglmm does not provide the correlation estimate between the two response variables (it estimate the covariance between two response variables), while brms does. However, you need to use (1|a|gr(Phylo, cov = A)) instead of (1|gr(Phylo, cov = A)) and set set_rescor(TRUE). Without the newly added |a| and set_rescor(TRUE), brms will not calculate the correlation (a can be any letter). We need to calculate the correlation using covariance estimates in MCMCglmm. Fourtunately, the correlation can calculate easily by dividing the covariance by the product of the standard deviations of the two variables.

inv_phylo <- inverseA(tree, nodes = "ALL", scale = TRUE)
prior2 <- list(G = list(G1 = list(V = diag(2), 
                                  nu = 2, alpha.mu = rep(0, 2), 
                                  alpha.V = diag(2) * 1000)),
               R = list(V = diag(2), nu = 0.002)
              )

system.time(
        mcmcglmm_mg4 <- MCMCglmm(cbind(zLength, zMass) ~ trait - 1,
                        random = ~ us(trait):Phylo,
                       rcov = ~ us(trait):units,
                       family = c("gaussian", "gaussian"),
                       ginv = list(Phylo = inv_phylo$Ainv),
                       data = dt, 
                       prior = prior2,
                       nitt = 13000*55,
                       thin = 10*55,
                      burnin = 3000*55
                                )
        )


system.time(
        mcmcglmm_mg5 <- MCMCglmm(cbind(zLength, zMass) ~ zTemp:trait + trait - 1,
                        random = ~ us(trait):Phylo,
                        rcov = ~ us(trait):units,
                        family = c("gaussian", "gaussian"),
                        ginv = list(Phylo = inv_phylo$Ainv),
                        data = dt, 
                        prior = prior2,
                        nitt = 13000*55,
                        thin = 10*55,
                        burnin = 3000*55
                        )
        )

system.time(
        mcmcglmm_mg6 <- MCMCglmm(cbind(zLength, zMass) ~ zTemp:trait + Contrasting:trait + trait -1,
                        random = ~ us(trait):Phylo,
                        rcov = ~ us(trait):units,
                        family = c("gaussian", "gaussian"),
                        ginv = list(Phylo = inv_phylo$Ainv),
                        data = dt, 
                        prior = prior2,
                        nitt = 13000*55,
                        thin = 10*55,
                        burnin = 3000*55
                        )
        )


## brms
A <- ape::vcv.phylo(tree, corr = TRUE)

formula <- bf(mvbind(zLength, zMass) ~ 1 + 
               (1|a|gr(Phylo, cov = A)), 
              set_rescor(rescor = TRUE)
              )

default_prior <- default_prior(formula, 
                               data = dt, 
                               data2 = list(A = A),
                               family = gaussian()
                               )

system.time(
  brms_mg4 <- brm(formula = formula, 
                data = dt, 
                data2 = list(A = A),
                family = gaussian(), 
                prior = default_prior,
                iter = 25000, 
                warmup = 10000, 
                thin = 1,
                chain = 2,
                control = list(adapt_delta = 0.99),
                )
             )

formula2 <- bf(mvbind(zLength, zMass) ~ zTemp + (1|a|gr(Phylo, cov = A)), set_rescor(TRUE))

default_prior2 <- default_prior(formula2, 
                               data = dt, 
                               data2 = list(A = A),
                               family = gaussian()
                               )

system.time(
  brms_mg5_1 <- brm(formula = formula2, 
                data = dt, 
                data2 = list(A = A),
                family = gaussian(), 
                prior = default_prior,
                iter = 55000, 
                warmup = 45000, 
                thin = 1,
                chain = 2,
                control = list(adapt_delta = 0.99),
                )
             )

formula3 <- bf(mvbind(zLength, zMass) ~ zTemp + Contrasting + (1|a|gr(Phylo, cov = A)), set_rescor(TRUE))

default_prior3 <- default_prior(formula3, 
                               data = dt, 
                               data2 = list(A = A),
                               family = gaussian()
                               )

system.time(
   brms_mg6 <- brm(formula = formula3,  
                data = dt, 
                data2 = list(A = A),
                family = gaussian(), 
                prior = default_prior3,
                iter = 25000, 
                warmup = 20000, 
                thin = 1,
                chain = 2,
                control = list(adapt_delta = 0.98),
                )
              )
summary(mcmcglmm_mg4)

 Iterations = 165001:714451
 Thinning interval  = 550
 Sample size  = 1000 

 DIC: 56.02311 

 G-structure:  ~us(trait):Phylo

                                post.mean l-95% CI u-95% CI eff.samp
traitzLength:traitzLength.Phylo    1.8450   1.3355   2.4126     1000
traitzMass:traitzLength.Phylo     -0.1288  -0.5302   0.2168     1000
traitzLength:traitzMass.Phylo     -0.1288  -0.5302   0.2168     1000
traitzMass:traitzMass.Phylo        2.1016   1.5978   2.6773     1120

 R-structure:  ~us(trait):units

                                post.mean  l-95% CI u-95% CI eff.samp
traitzLength:traitzLength.units  0.024426  0.006637  0.04178     1000
traitzMass:traitzLength.units    0.005761 -0.008694  0.02082     1000
traitzLength:traitzMass.units    0.005761 -0.008694  0.02082     1000
traitzMass:traitzMass.units      0.057850  0.033641  0.08572     1000

 Location effects: cbind(zLength, zMass) ~ trait - 1 

             post.mean l-95% CI u-95% CI eff.samp pMCMC
traitzLength  -0.04079 -2.04401  1.70436     1000 0.968
traitzMass    -0.44244 -2.33966  1.55863     1000 0.688
posterior_summary(mcmcglmm_mg4$VCV)
                                    Estimate   Est.Error         Q2.5
traitzLength:traitzLength.Phylo  1.845005322 0.272081110  1.347508423
traitzMass:traitzLength.Phylo   -0.128783301 0.192333313 -0.517573376
traitzLength:traitzMass.Phylo   -0.128783301 0.192333313 -0.517573376
traitzMass:traitzMass.Phylo      2.101601613 0.288665978  1.624968122
traitzLength:traitzLength.units  0.024426281 0.009195449  0.008110895
traitzMass:traitzLength.units    0.005761157 0.007496943 -0.008150784
traitzLength:traitzMass.units    0.005761157 0.007496943 -0.008150784
traitzMass:traitzMass.units      0.057849736 0.013834738  0.036274710
                                     Q97.5
traitzLength:traitzLength.Phylo 2.47378715
traitzMass:traitzLength.Phylo   0.23065613
traitzLength:traitzMass.Phylo   0.23065613
traitzMass:traitzMass.Phylo     2.74203079
traitzLength:traitzLength.units 0.04557695
traitzMass:traitzLength.units   0.02198074
traitzLength:traitzMass.units   0.02198074
traitzMass:traitzMass.units     0.09066023
posterior_summary(mcmcglmm_mg4$Sol)
                Estimate Est.Error      Q2.5    Q97.5
traitzLength -0.04079478 0.9568184 -1.827382 2.059106
traitzMass   -0.44243620 1.0046735 -2.486688 1.513088
corr_p <- (mcmcglmm_mg4$VCV[, 2]) /sqrt((mcmcglmm_mg4$VCV[, 1] * mcmcglmm_mg4$VCV[, 4])) # calculate phylogenetic correlation between zLength and zMass
corr_nonp <-  (mcmcglmm_mg4$VCV[, 6]) /sqrt((mcmcglmm_mg4$VCV[, 5] * mcmcglmm_mg4$VCV[, 8])) 
posterior_summary(corr_p)
        Estimate  Est.Error       Q2.5     Q97.5
var1 -0.06650913 0.09770663 -0.2628186 0.1173509
posterior_summary(corr_nonp)
      Estimate Est.Error       Q2.5     Q97.5
var1 0.1527294 0.1890808 -0.2165917 0.5260998
summary(brms_mg4)
 Family: MV(gaussian, gaussian) 
  Links: mu = identity; sigma = identity
         mu = identity; sigma = identity 
Formula: zLength ~ 1 + (1 | a | gr(Phylo, cov = A)) 
         zMass ~ 1 + (1 | a | gr(Phylo, cov = A)) 
   Data: dt (Number of observations: 223) 
  Draws: 2 chains, each with iter = 35000; warmup = 25000; thin = 1;
         total post-warmup draws = 20000

Multilevel Hyperparameters:
~Phylo (Number of levels: 223) 
                                       Estimate Est.Error l-95% CI u-95% CI
sd(zLength_Intercept)                      1.22      0.09     1.05     1.40
sd(zMass_Intercept)                        1.29      0.09     1.12     1.48
cor(zLength_Intercept,zMass_Intercept)    -0.06      0.10    -0.26     0.13
                                       Rhat Bulk_ESS Tail_ESS
sd(zLength_Intercept)                  1.00     1048     2370
sd(zMass_Intercept)                    1.00     3079     6693
cor(zLength_Intercept,zMass_Intercept) 1.00     1912     4279

Regression Coefficients:
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
zLength_Intercept    -0.14      0.61    -1.35     1.05 1.00     2159     4378
zMass_Intercept      -0.42      0.63    -1.67     0.83 1.00     3442     6223

Further Distributional Parameters:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_zLength     0.17      0.03     0.11     0.23 1.00      751     1004
sigma_zMass       0.25      0.03     0.20     0.31 1.00     2092     4823

Residual Correlations: 
                      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
rescor(zLength,zMass)     0.14      0.18    -0.22     0.47 1.00     2017
                      Tail_ESS
rescor(zLength,zMass)     4659

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

As you can see the above, the estimates from both models are consistent:

the random effects are…

zLength is 1.845 (1.348, 2.474) and zMass is 2.102 (1.625, 2.742)in MCMCglmm and 1.4884 (1.1025, 1.96) and 1.664 (1.2544, 2.1904) in brms. Phylogenetic correlation between zLength and zMass are -0.067 (-0.263, 0.117) in MCMCglmmand -0.06 (-0.26, 0.13)

The phylogenetic signal is

2. Binary models

Binary models are used when the response variable is binary (0 or 1), such as presence (1) vs. absence (0), survival (1) vs. death (0), success (1) vs. failure (0), or female vs. male. The results are straightforward to interpret: the model provides the probability of a ‘success’ (coded as 1) given the predictor variables. For example, in a female/male classification, the output represents the probability of being male, as R typically treats the alphabetically first category as the reference level.

Explanation of dataset

We tested the relationships between the presence of red–orange pelage on limbs (response variable) and average social group size and activity cycle in primates (265 species). We used the dataset and tree from Macdonald et al. (2024). The dataset contains information on the presence of skin and pelage coloration data, average social group size, presence or absence of multilevel hierarchical societies, colour visiual system, and activity cycle. We aim to test whether the presence of red–orange pelage on limbs is associated with social group size and activity cycle in primates.

Note

the presence of red–orange pelage on limbs:

0 = Absent. Species does not have red–orange pelage on limbs. 1 = Present. Species has red–orange pelage on limbs.

average social group size: Continuous variable.

activity cycle: Each species was categorized as nocturnal, cathemeral or diurnal.

For run model, we need to prepare the suitable format data.

p_trees <- read.nexus(here("data", "potential", "primate", "trees100m.nex"))
primate_data <- read.csv(here("data",  "potential", "primate", "primate_data_male.csv"))
p_dat <- primate_data

p_dat <- subset(p_dat, !is.na(vs_male))     # omit NAs by VS predictor
p_dat <- subset(p_dat, !is.na(vs_female))
p_dat <- subset(p_dat, !is.na(activity_cycle)) # omit NAs by activity cycle
p_dat <- subset(p_dat, !is.na(redpeachpink_facial_skin))  # omit NAs by response variable
p_dat <- subset(p_dat, !is.na(social_group_size))  # omit NAs by response variable
p_dat <- subset(p_dat, !is.na(multilevel))  # omit NAs by response variable

# Rename the column 'social_group_size' to 'cSocial_group_size' - all continuous variables were centred and standardized by authors
p_dat <- p_dat %>% rename(cSocial_group_size = social_group_size)

p_dat <- p_dat %>%
  mutate(across(c(vs_female, vs_male, redpeachpink_facial_skin,red_genitals,red_pelage_head, red_pelage_body_limbs, red_pelage_tail), as.factor))

p_tree <- p_trees[[1]]
p_trees <- lapply(p_trees, drop.tip,tip = setdiff(p_tree$tip.label, p_dat$PhyloName)) #trim out everything from the tree that's not in the dataset
p_tree <- p_trees[[1]] # select one tree for trimming purposes
p_tree <- force.ultrametric(p_tree) # force tree to be ultrametric - all tips equidistant from root

How to impliment models and interpret the outputs?

In MCMCglmm, the binary model can take two kinds of link functions: logit and probit link functions. The logit link function (family = "categorical") is the default in MCMCglmm for the binary model, while the probit link function can be defined using family = "threshold" or family = "ordinal". We recommend using family = "threshold" as it is often more suitable for modelling under biological assumption and can provide better convergence properties in certain situations. Here, we show both models.

Univariate model

Intercept-only model

Probit model
MCMCglmm

We need to set different prior for binary models from the Gaussian mopdel. Actually, the residual variance does not need to be estimated in binary models, so we only need to set the prior for the random effect. In binary models, the response variable y is either 0 or 1, which means there are no variety residuals to measure variability around a predicted mean. For logistic regression (logit link), the residual variance is fixed at ^2/3 on the latent scale. For probit regression (probit link), the residual variance is fixed at 1 on the latent scale. In MCMCglmm, the prior settings do not differ between the probit and logit models.

inv.phylo <- inverseA(p_tree, nodes = "ALL", scale = TRUE)

prior1 <- list(R = list(V = 1, fix = 1), # fix residual variance = 1
               G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)
                       )
               )
system.time(
  mcmcglmm_BP1 <- MCMCglmm(red_pelage_body_limbs ~ 1, 
                            random = ~ PhyloName, 
                            family = "threshold", 
                            data = p_dat, 
                            prior = prior1, 
                            ginverse = list(PhyloName = inv.phylo$Ainv),
                            nitt = 13000*20, 
                            thin = 10*20, 
                            burnin = 3000*20)
                 )
summary(mcmcglmm_BP1) # 95%HPD Interval

 Iterations = 60001:259801
 Thinning interval  = 200
 Sample size  = 1000 

 DIC: 298.7917 

 G-structure:  ~PhyloName

          post.mean l-95% CI u-95% CI eff.samp
PhyloName     3.186   0.2795    7.242     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: red_pelage_body_limbs ~ 1 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)   -0.8487  -2.8595   0.8645     1093 0.296
posterior_summary(mcmcglmm_BP1$VCV) # 95%CI 
          Estimate Est.Error      Q2.5  Q97.5
PhyloName 3.185968  2.226905 0.6918403 8.6888
units     1.000000  0.000000 1.0000000 1.0000
posterior_summary(mcmcglmm_BP1$Sol)
              Estimate Est.Error      Q2.5     Q97.5
(Intercept) -0.8487221 0.8998161 -2.859676 0.8638048

As you can see here, residual variance is fixed 1 - the model did not estimate the residual variance (R-structure). The MCMC chain mixed well. The positive posterior mean of phylogenetic random effect (3.186) suggests that closely related species tend to have more red limb pelages.The fact that the 95% HPD interval / 95% equal-tailed credible interval does not include zero indicates that the effect of lineage on red pelages is statistically significant.

brms

family argument is set to bernoulli() (not binomial()). The bernoulli() family is used for binary data, while the binomial() family is used for the combination of count data, such as (c(N_sucsess, N_failure)).

A <- ape::vcv.phylo(p_tree, corr = TRUE)

priors_brms <- default_prior(red_pelage_body_limbs ~ 1 + (1 | gr(PhyloName, cov = A)), 
                             data = p_dat, 
                             data2 = list(A = A),
                             family = bernoulli(link = "probit"))

system.time(
  brms_BP1 <- brm(red_pelage_body_limbs ~ 1 + (1 | gr(PhyloName, cov = A)), 
                 data = p_dat, 
                 data2 = list(A = A),
                 family = bernoulli(link = "probit"),
                 prior = priors_brms,
                 iter = 6000, 
                 warmup = 5000, 
                 thin = 1,
                 chain = 2,
                 control = list(adapt_delta = 0.95),
  )
)
summary(brms_BP1)
 Family: bernoulli 
  Links: mu = probit 
Formula: red_pelage_body_limbs ~ 1 + (1 | gr(PhyloName, cov = A)) 
   Data: p_dat (Number of observations: 265) 
  Draws: 2 chains, each with iter = 6000; warmup = 5000; thin = 1;
         total post-warmup draws = 2000

Multilevel Hyperparameters:
~PhyloName (Number of levels: 265) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.67      0.53     0.78     2.85 1.00      393      727

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -0.72      0.81    -2.38     0.82 1.00      752     1027

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# posterior_interval(brms_BP1)

The same trend in results was observed in brms. Note that the phylogenetic random effect estimate in brms is given as the standard deviation (sd), so to compare it with MCMCglmm, we need to square the sd. There were no differences between the fixed effects in brms and MCMCglmm. Outputs of brms is usually 95% equal-tailed credible interval.

Phylogenetic signals (probit model)
# MCMCglmm
phylo_signal_mcmcglmm_BP <- ((mcmcglmm_BP1$VCV[, "PhyloName"]) / (mcmcglmm_BP1$VCV[, "PhyloName"] + 1))
phylo_signal_mcmcglmm_BP %>% mean()
[1] 0.7052307
phylo_signal_mcmcglmm_BP %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.4089277 0.7260425 0.8967865 
# brms
phylo_signal_brms_BP <- brms_BP1 %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_PhyloName__Intercept) %>% 
        mutate(lambda_probit = Sigma_phy^2 / (Sigma_phy^2 + 1)) %>% 
        pull(lambda_probit)

phylo_signal_brms_BP %>% mean()
[1] 0.6992207
phylo_signal_brms_BP %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.3795359 0.7267670 0.8903665 

As expected from the random effects estimates above, the systematic signal is estimated to be 0.71 for MCMCMCglmm and 0.70 for brms. The difference with the estimation from Gaussian model is that the residuals are fixed at 1 in the binary model. That is the same for the other discrete models.

Logit model
MCMCglmm

c2 correction

In the logit model, we need to correct the estimates obtained from MCMCglmm as MCMCglmm implements additive overdispersion GLMM fitted by MCMC sampling from the posterior distribution. This overdispersion lead to inflated variance estimates (see also Nakagawa & Schielzeth (2010)). To adjust for this, we need to rescale the estimates of both location effects and variance components by considering the residual variance adjustment (MCMCglmm course note written by Jarrod Hadfield). We can use the below code:

c2 <- (16 * sqrt(3) / (15 * pi))^2
res_1 <- model$Sol / sqrt(1+c2) # for fixed effects
res_2 <- model$VCV / (1+c2) # for variance components

The logit model is a bit difficult to converge than the probit model. So, we set a bit large nitt, thin, and burnin to improve mixing and effective sample sizes than the probit model.

inv.phylo <- inverseA(p_tree, nodes = "ALL", scale = TRUE)
prior1 <- list(R = list(V = 1, fix = 1),
               G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)
                       )
               )

system.time(
mcmcglmm_BL1 <- MCMCglmm(red_pelage_body_limbs ~ 1, 
                          random = ~ PhyloName, 
                          family = "categorical", 
                          data = p_dat, 
                          prior = prior1, 
                          ginverse = list(PhyloName = inv.phylo$Ainv),
                          nitt = 13000*60, 
                          thin = 10*60, 
                          burnin = 3000*60)
)
brms

In brms, we do not need to conduct c2 correction as brms automatically corrects the overdispersion.

A <- ape::vcv.phylo(p_tree, corr = TRUE)

priors_brms2 <- default_prior(red_pelage_body_limbs ~ 1 + (1 | gr(PhyloName, cov = A)), 
                             data = p_dat, 
                             data2 = list(A = A),
                             family = bernoulli(link = "logit"))

system.time(
  brms_BL1 <- brm(red_pelage_body_limbs ~ 1 + (1 | gr(PhyloName, cov = A)), 
                 data = p_dat, 
                 data2 = list(A = A),
                 family = bernoulli(link = "logit"),
                 prior = priors_brms2,
                 iter = 7500, 
                 warmup = 6500, 
                 thin = 1,
                 chain = 2,
                 control = list(adapt_delta = 0.95),
                 core = 2,
  )
)
# MCMCglmm
summary(mcmcglmm_BL1)

 Iterations = 180001:779401
 Thinning interval  = 600
 Sample size  = 1000 

 DIC: 294.3311 

 G-structure:  ~PhyloName

          post.mean l-95% CI u-95% CI eff.samp
PhyloName     11.64   0.7224    26.95     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: red_pelage_body_limbs ~ 1 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)    -1.665   -5.154    1.833     1000 0.316
c2 <- (16 * sqrt(3) / (15 * pi))^2
res_1 <- mcmcglmm_BL1$Sol/sqrt(1+c2)
res_2 <- mcmcglmm_BL1$VCV/(1+c2) 

posterior_summary(res_1)
             Estimate Est.Error      Q2.5    Q97.5
(Intercept) -1.435339  1.560119 -4.565329 1.502008
posterior_summary(res_2)
           Estimate Est.Error      Q2.5      Q97.5
PhyloName 8.6476147   6.01105 1.5836214 24.0221167
units     0.7430287   0.00000 0.7430287  0.7430287
#brms
summary(brms_BL1)
 Family: bernoulli 
  Links: mu = logit 
Formula: red_pelage_body_limbs ~ 1 + (1 | gr(PhyloName, cov = A)) 
   Data: p_dat (Number of observations: 265) 
  Draws: 2 chains, each with iter = 7500; warmup = 6500; thin = 1;
         total post-warmup draws = 2000

Multilevel Hyperparameters:
~PhyloName (Number of levels: 265) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     2.61      0.85     1.21     4.46 1.00      550      971

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -1.08      1.19    -3.54     1.13 1.00     1433     1398

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

The point estimates and the overlaps of 95%CI of both random and fixed effects look fine.

Phylogenetic signals (logit model)

Then, we can obtain the phylogenetic signal using

\[ H^2 = \frac{\sigma_{a}^2}{\sigma_{a}^2 + \pi^2/3} \]

# MCMCglmm
phylo_signal_mcmcglmm_BL <- ((mcmcglmm_BL1$VCV[, "PhyloName"]/(1+c2)) / (mcmcglmm_BL1$VCV[, "PhyloName"]/(1+c2)+1))
phylo_signal_mcmcglmm_BL %>% mean()
[1] 0.8541816
phylo_signal_mcmcglmm_BL %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.6129464 0.8776149 0.9600353 
# brms
phylo_signal_brms_BL <- brms_BL1 %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_PhyloName__Intercept) %>% 
        mutate(lambda_logit = (Sigma_phy^2 / (Sigma_phy^2+1))) %>% 
        pull(lambda_logit)

phylo_signal_brms_BL %>% mean()
[1] 0.8411519
phylo_signal_brms_BL %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.5941574 0.8636453 0.9520353 

Both estimated phylogenetic signals took close values.

One continuous explanatory variable model

For the next step, we examine whether the group size affect the presence or absence of red pelagic on the body limbs in primates to test the possibility that red pelage is related to social roles and behaviours within the group.

Probit model

The model using MCMCglmm is

inv.phylo <- inverseA(p_tree, nodes = "ALL", scale = TRUE)
prior1 <- list(R = list(V = 1, fix = 1),
               G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)
                       )
               )

system.time(
  mcmcglmm_BP2 <- MCMCglmm(red_pelage_body_limbs ~ cSocial_group_size, 
                            random = ~ PhyloName, 
                            family = "threshold", 
                            data = p_dat, 
                            prior = prior1, 
                            ginverse = list(PhyloName = inv.phylo$Ainv),
                            nitt = 13000*30, 
                            thin = 10*30, 
                            burnin = 3000*30)
            )

For brms

A <- ape::vcv.phylo(p_tree, corr = TRUE)

priors_brms3 <- default_prior(red_pelage_body_limbs ~ cSocial_group_size + (1 | gr(PhyloName, cov = A)), 
                             data = p_dat, 
                             data2 = list(A = A),
                             family = bernoulli(link = "probit"))

system.time(
  brms_BP2 <- brm(red_pelage_body_limbs ~ cSocial_group_size + (1 | gr(PhyloName, cov = A)), 
                 data = p_dat, 
                 data2 = list(A = A),
                 family = bernoulli(link = "probit"),
                 prior = priors_brms3,
                 iter = 10000, 
                 warmup = 8500, 
                 thin = 1,
                 chain = 2,
                 control = list(adapt_delta = 0.95),
                 core = 2,
                 thread = threading(5)
         )
      )
# MCMCglmm
summary(mcmcglmm_BP2)

 Iterations = 90001:389701
 Thinning interval  = 300
 Sample size  = 1000 

 DIC: 299.4154 

 G-structure:  ~PhyloName

          post.mean l-95% CI u-95% CI eff.samp
PhyloName     3.489   0.2165    7.618     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: red_pelage_body_limbs ~ cSocial_group_size 

                   post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)          -0.9754  -2.8047   0.8503     1979 0.258
cSocial_group_size   -0.1136  -0.3679   0.1707     1091 0.418
posterior_summary(mcmcglmm_BP2$Sol)
                     Estimate Est.Error       Q2.5     Q97.5
(Intercept)        -0.9753971 0.9209577 -2.9483573 0.7464915
cSocial_group_size -0.1135599 0.1364510 -0.4137361 0.1260902
posterior_summary(mcmcglmm_BP2$VCV)
          Estimate Est.Error      Q2.5    Q97.5
PhyloName 3.488721  2.419961 0.6210229 8.835258
units     1.000000  0.000000 1.0000000 1.000000
#brms
summary(brms_BP2)
 Family: bernoulli 
  Links: mu = probit 
Formula: red_pelage_body_limbs ~ cSocial_group_size + (1 | gr(PhyloName, cov = A)) 
   Data: p_dat (Number of observations: 265) 
  Draws: 2 chains, each with iter = 10000; warmup = 8500; thin = 1;
         total post-warmup draws = 3000

Multilevel Hyperparameters:
~PhyloName (Number of levels: 265) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.68      0.53     0.78     2.86 1.00      440      944

Regression Coefficients:
                   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept             -0.75      0.85    -2.54     0.88 1.00     1318     1299
cSocial_group_size    -0.11      0.13    -0.40     0.12 1.00     3127     1926

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Both models suggest that social group size does not significantly affect the presence or absence of red pelage on the body limbs in primates, as the credible intervals include zero and the p-values are higher than 0.05.

Logit model

The model for MCMCglmm…

inv.phylo <- inverseA(p_tree, nodes = "ALL", scale = TRUE)
prior1 <- list(R = list(V = 1, fix = 1),
               G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)
                       )
               )

system.time(
  mcmcglmm_BL2 <- MCMCglmm(red_pelage_body_limbs ~ cSocial_group_size, 
                            random = ~ PhyloName, 
                            family = "categorical", 
                            data = p_dat, 
                            prior = prior1, 
                            ginverse = list(PhyloName = inv.phylo$Ainv),
                            nitt = 13000*60, 
                            thin = 10*60, 
                            burnin = 3000*60)
            )

For brms…

A <- ape::vcv.phylo(p_tree, corr = TRUE)

priors_brms4 <- default_prior(red_pelage_body_limbs ~ cSocial_group_size + (1 | gr(PhyloName, cov = A)), 
                              data = p_dat, 
                              data2 = list(A = A),
                              family = bernoulli(link = "logit"))

system.time(
  brms_biL2 <- brm(red_pelage_body_limbs ~ cSocial_group_size + (1 | gr(PhyloName, cov = A)), 
                   data = p_dat, 
                   data2 = list(A = A),
                   family = bernoulli(link = "logit"),
                   prior = priors_brms4,
                   iter = 18000, 
                   warmup = 8000, 
                   thin = 1,
                   chain = 2,
                   control = list(adapt_delta = 0.95),
                   core = 2,
                   thread = threading(5)
                  )
            )
summary(mcmcglmm_BL2)

 Iterations = 180001:779401
 Thinning interval  = 600
 Sample size  = 1000 

 DIC: 294.8728 

 G-structure:  ~PhyloName

          post.mean l-95% CI u-95% CI eff.samp
PhyloName     11.95   0.8832    26.74      835

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: red_pelage_body_limbs ~ cSocial_group_size 

                   post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)          -1.8090  -5.3966   1.7961     1000 0.266
cSocial_group_size   -0.2622  -0.8166   0.2671     1000 0.356
c2 <- (16 * sqrt(3) / (15 * pi))^2
res_1 <- mcmcglmm_BL2$Sol/sqrt(1+c2)
res_2 <- mcmcglmm_BL2$VCV/(1+c2) 

posterior_summary(res_1)
                     Estimate Est.Error       Q2.5     Q97.5
(Intercept)        -1.5593137 1.5801956 -4.9135095 1.3437762
cSocial_group_size -0.2260184 0.2484719 -0.7398858 0.2081084
posterior_summary(res_2)
           Estimate Est.Error      Q2.5      Q97.5
PhyloName 8.8770592  5.968142 1.4813098 23.2748934
units     0.7430287  0.000000 0.7430287  0.7430287
# brms
summary(brms_BP2)
 Family: bernoulli 
  Links: mu = probit 
Formula: red_pelage_body_limbs ~ cSocial_group_size + (1 | gr(PhyloName, cov = A)) 
   Data: p_dat (Number of observations: 265) 
  Draws: 2 chains, each with iter = 10000; warmup = 8500; thin = 1;
         total post-warmup draws = 3000

Multilevel Hyperparameters:
~PhyloName (Number of levels: 265) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.68      0.53     0.78     2.86 1.00      440      944

Regression Coefficients:
                   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept             -0.75      0.85    -2.54     0.88 1.00     1318     1299
cSocial_group_size    -0.11      0.13    -0.40     0.12 1.00     3127     1926

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Estimates from both packages were similar and the logit model yielded the same results as the probit model: the size of the social group has no significant effect on this trait.

One continuous and one categorical explanatory variable model

Therefore, we hypothesised that there might be a relationship between the presence of red pelage and whether a primate species is diurnal or not. We included this as a predictor variable in our model. This is because red colouration may not be as visible in the dark, making it potentially less advantageous for nocturnal species. On the other hand, diurnal species may benefit more from the visibility of red pelage, which could play a role in social/sexual signalling or other ecological factors during the day. Note that the activity cycle variable comprised 01 data, whether diurnal or not, to simplify understanding the model outputs.

Probit model

For the model MCMCglmm the following:

p_dat$diurnal <- ifelse(p_dat$activity_cycle == "di", 1, 0)

inv.phylo <- inverseA(p_tree, nodes = "ALL", scale = TRUE)
prior1 <- list(R = list(V = 1, fix = 1),
               G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)
                       )
               )

system.time(
    mcmcglmm_BP3 <- MCMCglmm(red_pelage_body_limbs ~ cSocial_group_size + diurnal, 
                              random = ~ PhyloName, 
                              family = "threshold", 
                              data = p_dat, 
                              prior = prior1, 
                              ginverse = list(PhyloName = inv.phylo$Ainv),
                              nitt = 13000*50, 
                              thin = 10*50, 
                              burnin = 3000*50)
            )

For brms,

A <- ape::vcv.phylo(p_tree, corr = TRUE)

priors_brms5 <- default_prior(red_pelage_body_limbs ~ cSocial_group_size + diurnal + (1 | gr(PhyloName, cov = A)), 
                             data = p_dat, 
                             data2 = list(A = A),
                             family = bernoulli(link = "probit"))

system.time(
    brms_BP3 <- brm(red_pelage_body_limbs ~ cSocial_group_size + diurnal + (1 | gr(PhyloName, cov = A)), 
                     data = p_dat, 
                     data2 = list(A = A),
                     family = bernoulli(link = "probit"),
                     prior = priors_brms5,
                     iter = 5000, 
                     warmup = 3000, 
                     thin = 1,
                     chain = 2,
                     control = list(adapt_delta = 0.95),
                     core = 2,
                     thread = threading(5)
             )
      )
# MCMCglmm
summary(mcmcglmm_BP3)

 Iterations = 150001:649501
 Thinning interval  = 500
 Sample size  = 1000 

 DIC: 299.2849 

 G-structure:  ~PhyloName

          post.mean l-95% CI u-95% CI eff.samp
PhyloName     4.063    0.535    9.133    900.6

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: red_pelage_body_limbs ~ cSocial_group_size + activity_cycle 

                   post.mean  l-95% CI  u-95% CI eff.samp pMCMC
(Intercept)        -0.685030 -3.381045  1.969969     1000 0.584
cSocial_group_size -0.110831 -0.373846  0.151850     1000 0.392
activity_cycledi   -0.533313 -2.486488  1.364789     1000 0.572
activity_cyclenoct -0.007557 -1.807080  1.823124     1000 0.986
posterior_summary(mcmcglmm_BP3$VCV)
          Estimate Est.Error     Q2.5    Q97.5
PhyloName 4.062649  2.673005 0.947084 11.42444
units     1.000000  0.000000 1.000000  1.00000
posterior_summary(mcmcglmm_BP3$Sol)
                       Estimate Est.Error       Q2.5     Q97.5
(Intercept)        -0.685030297 1.3457124 -3.6844040 1.7696060
cSocial_group_size -0.110831330 0.1342969 -0.4257179 0.1235912
activity_cycledi   -0.533313064 0.9653465 -2.6461670 1.2170270
activity_cyclenoct -0.007557281 0.9589911 -1.9164287 1.7255631
# brms
summary(brms_BP3)
 Family: bernoulli 
  Links: mu = probit 
Formula: red_pelage_body_limbs ~ cSocial_group_size + activity_cycle + (1 | gr(PhyloName, cov = A)) 
   Data: p_dat (Number of observations: 265) 
  Draws: 2 chains, each with iter = 18000; warmup = 8000; thin = 1;
         total post-warmup draws = 20000

Multilevel Hyperparameters:
~PhyloName (Number of levels: 265) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.86      0.58     0.89     3.21 1.00     3870     6693

Regression Coefficients:
                   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept             -0.54      1.21    -2.96     1.87 1.00     9333    10083
cSocial_group_size    -0.11      0.13    -0.40     0.13 1.00    21096    14055
activity_cycledi      -0.50      0.93    -2.46     1.24 1.00     9758     9062
activity_cyclenoct    -0.01      0.96    -2.00     1.81 1.00     9588     9736

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

The results indicate that both models (MCMCglmm and brms) are reliable based on diagnostic metrics. Effective sample sizes are sufficient, and Rhat values are close to 1, indicating good convergence.

For random effects, the estimated variance of the phylogenetic effect is consistent between models (MCMCglmm: 3.59; brms: 1.73^2 = 2.99), with overlapping 95% credible intervals. This suggests still phylogenetic influence on red pelage presence, though with some uncertainty.

For fixed effects, neither social group size (MCMCglmm: -0.10, brms: -0.11) nor diurnal (MCMCglmm: -0.49, brms: -0.48) showed statistically significant effects, as their 95% CIs include zero. This suggests that group size and diurnal activity are unlikely to be major determinants of red pelage presence.

Logit model Finally, MCMCglmm
inv.phylo <- inverseA(p_tree, nodes = "ALL", scale = TRUE)
prior1 <- list(R = list(V = 1, fix = 1),
               G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)
                       )
               )

system.time(
    mcmcglmm_BL3 <- MCMCglmm(red_pelage_body_limbs ~ cSocial_group_size + diurnal, 
                              random = ~ PhyloName, 
                              family = "categorical", 
                              data = p_dat, 
                              prior = prior1, 
                              ginverse = list(PhyloName = inv.phylo$Ainv),
                              nitt = 13000*50, 
                              thin = 10*50, 
                              burnin = 3000*50)
           )

And brms,

A <- ape::vcv.phylo(p_tree, corr = TRUE)
priors_brms6 <- default_prior(red_pelage_body_limbs ~ cSocial_group_size + diurnal + (1 | gr(PhyloName, cov = A)), 
                              data = p_dat, 
                              data2 = list(A = A),
                              family = bernoulli(link = "logit"))

system.time(
  brms_BL3 <- brm(red_pelage_body_limbs ~ cSocial_group_size + diurnal + (1 | gr(PhyloName, cov = A)), 
                   data = p_dat, 
                   data2 = list(A = A),
                   family = bernoulli(link = "logit"),
                   prior = priors_brms6,
                   iter = 18000, 
                   warmup = 8000, 
                   thin = 1,
                   chain = 2,
                   control = list(adapt_delta = 0.95),
                   core = 2,
                   thread = threading(5)
                   )
             )
summary(mcmcglmm_BL3)

 Iterations = 240001:1039201
 Thinning interval  = 800
 Sample size  = 1000 

 DIC: 293.6985 

 G-structure:  ~PhyloName

          post.mean l-95% CI u-95% CI eff.samp
PhyloName     15.16    1.792    37.31    905.8

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: red_pelage_body_limbs ~ cSocial_group_size + activity_cycle 

                   post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)         -1.30757 -6.63033  3.76436     1000 0.594
cSocial_group_size  -0.23416 -0.76302  0.28469     1000 0.392
activity_cycledi    -1.15936 -5.18073  2.29078     1000 0.528
activity_cyclenoct  -0.09113 -3.90477  3.45304     1000 0.980
c2 <- (16 * sqrt(3) / (15 * pi))^2
res_1 <- mcmcglmm_BL3$Sol/sqrt(1+c2)
res_2 <- mcmcglmm_BL3$VCV/(1+c2) 

posterior_summary(res_1)
                      Estimate Est.Error       Q2.5     Q97.5
(Intercept)        -1.12711731 2.2299457 -5.7979353 3.1769798
cSocial_group_size -0.20184346 0.2376961 -0.7177624 0.1888044
activity_cycledi   -0.99936034 1.6233908 -4.3827738 2.1066431
activity_cyclenoct -0.07855316 1.6257651 -3.4238818 2.9358848
posterior_summary(res_2)
            Estimate Est.Error      Q2.5      Q97.5
PhyloName 11.2672019  8.147431 2.2323210 31.2476370
units      0.7430287  0.000000 0.7430287  0.7430287
summary(brms_BL3)
 Family: bernoulli 
  Links: mu = logit 
Formula: red_pelage_body_limbs ~ cSocial_group_size + activity_cycle + (1 | gr(PhyloName, cov = A)) 
   Data: p_dat (Number of observations: 265) 
  Draws: 2 chains, each with iter = 18000; warmup = 8000; thin = 1;
         total post-warmup draws = 20000

Multilevel Hyperparameters:
~PhyloName (Number of levels: 265) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     2.93      0.96     1.33     5.06 1.00     3607     5803

Regression Coefficients:
                   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept             -0.62      1.88    -4.35     3.21 1.00     9854    10536
cSocial_group_size    -0.21      0.25    -0.79     0.21 1.00    19703    12846
activity_cycledi      -0.81      1.54    -4.04     2.07 1.00     9439     8872
activity_cyclenoct    -0.07      1.56    -3.29     2.93 1.00     9517     9450

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Similar results were obtained with the logit model as with the probit model. In conclusion, the models are consistent and suggest that neither social group size nor diurnal activity significantly affect red pelage presence.

Bivariate model

ADD: explanations #### Probit link

inv.phylo <- inverseA(p_tree, nodes = "ALL", scale = TRUE)  # invert covariance matrix for use by MCMCglmm 
prior2 <- list(G = list(G1 = list(V = diag(2), 
                                  nu = 2, alpha.mu = rep(0, 2), 
                                  alpha.V = diag(2) * 10)),
               R = list(V = diag(2), fix = 1)
              )

system.time(
        mcmc_BPB1 <- MCMCglmm(cbind(red_pelage_body_limbs, red_pelage_head) ~ trait - 1,
                        random = ~ us(trait):PhyloName,
                        rcov = ~ us(trait):units,
                        family = c("threshold", "threshold"),
                        data = p_dat, 
                        prior = prior2, 
                        ginverse = list(PhyloName = inv.phylo$Ainv),
                        nitt = 13000*25, 
                        thin = 10*25, 
                        burnin = 3000*25
                                )
        )

system.time(
        mcmc_BPB2 <- MCMCglmm(cbind(red_pelage_body_limbs, red_pelage_head) ~ cSocial_group_size:trait + trait - 1,
                        random = ~ us(trait):PhyloName,
                        rcov = ~ us(trait):units,
                        family = c("threshold", "threshold"),
                        data = p_dat, 
                        prior = prior2, 
                        ginverse = list(PhyloName = inv.phylo$Ainv),
                        nitt = 13000*55, 
                        thin = 10*55, 
                        burnin = 3000*55
                                )
        )

system.time(
        mcmc_BPB3 <- MCMCglmm(cbind(red_pelage_body_limbs, red_pelage_head) ~ cSocial_group_size:trait + diurnal :trait + trait - 1,
                        random = ~ us(trait):PhyloName,
                        rcov = ~ us(trait):units,
                        family = c("threshold", "threshold"),
                        data = p_dat, 
                        prior = prior2, 
                        ginverse = list(PhyloName = inv.phylo$Ainv),
                        nitt = 13000*100, 
                        thin = 10*100, 
                        burnin = 3000*100
                                )
        )

A <- ape::vcv.phylo(p_tree, corr = TRUE)
formula_biPv1 <- bf(mvbind(red_pelage_body_limbs, red_pelage_head) ~ 1 +
                    (1|a|gr(PhyloName, cov = A))
                    )

default_prior2 <- default_prior(formula_biPv1, 
                               data = p_dat, 
                               data2 = list(A = A),
                               family = bernoulli(link = "probit")
                               )

system.time(
  brms_BPB1 <- brm(formula = formula_biPv1, 
                data = p_dat, 
                data2 = list(A = A),
                family = bernoulli(link = "probit"), 
                prior = default_prior2,
                iter = 25000, 
                warmup = 5000, 
                thin = 1,
                chain = 2,
                control = list(adapt_delta = 0.95)
                )
             )

formula_biPv2 <- bf(mvbind(red_pelage_body_limbs, red_pelage_head) ~ cSocial_group_size +
                    (1|a|gr(PhyloName, cov = A))
                    # set_rescor(TRUE)
                    )

default_prior3 <- default_prior(formula_biPv2, 
                               data = p_dat, 
                               data2 = list(A = A),
                               family = bernoulli(link = "probit")
                               )

system.time(
  brms_BPB2 <- brm(formula = formula_biPv2, 
                data = p_dat, 
                data2 = list(A = A),
                family = bernoulli(link = "probit"), 
                prior = default_prior3,
                iter = 35000, 
                warmup = 15000, 
                thin = 1,
                chain = 2,
                control = list(adapt_delta = 0.95)
                )
             )


formula_biPv3 <- bf(mvbind(red_pelage_body_limbs, red_pelage_head) ~ cSocial_group_size + diurnal +
                    (1|a|gr(PhyloName, cov = A))
                    # set_rescor(TRUE)
                    )

default_prior4 <- default_prior(formula_biPv3, 
                               data = p_dat, 
                               data2 = list(A = A),
                               family = bernoulli(link = "probit")
                               )

system.time(
  brms_BPB3 <- brm(formula = formula_biPv3, 
                data = p_dat, 
                data2 = list(A = A),
                family = bernoulli(link = "probit"), 
                prior = default_prior4,
                iter = 35000, 
                warmup = 25000, 
                thin = 1,
                chain = 2,
                control = list(adapt_delta = 0.95)
                )
             )

3. Ordinal (threshold) models (ordered multinomial model)

In the ordinal threshold model and nominal model, we will show 2 realistic examples. As we explained in the BOX2 on main text, MCMCglmm and brms take different parametrisation for the threshold model. At first sight, the outputs about thresholds (cutpoints) appear to be quite difficult. But do not worry about it,we will guide you step by step to understand and interpret these outputs. By the end of this section, you will be able to compare the threshold estimates from both models and understand their implications.

Explanation of dataset

We used the dataset from AVONET (Tobias et al. (2022)) and the phylogenetic tree from BirdTree.

For first example, we tested the relationships between migration level (response variable) and body mass and habitat density in birds of prey (136 species).

For second example, we investigated the difference of habitat density within family Phasianidae (179 species) are affected by tail length and diet (herbivore).

Explanation of variables

migration level: “1 = Sedentary. 2 = Partially migratory, i.e. minority of population migrates long distances, or most of population undergoes short-distance migration, nomadic movements, distinct altitudinal migration, etc. 3 = Migratory, i.e. majority of population undertakes long-distance migration”

habitat density: 1 = Dense habitats. Species primarily lives in the lower or middle storey of forest, or in dense thickets, dense shrubland etc. 2 = Semi-open habitats. Species primarily lives in open shrubland, scattered bushes, parkland, low dry or deciduous forest, thorn forest. 3 = Open habitats. Species primarily lives in desert, grassland, open water, low shrubs, rocky habitats, seashores, cities. Also applies to species living mainly on top of forest canopy (i.e. mostly in the open)

body mass: Average mass of species both adult males and females

tail length: The distance from the tip of the longest tail feather to the point where the two central tail feathers emerge from the body. This is usually measured by inserting a ruler between the two central tail feathers

herbivore: Species obtaining at least 70% of food resources from plant.

Datasets for example 1

dat <- read.csv(here("data", "bird body mass", "accipitridae_sampled.csv"))
dat <- dat %>% 
 mutate(across(c(Trophic.Level, Trophic.Niche, Primary.Lifestyle, Migration, Habitat, Species.Status), as.factor))

# rename: numbers to descriptive name
dat$Migration_ordered <- factor(dat$Migration,  levels = c("sedentary", "partially_migratory", "migratory"), ordered = TRUE)
dat <- dat %>%
 mutate(Habitat.Density = case_when(
    Habitat.Density == 1 ~ "dense",
    Habitat.Density == 2 ~ "semi-open",
    Habitat.Density == 3 ~ "open",
    TRUE ~ as.character(Habitat.Density)
  ))

table(dat$Habitat.Density)
dat$logMass <- log(dat$Mass)
hist(dat$logMass)
boxplot(logMass ~ Habitat.Density, data = dat)
summary(dat)
trees <- read.nexus(here("data", "bird body mass", "accipitridae_sampled.nex"))
tree <- trees[[1]]

summary(dat)

Datasets for example 2

dat <- read.csv(here("data", "bird body mass", "9993spp_clearned.csv"))
dat <- dat %>% 
  mutate(across(c(Trophic.Level, Trophic.Niche,
                  Primary.Lifestyle, Migration, Habitat, Habitat.Density , Species.Status), as.factor),
         Habitat.Density = factor(Habitat.Density, ordered = TRUE))

Family_HD <- dat %>% 
  group_by(Family, Habitat.Density) %>%
  tally() %>%
  group_by(Family) %>%
  filter(all(n >= 5)) %>%
  ungroup()

Family_HD <- dat %>%
  filter(Family %in% c("Phasianidae", "Ploceidae", "Sturnidae")) %>% 
  count(Family, Habitat.Density)

Phasianidae_dat <- dat %>% 
  filter(Family == "Phasianidae") %>% 
  mutate(Habitat.Density = case_when(
  Habitat.Density == 1 ~ "dense",
  Habitat.Density == 2 ~ "semi-open",
  Habitat.Density == 3 ~ "open"
  ),
  Habitat.Density = factor(Habitat.Density, ordered = TRUE)
  )
Phasianidae_dat <- Phasianidae_dat %>%
  mutate(Habitat_Forest = ifelse(Habitat == "Forest", 1, 0),
         Habitat_Forest = replace_na(Habitat_Forest, 0))

Phasianidae_dat$cTail_length <- scale(log(Phasianidae_dat$Tail.Length), center = TRUE, scale = FALSE)

View(Phasianidae_dat)

Phasianidae_tree <- read.nexus(here("data", "bird body mass", "Phasianidae.nex"))
tree <- Phasianidae_tree[[1]]

Example 1

Intercept-only model

MCMCglmm

When using MCMCglmm for ordinal data, you can choose between two options for the family parameter: "threshold" and "ordinal". These options differ in how they handle residual variances and error distributions.

We generally recommend using family = "threshold" for following reasons - directly corresponds to the standard probit regression with a variance of 1, no correcting of outputs is necessary, and it provides a more computationally efficient approach.

We show both models through this section;

inv_phylo <- inverseA(tree, nodes = "ALL", scale = TRUE)
prior1 <- list(R = list(V = 1, fix = 1), 
              G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)))

system.time(
  mcmcglmm_T1_1 <- MCMCglmm(Migration_ordered ~ 1,
                        random = ~ Phylo,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "threshold",
                        data = dat,
                        prior = prior1,
                        nitt = 13000*40, 
                        thin = 10*40, 
                        burnin = 3000*40 
                        )
            )

system.time(
  mcmcglmm_O1_1 <- MCMCglmm(Migration_ordered ~ 1,
                        random = ~ Phylo,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "ordinal",
                        data = dat,
                        prior = prior1,
                        nitt = 13000*40, 
                        thin = 10*40, 
                        burnin = 3000*40 
                        )
            )

For ordinal model, we need the correction;

c2 <- 1
res_1 <- model$Sol / sqrt(1+c2) # for fixed effects
res_2 <- model$VCV / (1+c2) # for variance components
summary(mcmcglmm_T1_1) # 95%HPD Interval

 Iterations = 120001:519601
 Thinning interval  = 400
 Sample size  = 1000 

 DIC: 297.9054 

 G-structure:  ~Phylo

      post.mean  l-95% CI u-95% CI eff.samp
Phylo      1.43 1.196e-06    4.766     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Migration_ordered ~ 1 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)    0.3178  -0.7651   1.5749     1000 0.464

 Cutpoints: 

                                  post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitMigration_ordered.1     1.011   0.7089    1.408     1000
posterior_summary(mcmcglmm_T1_1$VCV) # 95%CI 
      Estimate Est.Error      Q2.5   Q97.5
Phylo 1.429685  2.187129 0.0020836 6.25389
units 1.000000  0.000000 1.0000000 1.00000
posterior_summary(mcmcglmm_T1_1$Sol)
             Estimate Est.Error       Q2.5    Q97.5
(Intercept) 0.3178127 0.5848262 -0.8350972 1.513238
posterior_summary(mcmcglmm_T1_3$CP)
                                  Estimate Est.Error      Q2.5    Q97.5
cutpoint.traitMigration_ordered.1 1.116292 0.2420588 0.7754895 1.677832
summary(mcmcglmm_O1_1) # 95%HPD Interval

 Iterations = 120001:519601
 Thinning interval  = 400
 Sample size  = 1000 

 DIC: 267.7801 

 G-structure:  ~Phylo

      post.mean  l-95% CI u-95% CI eff.samp
Phylo      2.94 1.447e-05    10.78    779.5

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Migration_ordered ~ 1 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)    0.4319  -1.0821   2.2693    889.5 0.456

 Cutpoints: 

                                  post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitMigration_ordered.1      1.43   0.9309    1.959    817.1
posterior_summary(mcmcglmm_O1_1$VCV) # 95%CI 
      Estimate Est.Error        Q2.5    Q97.5
Phylo 2.940261  4.749804 0.003052149 13.92029
units 1.000000  0.000000 1.000000000  1.00000
posterior_summary(mcmcglmm_O1_1$Sol)
             Estimate Est.Error      Q2.5    Q97.5
(Intercept) 0.4319314 0.7973933 -1.170937 2.245297
c2 <- 1
res_1 <- mcmcglmm_O1_1$Sol / sqrt(1+c2) # for fixed effects
res_2 <- mcmcglmm_O1_1$VCV / (1+c2) # for variance components

res_3 <- mcmcglmm_O1_1$CP / sqrt(1+c2) # for cutpoint
posterior_summary(res_1)
             Estimate Est.Error       Q2.5    Q97.5
(Intercept) 0.3054216 0.5638422 -0.8279773 1.587665
posterior_summary(res_2)
      Estimate Est.Error        Q2.5    Q97.5
Phylo 1.470131  2.374902 0.001526075 6.960145
units 0.500000  0.000000 0.500000000 0.500000
posterior_summary(res_3)
                                  Estimate Est.Error      Q2.5    Q97.5
cutpoint.traitMigration_ordered.1 1.011203 0.1934441 0.7019355 1.462073

the ordinal models show a smaller effect size compared to the threshold model, but the difference is minor and should not be an issue. The convergences appear to be satisfactory for both models. Also, it is well-known that ordinal models in MCMCglmm tend to be more difficult to converge compared to the threshold models. Despite this, the results obtained from both models are largely consistent, showing minimal difference. As you can see here, residual variance is fixed 1 - the model did not estimate the residual variance (R-structure).

Look at the output more closely - only one cutpoint is provided although our response variable is three ordered categorical traits! As we have explained in main text, MCMCglmm implicitly assumes that the first cut point is always set to 0, making the intercept represent the centre of the distribution.

brms

The family argument is set to cumulative() with link = "probit", which uses the standard probit parametrisation where the residual variance is fixed at 1. Since this is the standard parametrisation for ordinal probit regression, no correcting of coefficients is necessary.

A <- ape::vcv.phylo(tree, corr = TRUE)

default_priors <- default_prior(
           Migration_ordered ~ 1 + (1 | gr(Phylo, cov = A)),
                    data = dat,
                    family = cumulative(link = "probit"),
                    data2 = list(A = A)
                    )

# Fit the model
system.time(
 brms_OT1_1 <- brm(
               formula = Migration_ordered ~ 1 + (1 | gr(Phylo, cov = A)),
               data = dat,
               family = cumulative(link = "probit"),
               data2 = list(A = A),
               prior = default_priors,
               iter = 19000,
               warmup =16000,
               thin = 1,
               chain = 2,
               control = list(adapt_delta = 0.99),
                )
            )
summary(brms_OT1_1)
 Family: cumulative 
  Links: mu = probit; disc = identity 
Formula: Migration_ordered ~ 1 + (1 | gr(Phylo, cov = A)) 
   Data: dat (Number of observations: 136) 
  Draws: 2 chains, each with iter = 20000; warmup = 10000; thin = 1;
         total post-warmup draws = 20000

Multilevel Hyperparameters:
~Phylo (Number of levels: 136) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.85      0.61     0.04     2.28 1.00     1594     3377

Regression Coefficients:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]    -0.32      0.47    -1.36     0.61 1.00    12133     7267
Intercept[2]     0.66      0.47    -0.24     1.75 1.00    11680     8235

Further Distributional Parameters:
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

In brms, now you can see the two intercepts and no cutpoint in the output! In fact, the intercept is described each cutpoint. However, the results (how many species fall into each category) show little difference between MCMCglmm and brms, as can be seen below.

Probabilities

We can obtain the probabilities for each category (migrate level: sedentary, partially migratory, migratory) using pnorm(). We made simple functions to calculate these and use them for the threshold models in MCMcglmm and brms.

# l = intercept
calculate_probabilities <- function(cutpoint0, cutpoint1, l) {
  
  category1_prob <- pnorm(cutpoint0 - l)
  category2_prob <- pnorm(cutpoint1 - l) - pnorm(cutpoint0 - l)
  category3_prob <- 1 - pnorm(cutpoint1 - l)
  
  return(c(sedentary = category1_prob, 
           partially_migratory = category2_prob, 
           migratory = category3_prob)
         )
  
  }
probabilities_mcmcglmm <- calculate_probabilities(0, 1.01, 0.3178)
print(probabilities_mcmcglmm)
          sedentary partially_migratory           migratory 
          0.3753183           0.3802758           0.2444059 
probabilities_brms <- calculate_probabilities(-0.32, 0.66, 0)
print(probabilities_brms)
          sedentary partially_migratory           migratory 
          0.3744842           0.3708889           0.2546269 

The results obtained in MCMCglmm are as follows: Sedentary: 0.3753183, Partially migratory: 0.3802758, and Migratory: 0.2444059. In brms, the results are Sedentary: 0.3744842, Partially migratory: 0.3708889, and Migratory: 0.2546269. As shown, although MCMCglmm and brms use parametrisations of and , respectively, the results are very similar (see Box 2 in main text).

Phylogenetic signals

# MCMCglmm
phylo_signal_mcmcglmm_T <- ((mcmcglmm_T1_1$VCV[, "Phylo"]) / (mcmcglmm_T1_1$VCV[, "Phylo"] + 1))
phylo_signal_mcmcglmm_T %>% mean()
[1] 0.4172334
phylo_signal_mcmcglmm_T %>% quantile(probs = c(0.025, 0.5, 0.975))
       2.5%         50%       97.5% 
0.002079267 0.430888403 0.862142904 
phylo_signal_mcmcglmm_O <- ((mcmcglmm_O1_1$VCV[, "Phylo"]/(1+c2)) / (mcmcglmm_O1_1$VCV[, "Phylo"]/(1+c2) + 1))
phylo_signal_mcmcglmm_O %>% mean()
[1] 0.4080353
phylo_signal_mcmcglmm_O %>% quantile(probs = c(0.025,0.5,0.975))
       2.5%         50%       97.5% 
0.001523749 0.417232582 0.874374073 
# brms
phylo_signal_brms_OT <- brms_OT1_1 %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_Phylo__Intercept) %>% 
        mutate(lambda = Sigma_phy^2 / (Sigma_phy^2 + 1)) %>% 
        pull(lambda)

phylo_signal_brms_OT %>% mean()
[1] 0.3692979
phylo_signal_brms_OT %>% quantile(probs = c(0.025,0.5,0.975))
       2.5%         50%       97.5% 
0.001264342 0.359406642 0.838180812 

In both MCMCglmm (threshold and ordinal models) and brms, the phylogenetic signal is estimated to be moderate. The moderately high phylogenetic signal suggests that closely related species tend to exhibit similar migratory behaviours, indicating that phylogeny plays a role in shaping these traits. However, this also implies that additional factors, such as environmental or ecological influences, likely contribute to the variation in migratory behavior. This balance between phylogenetic constraints and external influences may reflect evolutionary flexibility in migratory strategies among species.

One continuous explanatory variable model

The model using MCMCglmm is

inv_phylo <- inverseA(tree, nodes = "ALL", scale = TRUE)
prior1 <- list(R = list(V = 1, fix = 1), 
              G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)))
 
system.time(
   mcmcglmm_T1_2 <- MCMCglmm(Migration_ordered ~ logMass,
                    random = ~ Phylo,
                    ginverse = list(Phylo = inv_phylo$Ainv),
                    family = "threshold",
                    data = dat,
                    prior = prior1,
                    nitt = 13000*45, 
                    thin = 10*45, 
                    burnin = 3000*45 
                    )
            )

system.time(
   mcmcglmm_O1_2 <- MCMCglmm(Migration_ordered ~ logMass,
                        random = ~ Phylo,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "ordinal",
                        data = dat,
                        prior = prior1,
                        nitt = 13000*60, 
                        thin = 10*60, 
                        burnin = 3000*60 
                        )
            )

For brms

A <- ape::vcv.phylo(tree,corr = TRUE)


default_priors2 <- default_prior(
           Migration_ordered ~ logMass + (1 | gr(Phylo, cov = A)),
                    data = dat,
                    family = cumulative(link = "probit"),
                    data2 = list(A = A)
                    )

# Fit the model
system.time(
  brms_OT1_2 <- brm(
                    formula = Migration_ordered ~ logMass + (1 | gr(Phylo, cov = A)),
                    data = dat,
                    family = cumulative(link = "probit"),
                    data2 = list(A = A),
                    prior = default_priors2,
                    iter = 20000,
                    warmup = 10000,
                    thin = 1,
                    chain = 2,
                    core = 2,
                    control = list(adapt_delta = 0.99)
                    )
            )
# MCMCglmm
summary(mcmcglmm_T1_2)

 Iterations = 135001:584551
 Thinning interval  = 450
 Sample size  = 1000 

 DIC: 300.7449 

 G-structure:  ~Phylo

      post.mean  l-95% CI u-95% CI eff.samp
Phylo     2.068 6.389e-06    7.225     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Migration_ordered ~ logMass 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)   0.88928 -1.69994  3.49007     1000 0.436
logMass      -0.08033 -0.37129  0.21982     1000 0.550

 Cutpoints: 

                                  post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitMigration_ordered.1     1.052   0.6441    1.469    897.7
posterior_summary(mcmcglmm_T1_2$Sol)
               Estimate Est.Error       Q2.5     Q97.5
(Intercept)  0.88927720 1.2935980 -1.5704745 3.6731840
logMass     -0.08033062 0.1488659 -0.3928655 0.2076132
posterior_summary(mcmcglmm_T1_2$VCV)
      Estimate Est.Error       Q2.5    Q97.5
Phylo 2.068366  3.804493 0.00210611 11.97129
units 1.000000  0.000000 1.00000000  1.00000
posterior_summary(mcmcglmm_T1_3$CP)
                                  Estimate Est.Error      Q2.5    Q97.5
cutpoint.traitMigration_ordered.1 1.116292 0.2420588 0.7754895 1.677832
summary(mcmcglmm_O1_2) # 95%HPD Interval

 Iterations = 180001:779401
 Thinning interval  = 600
 Sample size  = 1000 

 DIC: 268.7424 

 G-structure:  ~Phylo

      post.mean  l-95% CI u-95% CI eff.samp
Phylo     4.133 8.171e-05    14.26     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Migration_ordered ~ logMass 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)    1.2622  -2.7953   4.3184     1000 0.432
logMass       -0.1087  -0.5494   0.3449     1000 0.574

 Cutpoints: 

                                  post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitMigration_ordered.1     1.489   0.9695    2.098     1000
posterior_summary(mcmcglmm_O1_2$VCV) # 95%CI 
      Estimate Est.Error        Q2.5    Q97.5
Phylo 4.132637   6.62053 0.004598829 19.22872
units 1.000000   0.00000 1.000000000  1.00000
posterior_summary(mcmcglmm_O1_2$Sol)
             Estimate Est.Error       Q2.5     Q97.5
(Intercept)  1.262162 1.8674403 -2.1994721 5.0542830
logMass     -0.108659 0.2176232 -0.5751705 0.3314989
c2 <- 1
res_1 <- mcmcglmm_O1_2$Sol / sqrt(1+c2) # for fixed effects
res_2 <- mcmcglmm_O1_2$VCV / (1+c2) # for variance components
res_3 <- mcmcglmm_O1_2$CP / sqrt(1+c2)
posterior_summary(res_1)
               Estimate Est.Error      Q2.5     Q97.5
(Intercept)  0.89248312 1.3204797 -1.555262 3.5739178
logMass     -0.07683353 0.1538828 -0.406707 0.2344051
posterior_summary(res_2)
      Estimate Est.Error        Q2.5    Q97.5
Phylo 2.066318  3.310265 0.002299414 9.614361
units 0.500000  0.000000 0.500000000 0.500000
posterior_summary(res_3)
                                  Estimate Est.Error      Q2.5    Q97.5
cutpoint.traitMigration_ordered.1 1.053063 0.2218614 0.7173326 1.575309
#brms
summary(brms_OT1_2)
 Family: cumulative 
  Links: mu = probit; disc = identity 
Formula: Migration_ordered ~ logMass + (1 | gr(Phylo, cov = A)) 
   Data: dat (Number of observations: 136) 
  Draws: 2 chains, each with iter = 20000; warmup = 10000; thin = 1;
         total post-warmup draws = 20000

Multilevel Hyperparameters:
~Phylo (Number of levels: 136) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.93      0.66     0.04     2.45 1.00     1192     3611

Regression Coefficients:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]    -0.85      1.07    -3.16     1.26 1.00    15424     9201
Intercept[2]     0.15      1.06    -2.03     2.33 1.00    16388    10923
logMass         -0.07      0.13    -0.34     0.19 1.00    18410    10854

Further Distributional Parameters:
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Both models (brms and MCMCglmm) indicate body mass (logMass) does not have a statistically significant effect on the migration order.

probabilities_mcmcglmm <- calculate_probabilities(0, 0.88, 1.05)
print(probabilities_mcmcglmm)
          sedentary partially_migratory           migratory 
          0.1468591           0.2856460           0.5674949 
probabilities_brms <- calculate_probabilities(-0.85, 0.15, 0)
print(probabilities_brms)
          sedentary partially_migratory           migratory 
          0.1976625           0.3619551           0.4403823 

The coefficient of logMass is negative, but the actual results show an increase in the probability of being migratory. This is likely because the effect of logMass is not statistically significant, and due to wide credible intervals, the probabilities may shift due to sampling variability. Therefore, the results should be interpreted with caution, as they may reflect uncertainty in the data rather than a true underlying effect of logMass on migratory tendencies.

One continuous and one categorical explanatory variable model

For the model MCMCglmm the following - here, we use the gelman,prior() function for prior setting for fixed effects to ensure weakly informative prior that help with model convergence. You can find further information in Chapter 4. Nominal models.

inv_phylo <- inverseA(tree, nodes = "ALL", scale = TRUE)
prior1 <- list(B = list(mu = rep(0,4),
                        V = gelman.prior(~logMass + Habitat.Density, data = dat, scale = sqrt(1+1))), 
               R = list(V = 1, fix = 1), 
               G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)))

system.time(
  mcmcglmm_T1_3 <- MCMCglmm(Migration_ordered ~ logMass + Habitat.Density,
               random = ~ Phylo,
               ginverse = list(Phylo = inv_phylo$Ainv),
               family = "threshold",
               data = dat,
               prior = prior1,
               nitt = 13000*60, 
               thin = 10*60, 
               burnin = 3000*60 
                  )
            )


system.time(
  mcmcglmm_O1_3 <- MCMCglmm(Migration_ordered ~ logMass + Habitat.Density,
                        random = ~ Phylo,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "ordinal",
                        data = dat,
                        prior = prior1,
                        nitt = 13000*150, 
                        thin = 10*150, 
                        burnin = 3000*150 
                        )
            )

For brms,

A <- ape::vcv.phylo(tree, cor = TRUE)

default_priors3 <- default_prior(
           Migration_ordered ~ logMass + Habitat.Density + (1 | gr(Phylo, cov = A)),
                    data = dat,
                    family = cumulative(link = "probit"),
                    data2 = list(A = A)
                    )

system.time(
  brm_OT1_3 <- brm(
                formula = Migration_ordered ~ logMass + Habitat.Density + (1 | gr(Phylo, cov = A)),
                data = dat,
                family = cumulative(link = "probit"),
                data2 = list(A = A),
                prior = default_priors3,
                iter = 6500,
                warmup = 5500,
                thin = 1,
                chain = 2,
                core = 2,
                control = list(adapt_delta = 0.99)
                )  
            )
# MCMCglmm
summary(mcmcglmm_T1_3)

 Iterations = 180001:779401
 Thinning interval  = 600
 Sample size  = 1000 

 DIC: 289.3043 

 G-structure:  ~Phylo

      post.mean  l-95% CI u-95% CI eff.samp
Phylo     1.988 4.136e-07    6.227    758.7

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Migration_ordered ~ logMass + Habitat.Density 

                         post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)                 1.3097  -1.1727   3.6674   1000.0  0.254    
logMass                    -0.2664  -0.5703   0.0479    891.2  0.058 .  
Habitat.Densityopen         1.3154   0.4916   2.0664   1054.1 <0.001 ***
Habitat.Densitysemi-open    0.3955  -0.3020   0.9998   1000.0  0.218    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Cutpoints: 

                                  post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitMigration_ordered.1     1.116   0.6674    1.503     1022
posterior_summary(mcmcglmm_T1_3$VCV)
      Estimate Est.Error        Q2.5   Q97.5
Phylo 1.987738  4.411821 0.009463594 9.04495
units 1.000000  0.000000 1.000000000 1.00000
posterior_summary(mcmcglmm_T1_3$Sol)
                           Estimate Est.Error       Q2.5       Q97.5
(Intercept)               1.3096841 1.2767628 -1.0290056 3.809598026
logMass                  -0.2664037 0.1660313 -0.6334123 0.007099615
Habitat.Densityopen       1.3153548 0.4385812  0.6134014 2.247115510
Habitat.Densitysemi-open  0.3954618 0.3425752 -0.2200316 1.117789403
posterior_summary(mcmcglmm_T1_3$CP)
                                  Estimate Est.Error      Q2.5    Q97.5
cutpoint.traitMigration_ordered.1 1.116292 0.2420588 0.7754895 1.677832
summary(mcmcglmm_O1_3) # 95%HPD Interval

 Iterations = 450001:1948501
 Thinning interval  = 1500
 Sample size  = 1000 

 DIC: 255.0597 

 G-structure:  ~Phylo

      post.mean  l-95% CI u-95% CI eff.samp
Phylo     4.142 1.314e-06    14.07      839

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Migration_ordered ~ logMass + Habitat.Density 

                         post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)                1.87956 -1.61645  5.74879   1000.0  0.262    
logMass                   -0.38074 -0.82066  0.06748   1112.4  0.078 .  
Habitat.Densityopen        1.85563  0.72168  3.01111    842.4 <0.001 ***
Habitat.Densitysemi-open   0.58208 -0.29301  1.48943   1000.0  0.190    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Cutpoints: 

                                  post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitMigration_ordered.1     1.579    1.064    2.204    885.1
posterior_summary(mcmcglmm_O1_3$VCV) # 95%CI 
      Estimate Est.Error       Q2.5    Q97.5
Phylo 4.142472  9.805007 0.03628788 19.59177
units 1.000000  0.000000 1.00000000  1.00000
posterior_summary(mcmcglmm_O1_3$Sol)
                           Estimate Est.Error       Q2.5     Q97.5
(Intercept)               1.8795575 1.8603927 -1.5225125 6.0417116
logMass                  -0.3807427 0.2337957 -0.8716018 0.0461751
Habitat.Densityopen       1.8556261 0.6258944  0.8289186 3.1677050
Habitat.Densitysemi-open  0.5820800 0.4612845 -0.2342549 1.5671716
c2 <- 1
res_1 <- mcmcglmm_O1_3$Sol / sqrt(1+c2) # for fixed effects
res_2 <- mcmcglmm_O1_3$VCV / (1+c2) # for variance components
res_3 <- mcmcglmm_O1_3$CP / sqrt(1+c2)
posterior_summary(res_1)
                           Estimate Est.Error       Q2.5      Q97.5
(Intercept)               1.3290479 1.3154963 -1.0765789 4.27213525
logMass                  -0.2692257 0.1653185 -0.6163155 0.03265073
Habitat.Densityopen       1.3121258 0.4425742  0.5861339 2.23990572
Habitat.Densitysemi-open  0.4115927 0.3261774 -0.1656433 1.10815765
posterior_summary(res_2)
      Estimate Est.Error       Q2.5    Q97.5
Phylo 2.071236  4.902504 0.01814394 9.795886
units 0.500000  0.000000 0.50000000 0.500000
posterior_summary(res_3)
                                  Estimate Est.Error      Q2.5  Q97.5
cutpoint.traitMigration_ordered.1 1.116852 0.2491617 0.7783082 1.6727
# brms
summary(brms_OT1_3)
 Family: cumulative 
  Links: mu = probit; disc = identity 
Formula: Migration_ordered ~ logMass + Habitat.Density + (1 | gr(Phylo, cov = A)) 
   Data: dat (Number of observations: 136) 
  Draws: 2 chains, each with iter = 6500; warmup = 5500; thin = 1;
         total post-warmup draws = 2000

Multilevel Hyperparameters:
~Phylo (Number of levels: 136) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.91      0.58     0.08     2.31 1.01      231      460

Regression Coefficients:
                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
Intercept[1]                -1.34      1.10    -3.63     0.73 1.00     1364
Intercept[2]                -0.27      1.09    -2.45     1.85 1.00     1523
logMass                     -0.26      0.14    -0.56     0.01 1.00     1300
Habitat.Densityopen          1.25      0.38     0.59     2.08 1.00      533
Habitat.DensitysemiMopen     0.37      0.31    -0.20     1.01 1.00     1236
                         Tail_ESS
Intercept[1]                 1095
Intercept[2]                 1156
logMass                      1172
Habitat.Densityopen           941
Habitat.DensitysemiMopen      926

Further Distributional Parameters:
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

The species living in open environments have significantly higher migration levels compared to those living in dense environments. No significant difference in migration levels was observed between species living in semi-open and dense environments.logMass does not have a significant impact on migration levels, suggesting that Habitat.Density is the primary factor influencing migration levels.

probabilities_mcmcglmm <- calculate_probabilities(0, 1.116, 1.3097)
print(probabilities_mcmcglmm)
          sedentary partially_migratory           migratory 
         0.09514867          0.32805672          0.57679460 
probabilities_brms <- calculate_probabilities(-1.33, -0.26, 0)
print(probabilities_brms)
          sedentary partially_migratory           migratory 
         0.09175914          0.30567275          0.60256811 

The probability of the Migratory category is the highest, around 60%. This reflects the higher migration likelihood, especially for species living in open environments, which leads to a higher probability of being migratory.

Example 2

The second example is similar to the first one, but we will use a different dataset and different explanatory variables. The response variable is Habitat.Density (1 = Dense habitats, 2 = Semi-open habitats, 3 = Open habitats) and the explanatory variables are cTail_length (continuous) and Diet (categorical).

Intercept only model

inv_phylo <- inverseA(tree, nodes = "ALL", scale = TRUE)
prior1 <- list(R = list(V = 1, fix = 1), 
              G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)))

system.time(
  mcmcglmm_T2_1 <- MCMCglmm(Habitat.Density ~ 1,
                        random = ~ Phylo,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "threshold",
                        data = dat,
                        prior = prior1,
                        nitt = 13000*100, 
                        thin = 10*10, 
                        burnin = 3000*100 
                        )
            )

system.time(
  mcmcglmm_O2_1 <- MCMCglmm(Habitat.Density ~ 1,
                        random = ~ Phylo,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "ordinal",
                        data = dat,
                        prior = prior1,
                        nitt = 13000*40, 
                        thin = 10*40, 
                        burnin = 3000*40 
                        )
            )

A <- ape::vcv.phylo(tree, corr = TRUE)

default_priors <- default_prior(
                    Habitat.Density ~ 1 + (1 | gr(Phylo, cov = A)),
                    data = dat,
                    family = cumulative(link = "probit"),
                    data2 = list(A = A)
                    )

# Fit the model
system.time(
 brms_O2_1 <- brm(
               formula = Habitat.Density ~ 1 + (1 | gr(Phylo, cov = A)),
               data = dat,
               family = cumulative(link = "probit"),
               data2 = list(A = A),
               prior = default_priors,
               iter = 19000,
               warmup =16000,
               thin = 1,
               chain = 2,
               control = list(adapt_delta = 0.99),
                )
            )
# MCMCglmm
summary(mcmcglmm_T2_1)

 Iterations = 195001:844351
 Thinning interval  = 650
 Sample size  = 1000 

 DIC: 280.497 

 G-structure:  ~Phylo

      post.mean l-95% CI u-95% CI eff.samp
Phylo     8.552    1.018    20.69    726.6

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Habitat.Density ~ 1 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)   -0.2226  -3.6231   2.5774     1000   0.9

 Cutpoints: 

                                post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitHabitat.Density.1     1.609    1.055     2.36    829.5
posterior_summary(mcmcglmm_T2_1$VCV)
      Estimate Est.Error     Q2.5    Q97.5
Phylo 8.552458   6.42742 2.217233 24.88162
units 1.000000   0.00000 1.000000  1.00000
posterior_summary(mcmcglmm_T2_1$Sol)
              Estimate Est.Error      Q2.5    Q97.5
(Intercept) -0.2226369  1.574717 -3.545592 2.680834
summary(mcmcglmm_O2_1) # 95%HPD Interval

 Iterations = 1500001:6495001
 Thinning interval  = 5000
 Sample size  = 1000 

 DIC: 244.2456 

 G-structure:  ~Phylo

      post.mean l-95% CI u-95% CI eff.samp
Phylo     15.59    2.906    36.05      844

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Habitat.Density ~ 1 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)    -0.210   -3.987    4.017     1000 0.904

 Cutpoints: 

                                post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitHabitat.Density.1     2.221    1.418    3.062     1000
posterior_summary(mcmcglmm_O2_1$VCV) # 95%CI 
      Estimate Est.Error     Q2.5    Q97.5
Phylo 15.58885  10.86135 3.928927 43.26684
units  1.00000   0.00000 1.000000  1.00000
posterior_summary(mcmcglmm_O2_1$Sol)
              Estimate Est.Error      Q2.5    Q97.5
(Intercept) -0.2100358  2.077207 -4.373171 3.795935
c2 <- 1
res_1 <- mcmcglmm_T2_1$Sol / sqrt(1+c2) # for fixed effects
res_2 <- mcmcglmm_T2_1$VCV / (1+c2) # for variance components
posterior_summary(res_1)
             Estimate Est.Error      Q2.5    Q97.5
(Intercept) -0.157428  1.113493 -2.507112 1.895636
posterior_summary(res_2)
      Estimate Est.Error     Q2.5    Q97.5
Phylo 4.276229   3.21371 1.108617 12.44081
units 0.500000   0.00000 0.500000  0.50000
# brms
summary(brms_OT2_1)
 Family: cumulative 
  Links: mu = probit; disc = identity 
Formula: Habitat.Density ~ 1 + (1 | gr(Phylo, cov = A)) 
   Data: Phasianidae_dat (Number of observations: 179) 
  Draws: 2 chains, each with iter = 6000; warmup = 4000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~Phylo (Number of levels: 179) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     2.54      0.70     1.45     4.20 1.01      542     1179

Regression Coefficients:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]    -0.21      1.09    -2.40     1.93 1.00     2080     2257
Intercept[2]     1.33      1.10    -0.75     3.58 1.00     2219     2312

Further Distributional Parameters:
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Phylogenetic signals

# MCMCglmm
phylo_signal_mcmcglmm_T2 <- ((mcmcglmm_T2_1$VCV[, "Phylo"]) / (mcmcglmm_T2_1$VCV[, "Phylo"] + 1))
phylo_signal_mcmcglmm_T2 %>% mean()
[1] 0.8578778
phylo_signal_mcmcglmm_T2 %>% quantile(probs = c(0.025, 0.5, 0.975))
     2.5%       50%     97.5% 
0.6891739 0.8711971 0.9613625 
phylo_signal_mcmcglmm_O2 <- ((mcmcglmm_O2_1$VCV[, "Phylo"]/(1+c2)) / (mcmcglmm_O2_1$VCV[, "Phylo"]/(1+c2) + 1))
phylo_signal_mcmcglmm_O2 %>% mean()
[1] 0.8513276
phylo_signal_mcmcglmm_O2 %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.6626708 0.8640197 0.9558175 
# brms
phylo_signal_brms_OT2 <- brms_OT2_1 %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_Phylo__Intercept) %>% 
        mutate(lambda = Sigma_phy^2 / (Sigma_phy^2 + 1)) %>% 
        pull(lambda)

phylo_signal_brms_OT2 %>% mean()
[1] 0.8456712
phylo_signal_brms_OT2 %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.6783932 0.8563418 0.9463123 

Summary All three models (MCMCglmm with threshold, MCMCglmm with ordinal, and brms with cumulative probit) estimate high phylogenetic signal in Habitat.Density (\(\lambda \approx 0.85\)), suggesting strong phylogenetic structuring in habitat density preferences within Phasianidae.

One continuous explanatory variable model

The model using MCMCglmm is

inv_phylo <- inverseA(tree, nodes = "ALL", scale = TRUE)
prior1 <- list(R = list(V = 1, fix = 1), 
              G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)))

system.time(
   mcmcglmm_T2_2 <- MCMCglmm(Habitat.Density ~ logMass,
                    random = ~ Phylo,
                    ginverse = list(Phylo = inv_phylo$Ainv),
                    family = "threshold",
                    data = dat,
                    prior = prior1,
                    nitt = 13000*250, 
                    thin = 10*25, 
                    burnin = 3000*250 
                    )
            )

system.time(
   mcmcglmm_O2_2 <- MCMCglmm(Habitat.Density ~ logMass,
                        random = ~ Phylo,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "ordinal",
                        data = dat,
                        prior = prior1,
                        nitt = 13000*60, 
                        thin = 10*60, 
                        burnin = 3000*60 
                        )
            )


A <- ape::vcv.phylo(tree,corr = TRUE)


default_priors2 <- default_prior(
                    Habitat.Density ~ logMass + (1 | gr(Phylo, cov = A)),
                    data = dat,
                    family = cumulative(link = "probit"),
                    data2 = list(A = A)
                    )

# Fit the model
system.time(
  brms_OT2_2 <- brm(
                    formula = Migration_ordered ~ logMass + (1 | gr(Phylo, cov = A)),
                    data = dat,
                    family = cumulative(link = "probit"),
                    data2 = list(A = A),
                    prior = default_priors2,
                    iter = 20000,
                    warmup = 10000,
                    thin = 1,
                    chain = 2,
                    core = 2,
                    control = list(adapt_delta = 0.99)
                    )
            )
# MCMCglmm
summary(mcmcglmm_T1_2)

 Iterations = 135001:584551
 Thinning interval  = 450
 Sample size  = 1000 

 DIC: 300.7449 

 G-structure:  ~Phylo

      post.mean  l-95% CI u-95% CI eff.samp
Phylo     2.068 6.389e-06    7.225     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Migration_ordered ~ logMass 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)   0.88928 -1.69994  3.49007     1000 0.436
logMass      -0.08033 -0.37129  0.21982     1000 0.550

 Cutpoints: 

                                  post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitMigration_ordered.1     1.052   0.6441    1.469    897.7
posterior_summary(mcmcglmm_T1_2$Sol)
               Estimate Est.Error       Q2.5     Q97.5
(Intercept)  0.88927720 1.2935980 -1.5704745 3.6731840
logMass     -0.08033062 0.1488659 -0.3928655 0.2076132
posterior_summary(mcmcglmm_T1_2$VCV)
      Estimate Est.Error       Q2.5    Q97.5
Phylo 2.068366  3.804493 0.00210611 11.97129
units 1.000000  0.000000 1.00000000  1.00000
summary(mcmcglmm_O1_2) # 95%HPD Interval

 Iterations = 180001:779401
 Thinning interval  = 600
 Sample size  = 1000 

 DIC: 268.7424 

 G-structure:  ~Phylo

      post.mean  l-95% CI u-95% CI eff.samp
Phylo     4.133 8.171e-05    14.26     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Migration_ordered ~ logMass 

            post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)    1.2622  -2.7953   4.3184     1000 0.432
logMass       -0.1087  -0.5494   0.3449     1000 0.574

 Cutpoints: 

                                  post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitMigration_ordered.1     1.489   0.9695    2.098     1000
posterior_summary(mcmcglmm_O1_2$VCV) # 95%CI 
      Estimate Est.Error        Q2.5    Q97.5
Phylo 4.132637   6.62053 0.004598829 19.22872
units 1.000000   0.00000 1.000000000  1.00000
posterior_summary(mcmcglmm_O1_2$Sol)
             Estimate Est.Error       Q2.5     Q97.5
(Intercept)  1.262162 1.8674403 -2.1994721 5.0542830
logMass     -0.108659 0.2176232 -0.5751705 0.3314989
c2 <- 1
res_1 <- mcmcglmm_O1_2$Sol / sqrt(1+c2) # for fixed effects
res_2 <- mcmcglmm_O1_2$VCV / (1+c2) # for variance components
posterior_summary(res_1)
               Estimate Est.Error      Q2.5     Q97.5
(Intercept)  0.89248312 1.3204797 -1.555262 3.5739178
logMass     -0.07683353 0.1538828 -0.406707 0.2344051
posterior_summary(res_2)
      Estimate Est.Error        Q2.5    Q97.5
Phylo 2.066318  3.310265 0.002299414 9.614361
units 0.500000  0.000000 0.500000000 0.500000
#brms
summary(brms_OT2_1)
 Family: cumulative 
  Links: mu = probit; disc = identity 
Formula: Habitat.Density ~ 1 + (1 | gr(Phylo, cov = A)) 
   Data: Phasianidae_dat (Number of observations: 179) 
  Draws: 2 chains, each with iter = 6000; warmup = 4000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~Phylo (Number of levels: 179) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     2.54      0.70     1.45     4.20 1.01      542     1179

Regression Coefficients:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]    -0.21      1.09    -2.40     1.93 1.00     2080     2257
Intercept[2]     1.33      1.10    -0.75     3.58 1.00     2219     2312

Further Distributional Parameters:
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

summary

  • No significant effect of body mass on habitat density: In all models, the posterior estimates of the slope for logMass were close to zero, with wide credible intervals that include zero. This suggests no strong evidence that body size influences habitat density preference in Phasianidae.

  • Moderate phylogenetic signal: Phylogenetic variance was estimated as non-negligible across models (e.g., \(\lambda \approx 0.67–0.80\) after transformation), indicating moderate phylogenetic structuring in habitat density even after accounting for body mass.

One continuous and one categorical explanatory variable model

inv_phylo <- inverseA(tree, nodes = "ALL", scale = TRUE)
prior1 <- list(
               R = list(V = 1, fix = 1), 
               G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)))

system.time(
  mcmcglmm_T2_3 <- MCMCglmm(Habitat.Density ~ logMass + Habitat.Density,
               random = ~ Phylo,
               ginverse = list(Phylo = inv_phylo$Ainv),
               family = "threshold",
               data = dat,
               prior = prior1,
               nitt = 13000*150, 
               thin = 10*150, 
               burnin = 3000*150 
                  )
            )

system.time(
  mcmcglmm_O2_3 <- MCMCglmm(Habitat.Density ~ logMass + Habitat.Density,
                        random = ~ Phylo,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "ordinal",
                        data = dat,
                        prior = prior1,
                        nitt = 13000*150, 
                        thin = 10*150, 
                        burnin = 3000*150 
                        )
            )


A <- ape::vcv.phylo(tree, cor = TRUE)

default_priors3 <- default_prior(
                    Habitat.Density ~ logMass + Habitat.Density + (1 | gr(Phylo, cov = A)),
                    data = dat,
                    family = cumulative(link = "probit"),
                    data2 = list(A = A)
                    )

system.time(
  brm_OT2_3 <- brm(
                formula = Habitat.Density ~ logMass + Habitat.Density + (1 | gr(Phylo, cov = A)),
                data = dat,
                family = cumulative(link = "probit"),
                data2 = list(A = A),
                prior = default_priors3,
                iter = 20000,
                warmup = 10000,
                thin = 1,
                chain = 2,
                core = 2,
                control = list(adapt_delta = 0.99)
                )  
            )
# MCMCglmm
summary(mcmcglmm_T2_3)

 Iterations = 1800001:7794001
 Thinning interval  = 6000
 Sample size  = 1000 

 DIC: 264.3581 

 G-structure:  ~Phylo

      post.mean l-95% CI u-95% CI eff.samp
Phylo     13.42    1.332    33.11    869.4

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Habitat.Density ~ cTail_length + IsHarbivore 

             post.mean l-95% CI u-95% CI eff.samp pMCMC  
(Intercept)    -1.2636  -5.2133   2.9340     1000 0.498  
cTail_length    0.5337  -0.4854   1.6519     1000 0.302  
IsHarbivore     1.0925   0.1875   2.1084     1100 0.012 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Cutpoints: 

                                post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitHabitat.Density.1     1.838    1.142    2.773     1000
posterior_summary(mcmcglmm_T1_3$VCV)
      Estimate Est.Error        Q2.5   Q97.5
Phylo 1.987738  4.411821 0.009463594 9.04495
units 1.000000  0.000000 1.000000000 1.00000
posterior_summary(mcmcglmm_T1_3$Sol)
                           Estimate Est.Error       Q2.5       Q97.5
(Intercept)               1.3096841 1.2767628 -1.0290056 3.809598026
logMass                  -0.2664037 0.1660313 -0.6334123 0.007099615
Habitat.Densityopen       1.3153548 0.4385812  0.6134014 2.247115510
Habitat.Densitysemi-open  0.3954618 0.3425752 -0.2200316 1.117789403
summary(mcmcglmm_O2_3) # 95%HPD Interval

 Iterations = 1800001:7794001
 Thinning interval  = 6000
 Sample size  = 1000 

 DIC: 223.0502 

 G-structure:  ~Phylo

      post.mean l-95% CI u-95% CI eff.samp
Phylo     24.41    2.961    64.89     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: Habitat.Density ~ cTail_length + IsHarbivore 

             post.mean l-95% CI u-95% CI eff.samp pMCMC  
(Intercept)    -1.6861  -8.1646   3.3928   1000.0 0.520  
cTail_length    0.7124  -0.6599   2.2055    987.7 0.266  
IsHarbivore     1.5078   0.1850   2.8961    986.1 0.010 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Cutpoints: 

                                post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitHabitat.Density.1     2.536     1.54    3.812    855.9
posterior_summary(mcmcglmm_O2_3$VCV) # 95%CI 
      Estimate Est.Error     Q2.5    Q97.5
Phylo 24.40718   24.4568 5.150294 82.38519
units  1.00000    0.0000 1.000000  1.00000
posterior_summary(mcmcglmm_O2_3$Sol)
               Estimate Est.Error       Q2.5    Q97.5
(Intercept)  -1.6861096 2.8536673 -8.1735105 3.363553
cTail_length  0.7124418 0.7462599 -0.5101511 2.467996
IsHarbivore   1.5078371 0.7204467  0.2713007 3.103070
c2 <- 1
res_1 <- mcmcglmm_O2_3$Sol / sqrt(1+c2) # for fixed effects
res_2 <- mcmcglmm_O2_3$VCV / (1+c2) # for variance components
posterior_summary(res_1)
               Estimate Est.Error       Q2.5    Q97.5
(Intercept)  -1.1922596 2.0178475 -5.7795447 2.378391
cTail_length  0.5037724 0.5276855 -0.3607313 1.745137
IsHarbivore   1.0662019 0.5094327  0.1918386 2.194202
posterior_summary(res_2)
      Estimate Est.Error     Q2.5   Q97.5
Phylo 12.20359   12.2284 2.575147 41.1926
units  0.50000    0.0000 0.500000  0.5000
# brms
summary(brms_OT2_3)
 Family: cumulative 
  Links: mu = probit; disc = identity 
Formula: Habitat.Density ~ cTail_length + IsHarbivore + (1 | gr(Phylo, cov = A)) 
   Data: Phasianidae_dat (Number of observations: 179) 
  Draws: 2 chains, each with iter = 8000; warmup = 6000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~Phylo (Number of levels: 179) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     2.90      0.92     1.57     5.05 1.00      428      833

Regression Coefficients:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]     0.56      1.24    -1.96     3.00 1.00     1655     2098
Intercept[2]     2.25      1.28    -0.18     4.85 1.00     1532     1706
cTail_length     0.46      0.48    -0.39     1.50 1.00     1244     1370
IsHarbivore      0.98      0.45     0.18     1.96 1.00     1524     1594

Further Distributional Parameters:
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Summary

  • Diet is positively associated with habitat: The posterior estimates of the slope for Diet were positive and significant across all models, indicating that species with a herbivorous diet tend to occupy open habitats compared to those with other diets.

  • No significant effect of body mass on habitat density: The posterior estimates of the slope for logMass were close to zero, with wide credible intervals that include zero, suggesting no strong evidence that body size influences habitat density preference in Phasianidae.

probabilities

This probabilities section uses the posterior summaries of fixed effects (linear predictors) and cut points from each of the three models (intercept-only, one predictor, two predictors) to calculate the predicted probabilities of each category in Habitat.Density.

calculate_probabilities2 <- function(cutpoint0, cutpoint1, l) {
  
  category1_prob <- pnorm(cutpoint0 - l)
  category2_prob <- pnorm(cutpoint1 - l) - pnorm(cutpoint0 - l)
  category3_prob <- 1 - pnorm(cutpoint1 - l)

  return(c(Dense_habitats = category1_prob,
           Semi_open_habitats = category2_prob,
           Open_habitats = category3_prob)
         )
  
  }
probabilities_mcmcglmm <- calculate_probabilities2(0, 1.01, 0.3178)
print(probabilities_mcmcglmm)
    Dense_habitats Semi_open_habitats      Open_habitats 
         0.3753183          0.3802758          0.2444059 
probabilities_brms <- calculate_probabilities2(-0.32, 0.66, 0)
print(probabilities_brms)
    Dense_habitats Semi_open_habitats      Open_habitats 
         0.3744842          0.3708889          0.2546269 
probabilities_mcmcglmm <- calculate_probabilities2(0, 1.052, 0.88928)
print(probabilities_mcmcglmm)
    Dense_habitats Semi_open_habitats      Open_habitats 
         0.1869263          0.3777042          0.4353694 
probabilities_brms <- calculate_probabilities2(-0.85, 0.15, 0)
print(probabilities_brms)
    Dense_habitats Semi_open_habitats      Open_habitats 
         0.1976625          0.3619551          0.4403823 
probabilities_mcmcglmm <- calculate_probabilities2(0, 1.116, 1.3097)
print(probabilities_mcmcglmm)
    Dense_habitats Semi_open_habitats      Open_habitats 
        0.09514867         0.32805672         0.57679460 
probabilities_brms <- calculate_probabilities2(-1.33, -0.26, 0)
print(probabilities_brms)
    Dense_habitats Semi_open_habitats      Open_habitats 
        0.09175914         0.30567275         0.60256811 

4. Nominal models

When the response variable has more than two categories and un-ordered, it is treated as a nominal variable.

Prior setting

Sometimes, model convergence can be challenging, particularly in discrete models. In such cases, it is time to take a step forward - changing the default prior to a custom prior (i.e. appropriately informative prior, not over-informed prior) can improve performance and convergence.

Fixed Effects

For fixed effects, custom prior adjustments are typically not necessary. Because data usually provides enough information to estimate fixed effects accurately, so the prior does not have as much effect. Fixed effect priors can be a good thing if you know some biological limits (Gaussian models - e.g. bird morphology - some values are not realistic). Also, in MCMCglmm, the gelman.prior function can be particularly useful for logit and probit models. This function allows users to define a reasonable prior covariance matrix for the fixed effects, enhancing model stability and convergence. Specifically…

  • For logit regression, where the variance of the error distribution (with mean = 0) is pi^2 / 3, the prior accounts for this variance to remain flat on the probability scale.
  • For probit regression, where the variance of the error distribution is 1, the prior is adjusted to achieve a similarly flat distribution.

By incorporating these considerations, gelman.prior avoids excessive regularisation and ensures that the prior reflects the data while remaining weakly informative, as Gelman et al. (2008) recommended.

Random effects

Convergence issues often arise with random effects because variance components are inherently more challenging to estimate than fixed effects. This is because variance components represent variability around the mean, which requires a larger sample size to be estimated precisely. In contrast, fixed effects estimate the mean (e.g. the overall mean in an intercept-only model), which is more straightforward and usually requires a smaller sample size to stabilise. In such cases, using more informative priors can help, but careful consideration is necessary to balance convergence improvement and interpret model output.

Key points

Check autocorrelation: An auto-correlation plot helps assess the mixing of the chain. It shows the relationship between current and past samples in the MCMC chain. High auto-correlation (i.e., slow decay) indicates poor mixing because the chain is highly correlated across iterations, meaning that it is not exploring the parameter space efficiently. Ideally, the auto-correlation should drop to near zero quickly, indicating that the chain is moving smoothly and independently across the parameter space. - Lag (X-axis): The interval between the current and past samples. - Autocorrelation coefficient (Y-axis): Indicates the strength of the correlation, ranging from -1 to 1. A value close to 0 (excluding lag 0) is ideal.

In the context of MCMC, it would be important to differentiate between convergence and mixing, as these two terms are often conflated, but they refer to distinct aspects of the MCMC process. Convergence refers to whether the Markov chains have reached the true posterior distribution. For MCMC to be valid, the chains should eventually explore the entire parameter space of the target distribution. In many cases, this is evaluated by running multiple chains and checking whether they all converge to the same distribution (e.g., Rhat in brms). A value of Rhat close to 1.0 suggests that the chains have converged to the same posterior distribution.

Mixing refers to how well the Markov chain explores the parameter space during the sampling process. Good mixing means that the chain moves smoothly across the parameter space, avoiding long periods of stagnation in any particular region. Poor mixing can result in the chain getting stuck in one area of the parameter space and not exploring others effectively.

In MCMCglmm, since you can run a single chain per single model, you cannot directly check convergence of one model using multiple chains (but you can compare multiple chains using multiple MCMCglmm models). Instead, auto-correlation plots can give you insights into the quality of the sampling.

Practical benefits of informative priors for random effects

Informative priors for random effects improve model convergence and reduce computation time by restricting the MCMC sampler to a smaller parameter space. Such priors can influence both the point estimates (often) and uncertainty (always) of random effects (i.e. variance and covariance). However, they typically have less impact on fixed effect estimates. This allows you to obtain fixed effect estimates more quickly while improving convergence by constraining random effects. If your primary interest lies in fixed effect estimates rather than random effects, you can take advantage of this approach.

Potential Problems with informative priors

  1. Over-constrained parameters: Excessively narrow priors can provide us with incorrect parameter estimates by limiting them to a small range.

  2. Small sample size issues: When sample sizes are small, informative priors may dominate the posterior distribution, overshadowing the data. This can lead to the model failing to capture the data’s variability and oversimplifies reality.

When you want to estimate random effects properly, you need to use uninformative prior. But as you already know, we are hard to get convergent… The things we can do are:

  • increase iteration (this means increasing the number of samples from the posterior)
  • use longer burn-in (MCMCglmm) / warm-up (brms) period to ensure convergence
  • increase thinning in MCMCglmm - While thinning can enhance convergence in MCMCglmm by reducing autocorrelation, brms generally do not need to control thinning

Examples Here is example of uninformative vs informative prior…

inv_phylo  <- inverseA(tree, nodes = "ALL", scale = TRUE)

#uninformative
prior1 <- list(
  R = list(V = (matrix(1, 2, 2) + diag(2)) / 3, fix = 1), 
  G = list(G1 = list(V = diag(2), nu = 2, 
                     alpha.mu = rep(0, 2), alpha.V =  diag(2)
  )
  )
)

system.time(
    mcmcglmm_m1_100 <- MCMCglmm(Primary.Lifestyle ~ trait -1, 
                        random = ~us(trait):Phylo,
                        rcov = ~us(trait):units,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "categorical",
                        data = dat,
                        prior = prior1,
                        nitt = 13000*100, 
                        thin = 10*100, 
                        burnin = 3000*100 
                        )
            )


system.time(
    mcmcglmm_m1_1000 <- MCMCglmm(Primary.Lifestyle ~ trait -1, 
                        random = ~us(trait):Phylo,
                        rcov = ~us(trait):units,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "categorical",
                        data = dat,
                        prior = prior1,
                        nitt = 13000*10000, 
                        thin = 10*10000, 
                        burnin = 3000*10000 
                        )
            )

# informative prior 
prior2 <- list(
  R = list(V = (matrix(1, 2, 2) + diag(2)) / 3, fix = 1), 
  G = list(G1 = list(V = diag(2), nu = 200, 
                     alpha.mu = rep(0, 2), alpha.V =  diag(2)
  )
  )
)

system.time(
    mcmcglmm_m2 <- MCMCglmm(Primary.Lifestyle ~ trait -1, 
                        random = ~us(trait):Phylo,
                        rcov = ~us(trait):units,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "categorical",
                        data = dat,
                        prior = prior2,
                        nitt = 13000*75, 
                        thin = 10*75, 
                        burnin = 3000*75 
                        )
            )

You can see that the increasing iteration, thinning, and burn-in can promote increasing effect sampling size and improving mixing.

summary(mcmcglmm_m1_100)

 Iterations = 300001:1299001
 Thinning interval  = 1000
 Sample size  = 1000 

 DIC: 259.4007 

 G-structure:  ~us(trait):Phylo

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo     31.18
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    -12.08
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    -12.08
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    104.23
                                                                            l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    1.252
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  -70.340
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  -70.340
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    1.118
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    83.25
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    59.37
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    59.37
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo   408.78
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    37.85
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    41.96
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    41.96
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    28.75

 R-structure:  ~us(trait):units

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units    0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units    0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units    0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units    0.6667
                                                                            l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units        0

 Location effects: Primary.Lifestyle ~ trait - 1 

                                   post.mean l-95% CI u-95% CI eff.samp pMCMC
traitPrimary.Lifestyle.Insessorial    -1.628  -10.272    7.019    148.9 0.662
traitPrimary.Lifestyle.Terrestrial     5.279   -7.006   19.131    125.6 0.314
summary(mcmcglmm_m1_1000)

 Iterations = 3000001:12990001
 Thinning interval  = 10000
 Sample size  = 1000 

 DIC: 273.7709 

 G-structure:  ~us(trait):Phylo

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo     30.88
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    -25.45
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    -25.45
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo     81.67
                                                                            l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    1.314
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  -94.475
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  -94.475
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    1.144
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    99.34
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    29.87
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    29.87
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo   369.51
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo   270.66
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo   218.26
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo   218.26
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    89.38

 R-structure:  ~us(trait):units

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units    0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units    0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units    0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units    0.6667
                                                                            l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units        0

 Location effects: Primary.Lifestyle ~ trait - 1 

                                   post.mean l-95% CI u-95% CI eff.samp pMCMC
traitPrimary.Lifestyle.Insessorial   -1.3033  -5.3581   2.3037     1000 0.478
traitPrimary.Lifestyle.Terrestrial    0.9187  -3.7341   5.2263     1000 0.676
summary(mcmcglmm_m2)

 Iterations = 225001:974251
 Thinning interval  = 750
 Sample size  = 1000 

 DIC: 298.0892 

 G-structure:  ~us(trait):Phylo

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    6.0759
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo   -0.0885
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo   -0.0885
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    6.8348
                                                                            l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo   0.2940
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  -0.9723
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  -0.9723
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo   1.5880
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo  12.1120
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo   0.9838
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo   0.9838
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo  13.6006
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    731.3
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo   1000.0
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo   1000.0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    820.7

 R-structure:  ~us(trait):units

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units    0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units    0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units    0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units    0.6667
                                                                            l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units        0

 Location effects: Primary.Lifestyle ~ trait - 1 

                                   post.mean l-95% CI u-95% CI eff.samp pMCMC
traitPrimary.Lifestyle.Insessorial   -0.3218  -3.7935   3.0988   1000.0 0.908
traitPrimary.Lifestyle.Terrestrial    2.2780  -1.3172   5.8322    888.4 0.180
autocorr.plot(mcmcglmm_m1_100$VCV)

autocorr.plot(mcmcglmm_m1_1000$VCV)

autocorr.plot(mcmcglmm_m2$VCV)

NOTE

Comparing brms and MCMCglmm is helpful to check your model’s robustness even if convergence diagnostics (e.g. R-hat, effective sampling size) look okay with you in the package you selected.

Why? brms allows more flexibility with setting priors, including specifying distributions for hyperparameters, while MCMCglmm has less flexibility, which can sometimes lead to over-constraining the posterior.

If model results are consistent in both packages, it indicates that the findings are robust and not overly sensitive to the choice of prior settings (and sampling methods?). This consistency suggests that the model is stable and properly converged and that the posterior distributions are not heavily influenced by one software package’s specific characteristics or limitations. That means your results reflect your data :)

Explanation of dataset

Datasets for example 1

This dataset focuses on Turdidae species. A binary variable IsOmnivore based on trophic level is created.

trees  <- read.nexus(here("data", "potential", "avonet", "trees.nex"))
tree <- trees[[1]]
dat <- read.csv(here("data", "potential", "avonet", "turdidae.csv"))

# Check tree$tip.label matches dat$Scientific_name
match_result <- setequal(tree$tip.label, dat$Phylo)

# Centering for continuous variables
dat <- dat %>%
  mutate(
    log_Mass_centered = scale(log(Mass), center = TRUE, scale = FALSE),
    log_Tail_Length_centered = scale(log(Tail.Length), center = TRUE, scale = FALSE)
  )

dat <- dat %>% 
 mutate(across(c(Trophic.Level, Trophic.Niche, Primary.Lifestyle, Migration, Habitat, Species.Status), as.factor))

dat$IsOmnivore <- ifelse(dat$Trophic.Level == "Omnivore", 1, 0)

Datasets for example 2

For Example 2, data for the Sylviidae (warblers) were extracted from the AVONET dataset. The habitat categories were reclassified, migratory status was converted into a binary variable, and body mass was log-transformed and centered, resulting in a clean dataset suitable for subsequent statistical modelling.

dat <- read.csv(here("data", "bird body mass", "9993spp_clearned.csv"))
dat <- dat %>% 
  mutate(across(c(Trophic.Level, Trophic.Niche,
                  Primary.Lifestyle, Migration, Habitat, Habitat.Density, Species.Status), as.factor),
         Habitat.Density = factor(Habitat.Density, ordered = TRUE))

dat <- dat %>%
  mutate(
    Habitat_Category = case_when(
      Habitat %in% c("Desert", "Rock", "Grassland", "Shrubland") ~ "Arid_Open",
      Habitat %in% c("Woodland", "Forest") ~ "Forested_Vegetated",
      Habitat == "Human modified" ~ "Human-Modified",
      Habitat %in% c("Riverine", "Coastal", "Marine", "Wetland") ~ "Aquatic_Coastal",
      TRUE ~ "Other"  
    )
  )

Family_habitat <- dat %>% 
  group_by(Family, Habitat_Category) %>%
  tally() %>%
  group_by(Family) %>% 
  ungroup()

Sylviidae_dat <- dat %>% 
  filter(Family == "Sylviidae") %>% 
  mutate(Habitat_Category = factor(Habitat_Category, ordered = FALSE))

Sylviidae_dat <- Sylviidae_dat %>%
  mutate(IsMigrate = ifelse(Migration == "1", 1, 0)
         )
table(Sylviidae_dat$Habitat_Category)

   Aquatic_Coastal          Arid_Open Forested_Vegetated              Other 
                34                 90                168                  2 
Sylviidae_dat <- Sylviidae_dat %>%
  mutate(Habitat_Category = recode(Habitat_Category, 
                                   "Aquatic_Coastal" = "Others", 
                                   "Other" = "Others"))

Sylviidae_dat$cMass <- scale(log(Sylviidae_dat$Mass), center = TRUE, scale = FALSE)

Sylviidae_tree <- read.nexus(here("data", "bird body mass", "Sylviidae.nex"))
tree <- Sylviidae_tree[[1]]

Example 1

Intercept-only model

MCMCglmm

inv_phylo  <- inverseA(tree, nodes = "ALL", scale = TRUE)

prior5 <- list(
  R = list(V = (matrix(1, 2, 2) + diag(2)) / 3, fix = 1), 
  G = list(G1 = list(V = diag(2), nu = 20, 
                     alpha.mu = rep(0, 2), alpha.V = diag(2)*(pi^2/3))
           )
  )

 system.time(
   mcmcglmm_mn1_1 <-MCMCglmm(Primary.Lifestyle ~ trait -1, 
                      random = ~us(trait):Phylo,
                      rcov = ~us(trait):units,
                      ginverse = list(Phylo = inv_phylo$Ainv),
                      family = "categorical",
                      data = dat,
                      prior = prior5,
                      nitt = 13000*150, 
                      thin = 10*150, 
                      burnin = 3000*150
                     )
            )

brms

A <- ape::vcv.phylo(tree, corr = TRUE)
priors_brms1 <- default_prior(Primary.Lifestyle ~ 1 + (1 |a| gr(Phylo, cov = A)),
                        data = dat,
                        data2 = list(A = A),
                        family = categorical(link = "logit")
                        )

system.time(
  brms_mn1_1 <- brm(Primary.Lifestyle ~ 1 + (1 |a| gr(Phylo, cov = A)),
                data = dat,
                data2 = list(A = A),
                family = categorical(link = "logit"),
                prior = priors_brms1,
                iter = 18000, 
                warmup = 8000,
                chains = 2,
                thin = 1,
                )
            )

As well as the binary model in MCMCglmm, nominal model is need to correct the estimates obtained from MCMCglmm as MCMCglmm implements additive overdispersion GLMM fitted by MCMC sampling from the posterior distribution. Again, we can use the below code:

c2 <- (16 * sqrt(3)/(15 * pi))^2
c2a <- (16 * sqrt(3)/(15 * pi))^2*(2/3)

The results are…

summary(mcmcglmm_mn1_1)

 Iterations = 300001:1299001
 Thinning interval  = 1000
 Sample size  = 1000 

 DIC: 295.756 

 G-structure:  ~us(trait):Phylo

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo     7.338
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    -1.164
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    -1.164
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo     8.168
                                                                            l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo   0.8372
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  -5.3745
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  -5.3745
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo   1.1463
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo   15.386
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    2.788
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    2.788
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo   18.245
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    764.5
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    885.3
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    885.3
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    615.7

 R-structure:  ~us(trait):units

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units    0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units    0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units    0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units    0.6667
                                                                            l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units        0

 Location effects: Primary.Lifestyle ~ trait - 1 

                                   post.mean l-95% CI u-95% CI eff.samp pMCMC
traitPrimary.Lifestyle.Insessorial   -0.4015  -4.0226   2.5627     1000 0.804
traitPrimary.Lifestyle.Terrestrial    1.6199  -1.2917   4.6445     1000 0.270
res_1 <- mcmcglmm_mn1_1$Sol / sqrt(1+c2a) # for fixed effects
res_2 <- mcmcglmm_mn1_1$VCV / (1+c2a) # for variance components
res_3_corr_phylo <- (mcmcglmm_mn1_1$VCV[, 2]/(1+c2a)) /sqrt((mcmcglmm_mn1_1$VCV[, 1] * mcmcglmm_mn1_1$VCV[, 4])/(1+c2a))

summary(res_1)

Iterations = 300001:1299001
Thinning interval = 1000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                     Mean    SD Naive SE Time-series SE
traitPrimary.Lifestyle.Insessorial -0.362 1.480  0.04680        0.04680
traitPrimary.Lifestyle.Terrestrial  1.460 1.385  0.04379        0.04379

2. Quantiles for each variable:

                                     2.5%     25%     50%    75% 97.5%
traitPrimary.Lifestyle.Insessorial -3.475 -1.2921 -0.3067 0.5483 2.605
traitPrimary.Lifestyle.Terrestrial -1.125  0.5264  1.4485 2.3820 4.226
summary(res_2)

Iterations = 300001:1299001
Thinning interval = 1000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                                                               Mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo  5.9634
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -0.9456
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -0.9456
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo  6.6372
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                               SD
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 3.486
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo 1.643
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo 1.643
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 4.233
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units 0.000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units 0.000
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units 0.000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units 0.000
                                                                            Naive SE
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo  0.11023
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  0.05197
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  0.05197
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo  0.13385
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.00000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.00000
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.00000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.00000
                                                                            Time-series SE
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo        0.12608
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo        0.05524
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo        0.05524
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo        0.17058
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units        0.00000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units        0.00000
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units        0.00000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units        0.00000

2. Quantiles for each variable:

                                                                               2.5%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo  1.6277
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -4.7030
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -4.7030
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo  1.7250
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                                25%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo  3.4946
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -1.7201
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -1.7201
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo  3.8512
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                                50%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo  5.2197
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -0.7662
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -0.7662
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo  5.6452
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                                 75%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo  7.52933
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -0.06118
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -0.06118
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo  8.20433
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.54176
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.27088
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.27088
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.54176
                                                                              97.5%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 14.5903
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  2.0866
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  2.0866
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 17.3970
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
summary(res_3_corr_phylo)

Iterations = 300001:1299001
Thinning interval = 1000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
     -0.144879       0.207344       0.006557       0.006916 

2. Quantiles for each variable:

    2.5%      25%      50%      75%    97.5% 
-0.51690 -0.28714 -0.14528 -0.01081  0.26479 
summary(brms_mn1_1)
 Family: categorical 
  Links: muInsessorial = logit; muTerrestrial = logit 
Formula: Primary.Lifestyle ~ 1 + (1 | a | gr(Phylo, cov = A)) 
   Data: dat (Number of observations: 173) 
  Draws: 2 chains, each with iter = 15000; warmup = 10000; thin = 1;
         total post-warmup draws = 10000

Multilevel Hyperparameters:
~Phylo (Number of levels: 173) 
                                                     Estimate Est.Error
sd(muInsessorial_Intercept)                              4.76      2.40
sd(muTerrestrial_Intercept)                              5.76      3.37
cor(muInsessorial_Intercept,muTerrestrial_Intercept)    -0.28      0.36
                                                     l-95% CI u-95% CI Rhat
sd(muInsessorial_Intercept)                              1.96    10.81 1.00
sd(muTerrestrial_Intercept)                              2.14    14.61 1.00
cor(muInsessorial_Intercept,muTerrestrial_Intercept)    -0.88     0.46 1.00
                                                     Bulk_ESS Tail_ESS
sd(muInsessorial_Intercept)                              1550     2181
sd(muTerrestrial_Intercept)                              1215     2349
cor(muInsessorial_Intercept,muTerrestrial_Intercept)      858     1508

Regression Coefficients:
                        Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
muInsessorial_Intercept    -0.55      2.03    -4.86     3.32 1.00     4681
muTerrestrial_Intercept     1.30      2.29    -2.87     6.21 1.00     4028
                        Tail_ESS
muInsessorial_Intercept     5207
muTerrestrial_Intercept     4113

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

When comparing the results, the fixed effects estimated by both MCMCglmm and brms are similar. However, there are notable differences in the random effect variances and correlations. The variances estimated by brms show higher uncertainty, with wider 95% credible intervals, and the correlations are closer to zero. This discrepancy is likely due to the informative priors used in MCMCglmm to prioritize model convergence.

Phylogenetic signals

And phylogenetic signal is…

# average phylogenetic signal - both vs. left
## MCMCglmm
phylo_signalI_mcmcglmm <- ((mcmcglmm_mn1_1$VCV[, "traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo"]/(1+c2a)) / (mcmcglmm_mn1_1$VCV[, "traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo"]/(1+c2a) + 1))
phylo_signalI_mcmcglmm %>% mean()
[1] 0.8207548
phylo_signalI_mcmcglmm %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.6194383 0.8392215 0.9358569 
## brms
phylo_signalI_brms <- brms_mn1_1 %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_Phylo__muInsessorial_Intercept) %>% 
        mutate(lambda_nominalI = Sigma_phy^2 / (Sigma_phy^2 + 1)) %>% 
        pull(lambda_nominalI)

phylo_signalI_brms %>% mean()
[1] 0.9330662
phylo_signalI_brms %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.7940292 0.9474234 0.9915224 
# average phylogenetic signal - both vs. right
## MCMCglmm
phylo_signalT_mcmcglmm <- ((mcmcglmm_mn1_1$VCV[, "traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo"]/(1+c2a)) / (mcmcglmm_mn1_1$VCV[, "traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo"]/(1+c2a) + 1))
phylo_signalT_mcmcglmm %>% mean()
[1] 0.8336959
phylo_signalT_mcmcglmm %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.6330234 0.8495150 0.9456432 
## brms
phylo_signalT_brms <- brms_mn1_1 %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_Phylo__muTerrestrial_Intercept) %>% 
        mutate(lambda_nominalT = Sigma_phy^2 / (Sigma_phy^2 + 1)) %>% 
        pull(lambda_nominalT)
phylo_signalT_brms %>% mean()
[1] 0.9477057
phylo_signalT_brms %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.8209225 0.9606255 0.9953342 

Across both modelling frameworks, Primary.Lifestyle categories show high phylogenetic signal.

For the Insessorial category:

  • MCMCglmm estimates \(\lambda \approx 0.84\) (95% CI: 0.62–0.94)

  • brms estimates \(\lambda \approx 0.95\) (95% CI: 0.79–0.99)

For the Terrestrial category:

  • MCMCglmm estimates \(\lambda \approx 0.85\) (95% CI: 0.63–0.95)

  • brms estimates \(\lambda \approx 0.96\) (95% CI: 0.82–1.00)

This indicates a strong phylogenetic signal in the distribution of Primary.Lifestyle categories, suggesting that closely related species tend to share similar lifestyle traits.

One continuous explanatory variable model

MCMCglmm

inv_phylo  <- inverseA(tree, nodes = "ALL", scale = TRUE)

prior5 <- list(
  R = list(V = (matrix(1, 2, 2) + diag(2)) / 3, fix = 1), 
  G = list(G1 = list(V = diag(2), nu = 20, 
                     alpha.mu = rep(0, 2), alpha.V = diag(2)*(pi^2/3))
           )
  )

system.time(
    mcmcglmm_mn1_2 <- MCMCglmm(Primary.Lifestyle ~ log_Tail_Length_centered:trait + trait -1, 
                        random = ~us(trait):Phylo,
                        rcov = ~us(trait):units,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "categorical",
                        data = dat,
                        prior = prior5,
                        nitt = 13000*200, 
                        thin = 10*200, 
                        burnin = 3000*200 
                        )
            )

brms

priors_brms2 <- default_prior(Primary.Lifestyle ~ log_Tail_Length_centered + (1 |a| gr(Phylo, cov = A)),
                              data = dat,
                              data2 = list(A = A),
                              family = categorical(link = "logit")
)

system.time(
  brms_m1_2 <- brm(Primary.Lifestyle ~ log_Tail_Length_centered + (1 |a| gr(Phylo, cov = A)),
                 data = dat,
                 data2 = list(A = A),
                 family = categorical(link = "logit"),
                 prior = priors_brms2,
                 iter = 10000,
                 warmup = 5000,
                 chains = 2,
                 thin = 1
  )
)

Output correction and summary

summary(mcmcglmm_mn1_2)

 Iterations = 600001:2598001
 Thinning interval  = 2000
 Sample size  = 1000 

 DIC: 279.4446 

 G-structure:  ~us(trait):Phylo

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    18.854
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    -1.169
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    -1.169
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    18.974
                                                                            l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    1.984
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  -10.648
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  -10.648
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    2.593
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    42.79
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo     8.87
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo     8.87
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    44.49
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    542.4
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    784.9
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    784.9
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    499.9

 R-structure:  ~us(trait):units

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units    0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units    0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units    0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units    0.6667
                                                                            l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units        0

 Location effects: Primary.Lifestyle ~ log_Tail_Length_centered:trait + trait - 1 

                                                            post.mean l-95% CI
traitPrimary.Lifestyle.Insessorial                             -1.072   -7.345
traitPrimary.Lifestyle.Terrestrial                              3.279   -1.929
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial    -1.187   -5.100
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial    -2.694   -6.748
                                                            u-95% CI eff.samp
traitPrimary.Lifestyle.Insessorial                             5.322    912.0
traitPrimary.Lifestyle.Terrestrial                             9.466    855.8
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial    2.964    908.8
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial    1.571   1000.0
                                                            pMCMC
traitPrimary.Lifestyle.Insessorial                          0.744
traitPrimary.Lifestyle.Terrestrial                          0.202
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial 0.594
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial 0.178
res_1 <- mcmcglmm_mn1_2$Sol / sqrt(1+c2a) # for fixed effects
res_2 <- mcmcglmm_mn1_2$VCV / (1+c2a) # for variance components
res_3_corr_phylo <- (mcmcglmm_mn1_2$VCV[, 2]/(1+c2a)) /sqrt((mcmcglmm_mn1_2$VCV[, 1] * mcmcglmm_mn1_2$VCV[, 4])/(1+c2a))

summary(res_1)

Iterations = 600001:2598001
Thinning interval = 2000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                                               Mean    SD
traitPrimary.Lifestyle.Insessorial                          -0.9666 2.803
traitPrimary.Lifestyle.Terrestrial                           2.9557 2.670
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial -1.0701 1.953
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial -2.4283 1.909
                                                            Naive SE
traitPrimary.Lifestyle.Insessorial                           0.08865
traitPrimary.Lifestyle.Terrestrial                           0.08443
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial  0.06176
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial  0.06037
                                                            Time-series SE
traitPrimary.Lifestyle.Insessorial                                 0.09283
traitPrimary.Lifestyle.Terrestrial                                 0.09126
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial        0.06479
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial        0.06037

2. Quantiles for each variable:

                                                              2.5%    25%
traitPrimary.Lifestyle.Insessorial                          -7.020 -2.654
traitPrimary.Lifestyle.Terrestrial                          -1.639  1.231
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial -4.746 -2.317
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial -6.266 -3.592
                                                                50%     75%
traitPrimary.Lifestyle.Insessorial                          -0.8061  0.7953
traitPrimary.Lifestyle.Terrestrial                           2.6791  4.4126
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial -1.0952  0.2483
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial -2.4131 -1.1127
                                                            97.5%
traitPrimary.Lifestyle.Insessorial                          4.602
traitPrimary.Lifestyle.Terrestrial                          8.724
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial 2.557
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial 1.320
summary(res_2)

Iterations = 600001:2598001
Thinning interval = 2000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                                                               Mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 15.3215
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -0.9500
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -0.9500
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 15.4192
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                                SD
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 10.870
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  4.038
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  4.038
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 10.808
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.000
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.000
                                                                            Naive SE
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo   0.3437
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo   0.1277
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo   0.1277
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo   0.3418
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.0000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.0000
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.0000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.0000
                                                                            Time-series SE
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo         0.4667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo         0.1441
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo         0.1441
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo         0.4834
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units         0.0000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units         0.0000
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units         0.0000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units         0.0000

2. Quantiles for each variable:

                                                                               2.5%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo  3.2225
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -9.2289
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -9.2289
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo  3.4482
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                                25%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo  8.5847
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -2.7137
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -2.7137
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo  8.1572
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                                50%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 12.7689
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -0.8335
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -0.8335
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 12.7788
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                                75%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 19.2764
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  1.1495
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  1.1495
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 19.6677
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                              97.5%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 42.4203
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  6.6341
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  6.6341
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 43.7071
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
summary(res_3_corr_phylo)

Iterations = 600001:2598001
Thinning interval = 2000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
     -0.064355       0.212862       0.006731       0.006731 

2. Quantiles for each variable:

    2.5%      25%      50%      75%    97.5% 
-0.46205 -0.21687 -0.07002  0.08667  0.35059 
summary(brms_mn1_2)
 Family: categorical 
  Links: muInsessorial = logit; muTerrestrial = logit 
Formula: Primary.Lifestyle ~ log_Tail_Length_centered + (1 | a | gr(Phylo, cov = A)) 
   Data: dat (Number of observations: 173) 
  Draws: 2 chains, each with iter = 15000; warmup = 10000; thin = 1;
         total post-warmup draws = 10000

Multilevel Hyperparameters:
~Phylo (Number of levels: 173) 
                                                     Estimate Est.Error
sd(muInsessorial_Intercept)                              6.14      3.83
sd(muTerrestrial_Intercept)                              6.90      4.68
cor(muInsessorial_Intercept,muTerrestrial_Intercept)    -0.15      0.40
                                                     l-95% CI u-95% CI Rhat
sd(muInsessorial_Intercept)                              2.13    15.68 1.00
sd(muTerrestrial_Intercept)                              2.32    19.89 1.00
cor(muInsessorial_Intercept,muTerrestrial_Intercept)    -0.84     0.64 1.00
                                                     Bulk_ESS Tail_ESS
sd(muInsessorial_Intercept)                              1627     2326
sd(muTerrestrial_Intercept)                              1412     1902
cor(muInsessorial_Intercept,muTerrestrial_Intercept)     1304     2835

Regression Coefficients:
                                       Estimate Est.Error l-95% CI u-95% CI
muInsessorial_Intercept                   -0.53      2.29    -5.38     3.86
muTerrestrial_Intercept                    1.34      2.45    -3.08     6.58
muInsessorial_log_Tail_Length_centered    -2.17      3.08    -9.60     2.81
muTerrestrial_log_Tail_Length_centered    -2.95      3.18    -9.99     2.58
                                       Rhat Bulk_ESS Tail_ESS
muInsessorial_Intercept                1.00    12414     6264
muTerrestrial_Intercept                1.00    10541     5718
muInsessorial_log_Tail_Length_centered 1.00     4513     3030
muTerrestrial_log_Tail_Length_centered 1.00     4134     2746

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Both MCMCglmm and brms indicate very high phylogenetic variance for both Insessorial and Terrestrial lifestyles with only weak and uncertain phylogenetic correlation between the two categories. The differences in variance estimates likely stem from the informative priors used in MCMCglmm to aid convergence.

One continuous and one binary explanatory variables model

MCMCglmm

prior5 <- list(
  R = list(V = (matrix(1, 2, 2) + diag(2)) / 3, fix = 1), 
  G = list(G1 = list(V = diag(2), nu = 20, 
                     alpha.mu = rep(0, 2), alpha.V = diag(2)*(pi^2/3))
           )
  )

system.time(
    mcmcglmm_mn1_3 <- MCMCglmm(Primary.Lifestyle ~ log_Tail_Length_centered:trait + IsOmnivore:trait + trait -1, 
                        random = ~us(trait):Phylo,
                        rcov = ~us(trait):units,
                        ginverse = list(Phylo = inv_phylo$Ainv),
                        family = "categorical",
                        data = dat,
                        prior = prior5,
                        nitt = 13000*250, 
                        thin = 10*250, 
                        burnin = 3000*250 
                        )
            )

brms

system.time(
brms_mn1_3 <- brm(
      Primary.Lifestyle ~ log_Tail_Length_centered + IsOmnivore + (1 |a| gr(Phylo, cov = A)),
      data = dat,
      data2 = list(A = A),
      family = categorical(link = "logit"),
      prior = custom_priors,
      iter = 15000, 
      warmup = 5000,
      chains = 2,
      thin = 1
    )
)

Outputs are…

summary(mcmcglmm_mn1_3)

 Iterations = 750001:3247501
 Thinning interval  = 2500
 Sample size  = 1000 

 DIC: 244.733 

 G-structure:  ~us(trait):Phylo

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    22.932
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    -1.274
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    -1.274
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    15.898
                                                                              l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo  2.484e+00
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -1.179e+01
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -1.179e+01
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo  3.475e-04
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo   51.826
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    8.518
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    8.518
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo   44.088
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo    581.7
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo    899.2
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo    899.2
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo    627.4

 R-structure:  ~us(trait):units

                                                                            post.mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units    0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units    0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units    0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units    0.6667
                                                                            l-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            u-95% CI
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.6667
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.3333
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.3333
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.6667
                                                                            eff.samp
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units        0
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units        0
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units        0

 Location effects: Primary.Lifestyle ~ log_Tail_Length_centered:trait + IsOmnivore:trait + trait - 1 

                                                            post.mean l-95% CI
traitPrimary.Lifestyle.Insessorial                           -1.01465 -8.38540
traitPrimary.Lifestyle.Terrestrial                            3.98040 -1.31604
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial  -1.23330 -6.68483
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial  -1.89662 -6.62854
traitPrimary.Lifestyle.Insessorial:IsOmnivore                 0.08497 -1.87839
traitPrimary.Lifestyle.Terrestrial:IsOmnivore                -5.14025 -8.12432
                                                            u-95% CI eff.samp
traitPrimary.Lifestyle.Insessorial                           5.59217    894.2
traitPrimary.Lifestyle.Terrestrial                           9.80666   1200.6
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial  2.66615   1000.0
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial  2.37056   1113.2
traitPrimary.Lifestyle.Insessorial:IsOmnivore                2.24356    835.3
traitPrimary.Lifestyle.Terrestrial:IsOmnivore               -2.71908    848.1
                                                             pMCMC    
traitPrimary.Lifestyle.Insessorial                           0.818    
traitPrimary.Lifestyle.Terrestrial                           0.092 .  
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial  0.610    
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial  0.406    
traitPrimary.Lifestyle.Insessorial:IsOmnivore                0.962    
traitPrimary.Lifestyle.Terrestrial:IsOmnivore               <0.001 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
res_1 <- mcmcglmm_mn1_3$Sol / sqrt(1+c2a) # for fixed effects
res_2 <- mcmcglmm_mn1_3$VCV / (1+c2a) # for variance components
res_3_corr_phylo <- (mcmcglmm_mn1_3$VCV[, 2]/(1+c2a)) /sqrt((mcmcglmm_mn1_3$VCV[, 1] * mcmcglmm_mn1_3$VCV[, 4])/(1+c2a))
res_3_corr_nonphylo <- (mcmcglmm_mn1_3$VCV[, 6]/(1+c2a)) /sqrt((mcmcglmm_mn1_3$VCV[, 5] * mcmcglmm_mn1_3$VCV[, 8])/(1+c2a))

summary(res_1)

Iterations = 750001:3247501
Thinning interval = 2500 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                                               Mean     SD
traitPrimary.Lifestyle.Insessorial                          -0.9147 3.2692
traitPrimary.Lifestyle.Terrestrial                           3.5882 2.4961
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial -1.1118 2.1376
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial -1.7097 2.0720
traitPrimary.Lifestyle.Insessorial:IsOmnivore                0.0766 0.9584
traitPrimary.Lifestyle.Terrestrial:IsOmnivore               -4.6338 1.2856
                                                            Naive SE
traitPrimary.Lifestyle.Insessorial                           0.10338
traitPrimary.Lifestyle.Terrestrial                           0.07893
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial  0.06760
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial  0.06552
traitPrimary.Lifestyle.Insessorial:IsOmnivore                0.03031
traitPrimary.Lifestyle.Terrestrial:IsOmnivore                0.04065
                                                            Time-series SE
traitPrimary.Lifestyle.Insessorial                                 0.10932
traitPrimary.Lifestyle.Terrestrial                                 0.07204
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial        0.06760
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial        0.06210
traitPrimary.Lifestyle.Insessorial:IsOmnivore                      0.03316
traitPrimary.Lifestyle.Terrestrial:IsOmnivore                      0.04415

2. Quantiles for each variable:

                                                              2.5%     25%
traitPrimary.Lifestyle.Insessorial                          -7.874 -2.8074
traitPrimary.Lifestyle.Terrestrial                          -1.002  1.8582
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial -5.840 -2.4495
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial -5.978 -3.0120
traitPrimary.Lifestyle.Insessorial:IsOmnivore               -1.675 -0.5924
traitPrimary.Lifestyle.Terrestrial:IsOmnivore               -7.561 -5.3247
                                                                 50%     75%
traitPrimary.Lifestyle.Insessorial                          -0.64157  1.1515
traitPrimary.Lifestyle.Terrestrial                           3.25656  5.0788
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial -0.92478  0.3622
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial -1.64566 -0.2734
traitPrimary.Lifestyle.Insessorial:IsOmnivore                0.06417  0.7068
traitPrimary.Lifestyle.Terrestrial:IsOmnivore               -4.47341 -3.6857
                                                             97.5%
traitPrimary.Lifestyle.Insessorial                           4.945
traitPrimary.Lifestyle.Terrestrial                           9.368
log_Tail_Length_centered:traitPrimary.Lifestyle.Insessorial  2.750
log_Tail_Length_centered:traitPrimary.Lifestyle.Terrestrial  2.103
traitPrimary.Lifestyle.Insessorial:IsOmnivore                2.066
traitPrimary.Lifestyle.Terrestrial:IsOmnivore               -2.669
summary(res_2)

Iterations = 750001:3247501
Thinning interval = 2500 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                                                               Mean
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 18.6355
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -1.0351
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -1.0351
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 12.9191
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                                SD
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 11.978
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  3.971
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  3.971
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 11.113
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.000
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.000
                                                                            Naive SE
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo   0.3788
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo   0.1256
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo   0.1256
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo   0.3514
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.0000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.0000
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.0000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.0000
                                                                            Time-series SE
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo         0.4966
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo         0.1324
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo         0.1324
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo         0.4437
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units         0.0000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units         0.0000
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units         0.0000
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units         0.0000

2. Quantiles for each variable:

                                                                                2.5%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo   4.0308
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -10.0779
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -10.0779
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo   0.4373
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units   0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units   0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units   0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units   0.5418
                                                                                25%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 10.2520
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -2.7629
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -2.7629
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo  4.6612
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                                50%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 15.9642
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo -0.7674
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo -0.7674
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 10.1930
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                                75%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 24.0021
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  0.8149
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  0.8149
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 17.3835
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
                                                                              97.5%
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.Phylo 49.6277
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.Phylo  6.5968
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.Phylo  6.5968
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.Phylo 44.3732
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Insessorial.units  0.5418
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Insessorial.units  0.2709
traitPrimary.Lifestyle.Insessorial:traitPrimary.Lifestyle.Terrestrial.units  0.2709
traitPrimary.Lifestyle.Terrestrial:traitPrimary.Lifestyle.Terrestrial.units  0.5418
summary(res_3_corr_phylo)

Iterations = 750001:3247501
Thinning interval = 2500 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
     -0.069527       0.220096       0.006960       0.006403 

2. Quantiles for each variable:

    2.5%      25%      50%      75%    97.5% 
-0.49120 -0.21345 -0.07966  0.08421  0.37013 
summary(res_3_corr_nonphylo)

Iterations = 750001:3247501
Thinning interval = 2500 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
        0.4507         0.0000         0.0000         0.0000 

2. Quantiles for each variable:

  2.5%    25%    50%    75%  97.5% 
0.4507 0.4507 0.4507 0.4507 0.4507 
summary(brms_mn1_3)
 Family: categorical 
  Links: muInsessorial = logit; muTerrestrial = logit 
Formula: Primary.Lifestyle ~ log_Tail_Length_centered + IsOmnivore + (1 | a | gr(Phylo, cov = A)) 
   Data: dat (Number of observations: 173) 
  Draws: 2 chains, each with iter = 15000; warmup = 10000; thin = 1;
         total post-warmup draws = 10000

Multilevel Hyperparameters:
~Phylo (Number of levels: 173) 
                                                     Estimate Est.Error
sd(muInsessorial_Intercept)                              7.18      4.74
sd(muTerrestrial_Intercept)                              5.76      4.05
cor(muInsessorial_Intercept,muTerrestrial_Intercept)    -0.24      0.42
                                                     l-95% CI u-95% CI Rhat
sd(muInsessorial_Intercept)                              2.26    20.74 1.00
sd(muTerrestrial_Intercept)                              0.63    16.04 1.00
cor(muInsessorial_Intercept,muTerrestrial_Intercept)    -0.92     0.64 1.00
                                                     Bulk_ESS Tail_ESS
sd(muInsessorial_Intercept)                              1052     1590
sd(muTerrestrial_Intercept)                               920      963
cor(muInsessorial_Intercept,muTerrestrial_Intercept)     1248     2800

Regression Coefficients:
                                       Estimate Est.Error l-95% CI u-95% CI
muInsessorial_Intercept                   -0.96      2.39    -6.16     3.36
muTerrestrial_Intercept                    3.72      2.49     0.09     9.85
muInsessorial_log_Tail_Length_centered    -2.05      3.19    -9.66     3.14
muInsessorial_IsOmnivore                   0.87      1.76    -1.87     5.32
muTerrestrial_log_Tail_Length_centered    -1.50      2.89    -7.61     4.13
muTerrestrial_IsOmnivore                  -6.36      3.10   -14.70    -2.92
                                       Rhat Bulk_ESS Tail_ESS
muInsessorial_Intercept                1.00     3821     3835
muTerrestrial_Intercept                1.00     2250     2363
muInsessorial_log_Tail_Length_centered 1.00     2889     2273
muInsessorial_IsOmnivore               1.00     1783     1843
muTerrestrial_log_Tail_Length_centered 1.00     4115     3124
muTerrestrial_IsOmnivore               1.00     1269     1638

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
  • The effect of tail length on lifestyle probability was negative for both categories in both models, but the 95% CIs include zero, indicating no strong evidence for an effect.
  • Results from MCMCglmm and brms were broadly consistent in magnitude and direction of effects.
  • Only weak and uncertain phylogenetic correlation between the two categories.

Example 2

We use the data of habitat category on Sylviidae (Old World warbler - 294 species) as the response variable. Mass and presence or absence of migration are used as the explanatory variable.

Intercept-only model

MCMCglmm

inv_phylo  <- inverseA(tree, nodes = "ALL", scale = TRUE)

prior <- list(
  R = list(V = (matrix(1, 2, 2) + diag(2)) / 3, fix = 1), 
  G = list(G1 = list(V = diag(2), nu = 20, 
                     alpha.mu = rep(0, 2), alpha.V = diag(2)*(pi^2/3))
  )
)

mcmcglmm_mn2_1 <- MCMCglmm(Habitat_Category ~ trait -1, 
                  random = ~us(trait):Phylo,
                  rcov = ~us(trait):units,
                  ginverse = list(Phylo = inv_phylo$Ainv),
                  family = "categorical",
                  data = Sylviidae_dat,
                  prior = prior,
                  nitt = 13000*500, 
                  thin = 10*500, 
                  burnin = 3000*500 
)

brms

A <- ape::vcv.phylo(tree, corr = TRUE)
priors_brms1 <- default_prior(Habitat_Category ~ 1 + (1 |a| gr(Phylo, cov = A)),
                        data = Sylviidae_dat,
                        data2 = list(A = A),
                        family = categorical(link = "logit")
                        )

system.time(
  brms_mn2_1 <- brm(Habitat_Category ~ 1 + (1 |a| gr(Phylo, cov = A)),
                data = Sylviidae_dat,
                data2 = list(A = A),
                family = categorical(link = "logit"),
                prior = priors_brms1,
                iter = 18000, 
                warmup = 8000,
                chains = 2,
                thin = 1,
                )
            )

Before checking the output,we need to correct the results from MCMCglmm using the following equation:

c2 <- (16 * sqrt(3) / (15 * pi))^2
c2a <- c2*(2/3)
summary(mcmcglmm_mn2_1)

 Iterations = 1500001:6495001
 Thinning interval  = 5000
 Sample size  = 1000 

 DIC: 379.9416 

 G-structure:  ~us(trait):Phylo

                                                                                        post.mean
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                       7.151
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo              2.010
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo              2.010
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo    14.710
                                                                                        l-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                      1.003
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo            -2.392
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo            -2.392
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo    3.210
                                                                                        u-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                     16.839
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo             7.668
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo             7.668
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo   29.105
                                                                                        eff.samp
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                      854.5
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo            1000.0
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo            1000.0
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo    784.9

 R-structure:  ~us(trait):units

                                                                                        post.mean
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                      0.6667
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units             0.3333
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units             0.3333
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units    0.6667
                                                                                        l-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                     0.6667
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units            0.3333
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units            0.3333
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units   0.6667
                                                                                        u-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                     0.6667
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units            0.3333
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units            0.3333
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units   0.6667
                                                                                        eff.samp
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                          0
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units                 0
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units                 0
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units        0

 Location effects: Habitat_Category ~ trait - 1 

                                         post.mean l-95% CI u-95% CI eff.samp
traitHabitat_Category.Arid_Open            2.44871  0.09382  4.88885    905.8
traitHabitat_Category.Forested_Vegetated   4.56420  1.35941  7.49998    892.7
                                          pMCMC    
traitHabitat_Category.Arid_Open           0.036 *  
traitHabitat_Category.Forested_Vegetated <0.001 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
res_1 <- mcmcglmm_mn2_1$Sol / sqrt(1+c2a) # for fixed effects
res_2 <- mcmcglmm_mn2_1$VCV / (1+c2a) # for variance components
res_3_corr_phylo <- (mcmcglmm_mn2_1$VCV[, 2]/(1+c2a)) /sqrt((mcmcglmm_mn2_1$VCV[, 1] * mcmcglmm_mn2_1$VCV[, 4])/(1+c2a))

summary(res_1)

Iterations = 1500001:6495001
Thinning interval = 5000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                          Mean    SD Naive SE Time-series SE
traitHabitat_Category.Arid_Open          2.207 1.140  0.03605        0.03788
traitHabitat_Category.Forested_Vegetated 4.114 1.417  0.04480        0.04741

2. Quantiles for each variable:

                                           2.5%   25%   50%   75% 97.5%
traitHabitat_Category.Arid_Open          0.1303 1.522 2.095 2.828 4.518
traitHabitat_Category.Forested_Vegetated 1.5750 3.173 4.004 4.958 7.352
summary(res_2)

Iterations = 1500001:6495001
Thinning interval = 5000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                                                                           Mean
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    5.8115
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           1.6331
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           1.6331
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 11.9537
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                           SD
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                   4.203
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo          2.320
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo          2.320
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 6.815
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                   0.000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units          0.000
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units          0.000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units 0.000
                                                                                        Naive SE
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    0.13290
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           0.07337
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           0.07337
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo  0.21551
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.00000
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.00000
                                                                                        Time-series SE
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                          0.14377
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo                 0.07337
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo                 0.07337
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo        0.24325
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                          0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units                 0.00000
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units                 0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units        0.00000

2. Quantiles for each variable:

                                                                                           2.5%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    1.3308
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo          -1.5487
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo          -1.5487
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo  4.0126
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                           25%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                   3.1661
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo          0.2950
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo          0.2950
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 7.3840
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                   0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units          0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units          0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units 0.5418
                                                                                            50%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    4.8034
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           1.2254
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           1.2254
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 10.4777
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                            75%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    6.9772
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           2.4747
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           2.4747
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 14.8650
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                          97.5%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                   15.7767
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           7.0730
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           7.0730
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 28.3674
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
summary(res_3_corr_phylo)

Iterations = 1500001:6495001
Thinning interval = 5000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
      0.172049       0.184925       0.005848       0.005848 

2. Quantiles for each variable:

    2.5%      25%      50%      75%    97.5% 
-0.17512  0.03699  0.17474  0.30106  0.51285 
summary(brms_mn2_1)
 Family: categorical 
  Links: muAridOpen = logit; muForestedVegetated = logit 
Formula: Habitat_Category ~ 1 + (1 | a | gr(Phylo, cov = A)) 
   Data: Sylviidae_dat (Number of observations: 294) 
  Draws: 2 chains, each with iter = 18000; warmup = 8000; thin = 1;
         total post-warmup draws = 20000

Multilevel Hyperparameters:
~Phylo (Number of levels: 294) 
                                                        Estimate Est.Error
sd(muAridOpen_Intercept)                                    2.58      0.90
sd(muForestedVegetated_Intercept)                           4.14      1.33
cor(muAridOpen_Intercept,muForestedVegetated_Intercept)     0.44      0.31
                                                        l-95% CI u-95% CI Rhat
sd(muAridOpen_Intercept)                                    1.24     4.69 1.00
sd(muForestedVegetated_Intercept)                           2.20     7.19 1.00
cor(muAridOpen_Intercept,muForestedVegetated_Intercept)    -0.25     0.90 1.00
                                                        Bulk_ESS Tail_ESS
sd(muAridOpen_Intercept)                                    4049     7462
sd(muForestedVegetated_Intercept)                           2119     5431
cor(muAridOpen_Intercept,muForestedVegetated_Intercept)     1116     2854

Regression Coefficients:
                              Estimate Est.Error l-95% CI u-95% CI Rhat
muAridOpen_Intercept              2.24      1.17    -0.05     4.66 1.00
muForestedVegetated_Intercept     3.62      1.49     0.88     6.83 1.00
                              Bulk_ESS Tail_ESS
muAridOpen_Intercept              3552     6462
muForestedVegetated_Intercept    12256    10030

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

And average phylogenetic signals are…

# average phylogenteic signal - others vs. Arid and open
# MCMCglmm
phylo_signalA_mcmcglmm_nominal <- ((mcmcglmm_mn2_1$VCV[, "traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo"]/(1+c2a)) / (mcmcglmm_mn2_1$VCV[, "traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo"]/(1+c2a) + 1))
phylo_signalA_mcmcglmm_nominal %>% mean()
[1] 0.808917
phylo_signalA_mcmcglmm_nominal %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.5709550 0.8276878 0.9403931 
## brms
phylo_signalA_brms <- brms_mn2_1 %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_Phylo__muAridOpen_Intercept) %>% 
        mutate(lambda_nominalA = Sigma_phy^2 / (Sigma_phy^2 + 1)) %>% 
        pull(lambda_nominalA)

phylo_signalA_brms %>% mean()
[1] 0.8370854
phylo_signalA_brms %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.6055780 0.8563134 0.9565012 
# average phylogenteic signal - both vs. right
## MCMCglmm
phylo_signalF_mcmcglmm_nominal <- ((mcmcglmm_mn2_1$VCV[, "traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo"]/(1+c2a)) / (mcmcglmm_mn2_1$VCV[, "traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo"]/(1+c2a) + 1))
phylo_signalF_mcmcglmm_nominal %>% mean()
[1] 0.9044678
phylo_signalF_mcmcglmm_nominal %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.8004989 0.9128743 0.9659487 
# brms
phylo_signalF_brms <- brms_mn2_1 %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_Phylo__muForestedVegetated_Intercept) %>% 
        mutate(lambda_nominalF = Sigma_phy^2 / (Sigma_phy^2 + 1)) %>% 
        pull(lambda_nominalF)

phylo_signalF_brms %>% mean()
[1] 0.9304826
phylo_signalF_brms %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.8292983 0.9394892 0.9810195 

Both models detected high phylogenetic signal for Habitat_Category, especially for Forested_Vegetated and slightly lower for Arid_Open category. The phylogenetic correlation between the two habitat categories tended to be positive, but the 95% credible intervals included zero in both models, indicating that the correlation was not statistically supported.

One continuous explanatory variable model

MCMCglmm

mcmcglmm_mn2_2 <- MCMCglmm(Habitat_Category ~ cMass:trait + trait -1, 
                  random = ~us(trait):Phylo,
                  rcov = ~us(trait):units,
                  ginverse = list(Phylo = inv_phylo$Ainv),
                  family = "categorical",
                  data = Sylviidae_dat,
                  prior = prior,
                  nitt = 13000*500, 
                  thin = 10*500, 
                  burnin = 3000*500 
)

brms

A <- ape::vcv.phylo(tree, corr = TRUE)
priors_brms2 <- default_prior(Habitat_Category ~ cMass + (1 |a| gr(Phylo, cov = A)),
                        data = Sylviidae_dat,
                        data2 = list(A = A),
                        family = categorical(link = "logit")
                        )

system.time(
  brms_mn2_2 <- brm(Habitat_Category ~ cMass + (1 |a| gr(Phylo, cov = A)),
                data = Sylviidae_dat,
                data2 = list(A = A),
                family = categorical(link = "logit"),
                prior = priors_brms2,
                iter = 18000, 
                warmup = 8000,
                chains = 2,
                thin = 1,
                )
            )
summary(mcmcglmm_mn2_2)

 Iterations = 1500001:6495001
 Thinning interval  = 5000
 Sample size  = 1000 

 DIC: 369.9206 

 G-structure:  ~us(trait):Phylo

                                                                                        post.mean
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                       7.923
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo              2.344
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo              2.344
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo    18.943
                                                                                        l-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                      1.119
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo            -2.666
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo            -2.666
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo    4.131
                                                                                        u-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                     18.446
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo             9.503
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo             9.503
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo   41.543
                                                                                        eff.samp
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                      649.8
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo             770.0
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo             770.0
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo    745.4

 R-structure:  ~us(trait):units

                                                                                        post.mean
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                      0.6667
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units             0.3333
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units             0.3333
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units    0.6667
                                                                                        l-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                     0.6667
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units            0.3333
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units            0.3333
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units   0.6667
                                                                                        u-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                     0.6667
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units            0.3333
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units            0.3333
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units   0.6667
                                                                                        eff.samp
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                          0
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units                 0
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units                 0
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units        0

 Location effects: Habitat_Category ~ cMass:trait + trait - 1 

                                               post.mean l-95% CI u-95% CI
traitHabitat_Category.Arid_Open                  2.62403 -0.08498  5.40782
traitHabitat_Category.Forested_Vegetated         4.74629  1.55204  8.47272
cMass:traitHabitat_Category.Arid_Open           -0.79705 -2.34051  0.62992
cMass:traitHabitat_Category.Forested_Vegetated   1.03745 -0.87700  2.97350
                                               eff.samp pMCMC   
traitHabitat_Category.Arid_Open                  1000.0 0.048 * 
traitHabitat_Category.Forested_Vegetated         1000.0 0.002 **
cMass:traitHabitat_Category.Arid_Open            1000.0 0.294   
cMass:traitHabitat_Category.Forested_Vegetated    892.6 0.252   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
res_1 <- mcmcglmm_mn2_2$Sol / sqrt(1+c2a) # for fixed effects
res_2 <- mcmcglmm_mn2_2$VCV / (1+c2a) # for variance components
res_3_corr_phylo <- (mcmcglmm_mn2_2$VCV[, 2]/(1+c2a)) /sqrt((mcmcglmm_mn2_2$VCV[, 1] * mcmcglmm_mn2_2$VCV[, 4])/(1+c2a))

summary(res_1)

Iterations = 1500001:6495001
Thinning interval = 5000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                                  Mean     SD Naive SE
traitHabitat_Category.Arid_Open                 2.3655 1.2243  0.03872
traitHabitat_Category.Forested_Vegetated        4.2786 1.5798  0.04996
cMass:traitHabitat_Category.Arid_Open          -0.7185 0.6901  0.02182
cMass:traitHabitat_Category.Forested_Vegetated  0.9352 0.8609  0.02722
                                               Time-series SE
traitHabitat_Category.Arid_Open                       0.03872
traitHabitat_Category.Forested_Vegetated              0.04996
cMass:traitHabitat_Category.Arid_Open                 0.02182
cMass:traitHabitat_Category.Forested_Vegetated        0.02882

2. Quantiles for each variable:

                                                   2.5%     25%     50%     75%
traitHabitat_Category.Arid_Open                 0.01923  1.6198  2.2918  3.0933
traitHabitat_Category.Forested_Vegetated        1.53683  3.1620  4.1529  5.2215
cMass:traitHabitat_Category.Arid_Open          -2.10314 -1.1670 -0.6945 -0.2558
cMass:traitHabitat_Category.Forested_Vegetated -0.66742  0.3931  0.8633  1.4720
                                                97.5%
traitHabitat_Category.Arid_Open                5.0795
traitHabitat_Category.Forested_Vegetated       7.8460
cMass:traitHabitat_Category.Arid_Open          0.5787
cMass:traitHabitat_Category.Forested_Vegetated 2.8657
summary(res_2)

Iterations = 1500001:6495001
Thinning interval = 5000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                                                                           Mean
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    6.4389
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           1.9045
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           1.9045
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 15.3942
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                           SD
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                   4.320
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo          2.559
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo          2.559
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 8.817
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                   0.000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units          0.000
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units          0.000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units 0.000
                                                                                        Naive SE
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    0.13661
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           0.08093
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           0.08093
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo  0.27883
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.00000
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.00000
                                                                                        Time-series SE
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                          0.16948
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo                 0.09223
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo                 0.09223
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo        0.32297
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                          0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units                 0.00000
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units                 0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units        0.00000

2. Quantiles for each variable:

                                                                                           2.5%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    1.2723
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo          -2.0319
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo          -2.0319
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo  4.7187
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                           25%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                   3.4168
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo          0.3359
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo          0.3359
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 9.3997
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                   0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units          0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units          0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units 0.5418
                                                                                            50%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    5.3290
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           1.4980
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           1.4980
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 12.9514
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                            75%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    8.3308
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           3.0477
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           3.0477
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 18.8536
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                          97.5%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                   18.1859
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           8.1388
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           8.1388
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 38.7921
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
summary(res_3_corr_phylo)

Iterations = 1500001:6495001
Thinning interval = 5000 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
      0.172506       0.195639       0.006187       0.006234 

2. Quantiles for each variable:

    2.5%      25%      50%      75%    97.5% 
-0.23534  0.04138  0.18461  0.31305  0.53443 
summary(brms_mn2_2)
 Family: categorical 
  Links: muAridOpen = logit; muForestedVegetated = logit 
Formula: Habitat_Category ~ cMass + (1 | a | gr(Phylo, cov = A)) 
   Data: Sylviidae_dat (Number of observations: 294) 
  Draws: 2 chains, each with iter = 18000; warmup = 8000; thin = 1;
         total post-warmup draws = 20000

Multilevel Hyperparameters:
~Phylo (Number of levels: 294) 
                                                        Estimate Est.Error
sd(muAridOpen_Intercept)                                    2.63      0.94
sd(muForestedVegetated_Intercept)                           5.00      1.69
cor(muAridOpen_Intercept,muForestedVegetated_Intercept)     0.48      0.31
                                                        l-95% CI u-95% CI Rhat
sd(muAridOpen_Intercept)                                    1.22     4.90 1.00
sd(muForestedVegetated_Intercept)                           2.60     9.08 1.00
cor(muAridOpen_Intercept,muForestedVegetated_Intercept)    -0.23     0.93 1.00
                                                        Bulk_ESS Tail_ESS
sd(muAridOpen_Intercept)                                    3376     6591
sd(muForestedVegetated_Intercept)                           1488     4134
cor(muAridOpen_Intercept,muForestedVegetated_Intercept)      794     1765

Regression Coefficients:
                              Estimate Est.Error l-95% CI u-95% CI Rhat
muAridOpen_Intercept              2.47      1.20     0.18     5.03 1.00
muForestedVegetated_Intercept     3.59      1.72     0.41     7.25 1.00
muAridOpen_cMass                 -0.55      0.75    -2.08     0.92 1.00
muForestedVegetated_cMass         1.29      1.08    -0.62     3.61 1.00
                              Bulk_ESS Tail_ESS
muAridOpen_Intercept              2925     5132
muForestedVegetated_Intercept    10315     9625
muAridOpen_cMass                  8518    11678
muForestedVegetated_cMass         6534     8329

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Summary The model intercept represents the log-odds of belonging to a given habitat category when predictors are at their reference values (here, cMass = 0). For Arid/Open and Forested/Vegetated, the intercepts were positive, and in most cases the 95% credible intervals did not include zero. This means that, under baseline conditions, the probability of belonging to these two habitat categories is higher than for the reference category (Aquatic/Coastal). In other words, the model predicts that species are more likely to occur in Arid/Open or Forested/Vegetated habitats than in Aquatic/Coastal habitats under the reference conditions.

  • Mass effects: The effect of centred log body mass (cMass) was negative for Arid/Open and positive for Forested/Vegetated, but in both models the 95% credible intervals crossed zero, suggesting no clear statistical support.

  • Phylogenetic variances: Substantial phylogenetic variance was observed for both habitat categories, indicating moderate to strong phylogenetic structuring.

  • Phylogenetic correlation: The correlation between habitat categories was small and positive but highly uncertain, with 95% credible intervals spanning zero.

One continuous and one binary explanatory variables model

MCMCglmm

mcmcglmm_mn2_3 <- MCMCglmm(Habitat_Category ~ cMass:trait + IsMigrate:trait + trait -1, 
                  random = ~us(trait):Phylo,
                  rcov = ~us(trait):units,
                  ginverse = list(Phylo = inv_phylo$Ainv),
                  family = "categorical",
                  data = Sylviidae_dat,
                  prior = prior,
                  nitt = 13000*750, 
                  thin = 10*750, 
                  burnin = 3000*750 
)

brms

A <- ape::vcv.phylo(tree, corr = TRUE)
priors_brms3 <- default_prior(Habitat_Category ~ cMass + IsMigrate + (1 |a| gr(Phylo, cov = A)),
                        data = Sylviidae_dat,
                        data2 = list(A = A),
                        family = categorical(link = "logit")
                        )

system.time(
  brms_mn2_3 <- brm(Habitat_Category ~ cMass + IsMigrate + (1 |a| gr(Phylo, cov = A)),
                data = Sylviidae_dat,
                data2 = list(A = A),
                family = categorical(link = "logit"),
                prior = priors_brms3,
                iter = 18000, 
                warmup = 8000,
                chains = 2,
                thin = 1,
                )
            )
summary(mcmcglmm_mn2_3)

 Iterations = 2250001:9742501
 Thinning interval  = 7500
 Sample size  = 1000 

 DIC: 353.666 

 G-structure:  ~us(trait):Phylo

                                                                                        post.mean
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                       9.083
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo              1.660
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo              1.660
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo    20.892
                                                                                        l-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                      1.196
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo            -4.574
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo            -4.574
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo    2.772
                                                                                        u-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                      21.74
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo              8.40
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo              8.40
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo    44.08
                                                                                        eff.samp
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                      718.3
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo             779.0
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo             779.0
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo    678.4

 R-structure:  ~us(trait):units

                                                                                        post.mean
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                      0.6667
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units             0.3333
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units             0.3333
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units    0.6667
                                                                                        l-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                     0.6667
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units            0.3333
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units            0.3333
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units   0.6667
                                                                                        u-95% CI
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                     0.6667
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units            0.3333
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units            0.3333
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units   0.6667
                                                                                        eff.samp
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                          0
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units                 0
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units                 0
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units        0

 Location effects: Habitat_Category ~ cMass:trait + IsMigrate:trait + trait - 1 

                                                   post.mean l-95% CI u-95% CI
traitHabitat_Category.Arid_Open                       2.3783  -0.5852   5.1494
traitHabitat_Category.Forested_Vegetated              2.4444  -0.8470   6.7465
cMass:traitHabitat_Category.Arid_Open                -0.8778  -2.6132   0.6125
cMass:traitHabitat_Category.Forested_Vegetated        0.8132  -1.2024   2.7663
traitHabitat_Category.Arid_Open:IsMigrate             0.4607  -0.9485   1.8999
traitHabitat_Category.Forested_Vegetated:IsMigrate    3.0348   1.1467   5.1125
                                                   eff.samp  pMCMC    
traitHabitat_Category.Arid_Open                       988.4  0.090 .  
traitHabitat_Category.Forested_Vegetated             1000.0  0.172    
cMass:traitHabitat_Category.Arid_Open                1588.3  0.258    
cMass:traitHabitat_Category.Forested_Vegetated       1000.0  0.424    
traitHabitat_Category.Arid_Open:IsMigrate             905.7  0.544    
traitHabitat_Category.Forested_Vegetated:IsMigrate   1000.0 <0.001 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
res_1 <- mcmcglmm_mn2_3$Sol / sqrt(1+c2a) # for fixed effects
res_2 <- mcmcglmm_mn2_3$VCV / (1+c2a) # for variance components
res_3_corr_phylo <- (mcmcglmm_mn2_3$VCV[, 2]/(1+c2a)) /sqrt((mcmcglmm_mn2_3$VCV[, 1] * mcmcglmm_mn2_3$VCV[, 4])/(1+c2a))

summary(res_1)

Iterations = 2250001:9742501
Thinning interval = 7500 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                                      Mean     SD Naive SE
traitHabitat_Category.Arid_Open                     2.1440 1.2735  0.04027
traitHabitat_Category.Forested_Vegetated            2.2035 1.7196  0.05438
cMass:traitHabitat_Category.Arid_Open              -0.7913 0.7334  0.02319
cMass:traitHabitat_Category.Forested_Vegetated      0.7331 0.9346  0.02955
traitHabitat_Category.Arid_Open:IsMigrate           0.4153 0.6657  0.02105
traitHabitat_Category.Forested_Vegetated:IsMigrate  2.7358 0.9074  0.02870
                                                   Time-series SE
traitHabitat_Category.Arid_Open                           0.04051
traitHabitat_Category.Forested_Vegetated                  0.05438
cMass:traitHabitat_Category.Arid_Open                     0.01840
cMass:traitHabitat_Category.Forested_Vegetated            0.02955
traitHabitat_Category.Arid_Open:IsMigrate                 0.02212
traitHabitat_Category.Forested_Vegetated:IsMigrate        0.02870

2. Quantiles for each variable:

                                                      2.5%      25%     50%
traitHabitat_Category.Arid_Open                    -0.4477  1.37995  2.1559
traitHabitat_Category.Forested_Vegetated           -0.9808  1.11585  2.1584
cMass:traitHabitat_Category.Arid_Open              -2.3303 -1.24354 -0.7433
cMass:traitHabitat_Category.Forested_Vegetated     -1.0181  0.08259  0.6961
traitHabitat_Category.Arid_Open:IsMigrate          -0.9151 -0.03301  0.4117
traitHabitat_Category.Forested_Vegetated:IsMigrate  1.1262  2.12920  2.6469
                                                       75%  97.5%
traitHabitat_Category.Arid_Open                     2.8690 4.8198
traitHabitat_Category.Forested_Vegetated            3.1992 5.9649
cMass:traitHabitat_Category.Arid_Open              -0.2922 0.5856
cMass:traitHabitat_Category.Forested_Vegetated      1.3172 2.7128
traitHabitat_Category.Arid_Open:IsMigrate           0.8748 1.7025
traitHabitat_Category.Forested_Vegetated:IsMigrate  3.2897 4.7233
summary(res_2)

Iterations = 2250001:9742501
Thinning interval = 7500 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                                                                           Mean
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    7.3814
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           1.3490
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           1.3490
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 16.9777
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                            SD
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    5.319
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           2.643
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           2.643
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 10.703
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.000
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.000
                                                                                        Naive SE
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    0.16821
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           0.08358
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           0.08358
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo  0.33847
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.00000
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.00000
                                                                                        Time-series SE
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                          0.19848
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo                 0.09469
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo                 0.09469
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo        0.41092
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                          0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units                 0.00000
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units                 0.00000
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units        0.00000

2. Quantiles for each variable:

                                                                                           2.5%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    1.5684
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo          -3.6719
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo          -3.6719
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo  4.2987
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                            25%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    3.7512
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo          -0.1351
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo          -0.1351
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo  9.7029
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                            50%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    5.9438
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           1.1143
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           1.1143
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 14.3953
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                            75%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                    9.6229
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           2.6647
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           2.6647
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 21.6290
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
                                                                                          97.5%
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.Phylo                   20.8510
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.Phylo           7.1611
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.Phylo           7.1611
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.Phylo 43.4114
traitHabitat_Category.Arid_Open:traitHabitat_Category.Arid_Open.units                    0.5418
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Arid_Open.units           0.2709
traitHabitat_Category.Arid_Open:traitHabitat_Category.Forested_Vegetated.units           0.2709
traitHabitat_Category.Forested_Vegetated:traitHabitat_Category.Forested_Vegetated.units  0.5418
summary(res_3_corr_phylo)

Iterations = 2250001:9742501
Thinning interval = 7500 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
      0.117832       0.197752       0.006253       0.006253 

2. Quantiles for each variable:

    2.5%      25%      50%      75%    97.5% 
-0.27624 -0.01752  0.12623  0.25980  0.49079 
summary(brms_mn2_3)
 Family: categorical 
  Links: muAridOpen = logit; muForestedVegetated = logit 
Formula: Habitat_Category ~ cMass + IsMigrate + (1 | a | gr(Phylo, cov = A)) 
   Data: Sylviidae_dat (Number of observations: 294) 
  Draws: 2 chains, each with iter = 18000; warmup = 8000; thin = 1;
         total post-warmup draws = 20000

Multilevel Hyperparameters:
~Phylo (Number of levels: 294) 
                                                        Estimate Est.Error
sd(muAridOpen_Intercept)                                    2.93      1.24
sd(muForestedVegetated_Intercept)                           6.45      3.41
cor(muAridOpen_Intercept,muForestedVegetated_Intercept)     0.35      0.35
                                                        l-95% CI u-95% CI Rhat
sd(muAridOpen_Intercept)                                    1.28     5.93 1.00
sd(muForestedVegetated_Intercept)                           2.65    15.31 1.00
cor(muAridOpen_Intercept,muForestedVegetated_Intercept)    -0.42     0.90 1.00
                                                        Bulk_ESS Tail_ESS
sd(muAridOpen_Intercept)                                    2322     3372
sd(muForestedVegetated_Intercept)                           1254     1903
cor(muAridOpen_Intercept,muForestedVegetated_Intercept)      680     1883

Regression Coefficients:
                              Estimate Est.Error l-95% CI u-95% CI Rhat
muAridOpen_Intercept              2.08      1.30    -0.42     4.81 1.00
muForestedVegetated_Intercept     0.81      2.34    -4.26     5.08 1.00
muAridOpen_cMass                 -0.64      0.81    -2.26     0.92 1.00
muAridOpen_IsMigrate              0.53      0.72    -0.92     1.94 1.00
muForestedVegetated_cMass         1.44      1.38    -0.81     4.62 1.00
muForestedVegetated_IsMigrate     3.63      1.73     1.37     7.83 1.00
                              Bulk_ESS Tail_ESS
muAridOpen_Intercept              3324     5760
muForestedVegetated_Intercept     4304     3182
muAridOpen_cMass                  4954     9099
muAridOpen_IsMigrate              5983     9135
muForestedVegetated_cMass         3507     3935
muForestedVegetated_IsMigrate     2677     2363

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Summary

  • Baseline habitat use: When migration status and body mass are not considered, Sylviidae species are somewhat more likely to be in Arid/Open or Forested/Vegetated habitats than in the reference category (Aquatic/Coastal), but the statistical uncertainty is large enough that we cannot be confident in these differences. Body mass effects: There is no clear evidence that heavier or lighter species prefer either Arid/Open or Forested/Vegetated habitats.

  • Migration effects: Migratory species are much more likely to be associated with Forested/Vegetated habitats compared to Aquatic/Coastal ones, even after accounting for body mass and phylogeny. Migration does not appear to strongly influence the likelihood of using Arid/Open habitats.

  • Phylogenetic patterns: The evolutionary tendencies for using Arid/Open vs. Forested/Vegetated habitats are only weakly related, and the data do not clearly indicate a consistent positive or negative correlation across the family’s evolutionary history.

Extension: models with multiple data point per species

Tips for simulating dataset

Unfortunately, there are few published dataset for multiple datapoints per species. Therefore, we generate 1. Gaussian, 2. binary. 3. ordinal, and 4. nominal data for this section and use for MCMCglmm and brms models.

Before moving each section, we share some tips we need to keep in mind when simulating data:

  1. Make the dataset as realistic as possible
    Simulated data should reflect the real-world context. If you are simulating biological data, consider the typical distributions, relationships, and range of values that could appear in your real dataset. You may need check published papers whether your assumption is suitable.

  2. Carefully define random effect(s) and fixed effect(s) parameters
    For generating a response variable, ensure that both fixed and random effects are specified clearly. The fixed effects could represent known, deterministic influences (like treatment groups), while random effects account for variability due to unmeasured factors (like species or individuals). When setting residuals (errors) for the Gaussian response variable, remember that the variance of the residuals should typically be smaller than the variance of the random effects. Because they represent smaller-scale noise, while random effects capture larger-scale variability. If you set the residual variance is larger, it can lead model unrealistic.

  3. Check true values (we set) and model estimates
    After generating the data and fitting your model, compare the “true” parameter values you set for the simulation with the estimates your model provides. This will help you assess the accuracy of your model and how well it can recover the true underlying parameters. It is important to check whether the model overestimates or underestimates certain effects.

  4. Ensure all created variables are properly aligned
    When simulating data, it is essential to check that all the variables are correctly aligned and correspond to each other as intended. Specifically, pay attention to the following points:

    • Matching species names with those in the phylogenetic tree
      Ensure that the species names used in your data set are consistent with those in the phylogenetic tree (if applicable). If these names don’t match, the integrity of the data could be compromised, leading to inaccurate results (or model can not run). For instance, if a species is named “SpeciesA” in the data but referred to as “SpeciesB” in the phylogeny, the data won’t be interpreted correctly.

    • Correspondence between variables
      Also check that other variables, such as individual IDs, environmental factors, gender, or group differences, are correctly matched. For example, verify that data points for each individual are correctly associated with the intended ID, and that variables like age and gender follow expected distributions. How to check: After creating your data frame, use functions, such as head(), summary(), or View() in R (you can find other functions as well) to check for any discrepancies, such as duplicated species names or missing values. This helps that your dataset is internally consistent.

  5. Other small things…

  • Read the documentation
    It may not be worth mentioning again, but always read the documentation for the functions yo are using. Understand how they work, what parameters they take, and what outputs they produce.
  • Test with small examples
    If you are using a new function, start by testing it on a small sample of data to see how it behaves before using it in your actual analysis. This helps you avoid larger mistakes when you apply it to the full dataset.
  • Stick to familiar methods
    However, stick to methods and functions you are familiar with, especially when you are starting. As you encounter issues or need more advanced features, gradually learn new functions and techniques. It is a safer approach and will help you build confidence.

Simulate datasets and run models

In this extension, We aim to create a dataset for models that take different types of response variables: Gaussian, binary, ordinal, and nominal traits. The explanatory variables are all the same - body mass and sex. All datasets were generated based on observations of 100 bird species, with each species being observed 5 times.

Gaussian

We simulated the habitat range (log-transformed) of each species as the continuous response variable. We assumed habitat range is more strongly influenced by body mass than by sex, and phylogenetic factors are expected to have a greater impact than non-phylogenetic factors.

set.seed(1234)
tree <- pbtree(n = 100, scale = 1)

# extract the correlation matrix from the phylogenetic tree
phylo_cor <- vcv(tree, corr = TRUE)

# define parameters
n_species <- 100  # number of species (clusters)
obs_per_species <- 5  # number of observations per species
n <- n_species * obs_per_species  # total number of observations

# simulate fixed effects 
## continuous variable: body mass 
mu_mass <- (log(4)+ log(2500))/2 # mean of whole 100 species body mass
mu_species_mass <- mvrnorm(n = 1, # species level body mass
                           mu = rep(mu_mass, ncol(phylo_cor)),
                           Sigma = phylo_cor*1) 

obs_mass <-  sapply(mu_species_mass, 
                   function(x) rnorm(obs_per_species, mean = x, sd = sqrt(0.1))) %>%  
                   as.vector() # each observation body mass

## binomial variable: sex
sex <- rbinom(n, 1, 0.5)

# simulate random effect parameter
## variance components
sigma_phylo <- 2 
sigma_non <- 1 # standard deviation of non-phylogenetic random effect

# phylogenetic random effect
random_effect_phylo <- mvrnorm(
  n = 1,
  mu = rep(0, ncol(phylo_cor)), 
  Sigma = sigma_phylo*phylo_cor)
phylo_effect <- rep(random_effect_phylo, each = obs_per_species)

# non-phylogenetic random effect
random_effect_non_phylo <- rnorm(n_species, sd = sqrt(sigma_non))
non_phylo_effect <- rep(random_effect_non_phylo, each = obs_per_species)

# residual
sigma_residual <- 0.2
residual <- rnorm(n, mean = 0, sd = sqrt(sigma_residual))

# define fixed effect parameter
beta_c0 <- 0 # intercept
beta_c1 <- 1.2 # coefficient for body mass
beta_c2 <- 0.1 # coefficient for sex

y <- beta_c0 + beta_c1 * obs_mass + beta_c2 * sex + phylo_effect + non_phylo_effect + residual

sim_data1 <- data.frame(
  habitat_range = y,
  mass = obs_mass,
  sex = factor(sex, labels = c("F", "M")),
  species = names(phylo_effect),
  individual_id = factor(1:n),  # create unique individual IDs for each observation
  phylo = names(phylo_effect)
  ) %>% 
  arrange(species)

Run models

Ainv <- inverseA(tree, nodes = "ALL", scale = TRUE)$Ainv
prior1_mcmcglmm <- list(R = list(V = 1, nu = 0.002), 
                        G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 100),
                                 G2 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 100)
                                 )
                        )

system.time(
  mod1_mcmcglmm <- MCMCglmm(
    habitat_range ~ 1,
    random = ~ phylo + species,        
    ginverse = list(phylo = Ainv), 
    prior = prior1_mcmcglmm,          
    family = "gaussian",    
    data = sim_data1,              
    nitt = 13000*60,       
    thin = 10*60,         
    burnin = 3000*60      
  ) 
)

system.time(
  mod1_mcmcglmm2 <- MCMCglmm(
    habitat_range ~ mass,
    random = ~ phylo + species,        
    ginverse = list(phylo = Ainv), 
    prior = prior1_mcmcglmm,          
    family = "gaussian",    
    data = sim_data1,              
    nitt = 13000*100,       
    thin = 10*100,         
    burnin = 3000*100      
  ) 
)

system.time(
  mod1_mcmcglmm3 <- MCMCglmm(
    habitat_range ~ mass + sex,
    random = ~ phylo + species,        
    ginverse = list(phylo = Ainv), 
    prior = prior1_mcmcglmm,          
    family = "gaussian",    
    data = sim_data1,              
    nitt = 13000*250,       
    thin = 10*250,         
    burnin = 3000*250      
  ) 
)

# brms

A <- ape::vcv.phylo(tree, corr = TRUE)

priors1_brms <- default_prior(
  habitat_range ~ 1 + (1 | gr(phylo, cov = A)) + (1 | species), 
  data = sim_data1, 
  data2 = list(A = A),
  family = gaussian()
)

system.time(
  mod1_brms <- brm(habitat_range ~ 1 + (1 | gr(phylo, cov = A)) + (1 | species), 
                   data = sim_data1, 
                   data2 = list(A = A),
                   family = gaussian(),
                   prior = priors1_brms,
                   iter = 8000, 
                   warmup = 6000, 
                   thin = 1,
                   chain = 2,
                   control = list(adapt_delta = 0.95),
  )
)

priors1_brms2 <- default_prior(
  habitat_range ~ mass + (1 | gr(phylo, cov = A)) + (1 | species), 
  data = sim_data1, 
  data2 = list(A = A),
  family = gaussian()
)

system.time(
  mod1_brms2 <- brm(habitat_range ~ mass + (1 | gr(phylo, cov = A)) + (1 | species), 
                   data = sim_data1, 
                   data2 = list(A = A),
                   family = gaussian(),
                   prior = priors1_brms2,
                   iter = 8000, 
                   warmup = 6000, 
                   thin = 1,
                   chain = 2,
                   control = list(adapt_delta = 0.95),
  )
)

priors1_brms3 <- default_prior(
  habitat_range ~ mass + sex + (1 | gr(phylo, cov = A)) + (1 | species), 
  data = sim_data1, 
  data2 = list(A = A),
  family = gaussian()
)

system.time(
  mod1_brms3 <- brm(habitat_range ~ mass + sex + (1 | gr(phylo, cov = A)) + (1 | species), 
                   data = sim_data1, 
                   data2 = list(A = A),
                   family = gaussian(),
                   prior = priors1_brms3,
                   iter = 8000, 
                   warmup = 6000, 
                   thin = 1,
                   chain = 2,
                   control = list(adapt_delta = 0.95),
  )
)

Results

actual mean and variance of the model is…

mean(y)
[1] 5.560533
var(random_effect_phylo)
[1] 0.9475315
var(random_effect_non_phylo)
[1] 0.8428301
var(residual)
[1] 0.1892837
autocorr.plot(mod1_mcmcglmm$Sol)

autocorr.plot(mod1_mcmcglmm$VCV)

summary(mod1_mcmcglmm)

 Iterations = 180001:779401
 Thinning interval  = 600
 Sample size  = 1000 

 DIC: 998.4984 

 G-structure:  ~phylo

      post.mean l-95% CI u-95% CI eff.samp
phylo     1.165   0.1108    2.704    881.6

               ~species

        post.mean l-95% CI u-95% CI eff.samp
species     1.036   0.5946    1.504     1323

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units    0.3573   0.3125   0.4029     1000

 Location effects: habitat_range ~ 1 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)     5.239    4.131    6.302     1000 <0.001 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
posterior_summary(mod1_mcmcglmm$Sol)
            Estimate Est.Error     Q2.5    Q97.5
(Intercept) 5.238814 0.5481793 4.036911 6.270514
posterior_summary(mod1_mcmcglmm$VCV)
         Estimate  Est.Error      Q2.5     Q97.5
phylo   1.1653905 0.78259640 0.1622860 2.9869859
species 1.0359759 0.24017414 0.6052211 1.5251053
units   0.3572733 0.02396132 0.3150877 0.4083013
summary(mod1_brms)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: habitat_range ~ 1 + (1 | gr(phylo, cov = A)) + (1 | species) 
   Data: sim_data1 (Number of observations: 500) 
  Draws: 2 chains, each with iter = 8000; warmup = 6000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~phylo (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.99      0.36     0.33     1.76 1.00      345      722

~species (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.02      0.12     0.78     1.26 1.00      661     1451

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     5.28      0.51     4.22     6.30 1.00     4816     2515

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.60      0.02     0.56     0.64 1.00     3973     2562

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

We will now verify whether the model could show estimates close to the true values from the simulated data - there are several points to check during this process. First, confirm whether the model converged by examining metrics such as the effective sampling size, autocorrelation, and R-hat values. Then, compare the true value with model point estimates and check the 95% credible intervals of point estimates. The intercept-only model does not include any independent variables. Instead, it focuses on the intercept (mean value) and variance components to explain the distribution of the dependent variable, habitat range. Looking at the results, the model appears to estimate the mean (fixed effect beta/intercept) and random effect variances close to their true values. However, the residual variance is not accurately estimated.

# MCMCglmm
phylo_signal_mod1_mcmcglmm <- ((mod1_mcmcglmm$VCV[, "phylo"]) / (mod1_mcmcglmm$VCV[, "phylo"] + mod1_mcmcglmm$VCV[, "species"] + mod1_mcmcglmm$VCV[, "units"]))
phylo_signal_mod1_mcmcglmm %>% mean()
[1] 0.4170196
phylo_signal_mod1_mcmcglmm %>% quantile(probs = c(0.025,0.5,0.975))
      2.5%        50%      97.5% 
0.08783016 0.42359454 0.73669927 
# brms
phylo_signal_mod1_brms <- mod1_brms %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_phylo__Intercept, Sigma_non_phy = sd_species__Intercept, Res = sigma) %>% 
        mutate(lambda_gaussian = Sigma_phy^2 / (Sigma_phy^2 + Sigma_non_phy^2 + Res^2)) %>% 
        pull(lambda_gaussian)

phylo_signal_mod1_brms %>% mean()
[1] 0.3984794
phylo_signal_mod1_brms %>% quantile(probs = c(0.025,0.5,0.975))
      2.5%        50%      97.5% 
0.06146131 0.39645518 0.74712062 

The phylogenetic signal was estimated as 0.42 (95% CI: 0.09, 0.74) using MCMCglmm and 0.40 (95% CI: 0.06, 0.75) using brms.It is important to note that we can now consider non-phylogenetic variance (here, species) more accurately, as the non-phylogenetic variance is directly estimated and is not confounded by within-species effects or other sources of variance.

summary(mod1_mcmcglmm2)

 Iterations = 300001:1299001
 Thinning interval  = 1000
 Sample size  = 1000 

 DIC: 680.8927 

 G-structure:  ~phylo

      post.mean l-95% CI u-95% CI eff.samp
phylo      2.31   0.5428    4.753    844.2

               ~species

        post.mean l-95% CI u-95% CI eff.samp
species     1.164   0.6272     1.72    867.3

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units    0.1881   0.1618   0.2127     1000

 Location effects: habitat_range ~ mass 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)   -0.2339  -1.7632   1.4823     1000  0.774    
mass           1.1143   0.9890   1.2515     1000 <0.001 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
posterior_summary(mod1_mcmcglmm2$Sol)
             Estimate  Est.Error       Q2.5    Q97.5
(Intercept) -0.233917 0.86362777 -1.8194889 1.430111
mass         1.114301 0.06819815  0.9859776 1.250126
posterior_summary(mod1_mcmcglmm2$VCV)
        Estimate Est.Error      Q2.5     Q97.5
phylo   2.309659 1.1768520 0.7238491 5.2748480
species 1.163885 0.2830942 0.6511874 1.7726979
units   0.188124 0.0129251 0.1628494 0.2151048
summary(mod1_brms2)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: habitat_range ~ mass + (1 | gr(phylo, cov = A)) + (1 | species) 
   Data: sim_data1 (Number of observations: 500) 
  Draws: 2 chains, each with iter = 8000; warmup = 6000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~phylo (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.44      0.34     0.84     2.18 1.00      563     1266

~species (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.07      0.13     0.82     1.34 1.00      851     1427

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -0.15      0.80    -1.73     1.45 1.00     3901     2789
mass          1.11      0.07     0.98     1.25 1.00     6559     2347

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.43      0.02     0.40     0.47 1.00     3843     2959

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Next, regarding the model that includes log-transformed body mass as an explanatory variable, the results showed that, as we had set(beta_c1), mass was estimated to have a positive effect on habitat range. Additionally, examining the residuals reveals that they are much closer to the true value compared to the intercept-only model. This improvement suggests that including an explanatory variable helps the model capture more of the variability in the data, reducing the unexplained variance (residual variance).

summary(mod1_mcmcglmm3)

 Iterations = 750001:3247501
 Thinning interval  = 2500
 Sample size  = 1000 

 DIC: 668.5273 

 G-structure:  ~phylo

      post.mean l-95% CI u-95% CI eff.samp
phylo     2.325   0.5523    4.646     1073

               ~species

        post.mean l-95% CI u-95% CI eff.samp
species     1.151   0.6569    1.764    900.3

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units    0.1832   0.1572   0.2097     1000

 Location effects: habitat_range ~ mass + sex 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)  -0.28323 -1.97048  1.38771     1000  0.674    
mass          1.11239  0.97750  1.24672     1000 <0.001 ***
sexM          0.15016  0.06697  0.22862     1000 <0.001 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
posterior_summary(mod1_mcmcglmm3$Sol)
              Estimate  Est.Error        Q2.5     Q97.5
(Intercept) -0.2832264 0.85695461 -1.93569141 1.4800298
mass         1.1123899 0.06844718  0.97159841 1.2423309
sexM         0.1501593 0.04167755  0.06778142 0.2286585
posterior_summary(mod1_mcmcglmm3$VCV)
         Estimate  Est.Error      Q2.5     Q97.5
phylo   2.3252270 1.16940589 0.7405934 5.2388026
species 1.1512006 0.29343601 0.6567145 1.7615075
units   0.1831836 0.01342064 0.1586928 0.2117723
summary(mod1_brms3)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: habitat_range ~ mass + sex + (1 | gr(phylo, cov = A)) + (1 | species) 
   Data: sim_data1 (Number of observations: 500) 
  Draws: 2 chains, each with iter = 8000; warmup = 6000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~phylo (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.43      0.34     0.84     2.14 1.00      619     1255

~species (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.08      0.13     0.83     1.33 1.00      964     1578

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -0.22      0.77    -1.76     1.26 1.00     3957     2850
mass          1.11      0.07     0.98     1.24 1.00     8492     2851
sexM          0.15      0.04     0.06     0.23 1.00     8621     2798

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.43      0.02     0.40     0.46 1.00     4112     2962

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Finally, let’s take a look at the results when both mass and sex are included in the model. We assumed that males are, on average, 0.1 units larger than females. This assumption is successfully reflected in the model estimates

Binary

The response variable in this analysis is breeding success, defined as whether a species reproduced or not. The assumptions are as follows: phylogeny has a weak relationship with breeding success, while some non-phylogenetic factors, such as environmental conditions, are likely to play a role. Additionally, heavier species are assumed to have a higher likelihood of breeding success, and females (coded as 0) are expected to succeed in breeding more often than males.

set.seed(456)

tree2 <- pbtree(n = 100, scale = 1)

# extract the correlation matrix from the phylogenetic tree
phylo_cor <- vcv(tree2, corr = TRUE)

# define parameters
n_species <- 100  # number of species (clusters)
obs_per_species <- 5  # number of observations per species
n <- n_species * obs_per_species  # total number of observations

# simulate fixed effects 
## continuous variable: body mass 
mu_mass <- (log(4)+ log(2500))/2 # mean of whole 100 species body mass
mu_species_mass <- mvrnorm(n = 1, # species level body mass
                           mu = rep(mu_mass, ncol(phylo_cor)),
                           Sigma = phylo_cor*1) 

obs_mass <- sapply(mu_species_mass, 
                   function(x) rnorm(obs_per_species, mean = x, sd = sqrt(0.2)))  %>%  
                   as.vector() # each observation body mass

## binomial variable: sex
sex <- rbinom(n, 1, 0.5)

# simulate random effect parameter
## variance components
sigma_phylo <- 0.4
sigma_non <- 0.3 # standard deviation of non-phylogenetic random effect

## phylogenetic random effect
random_effect_phylo <- mvrnorm(
  n = 1, 
  mu = rep(0, ncol(phylo_cor)), 
  Sigma = sigma_phylo*phylo_cor)
phylo_effect <- rep(random_effect_phylo, each = obs_per_species)

## non-phylogenetic random effect
random_effect_non_phylo <- rnorm(n, sd = sqrt(sigma_non))
non_phylo_effect <- rep(random_effect_non_phylo, each = obs_per_species)

# define fixed effect parameter
beta_b0 <- -0.5 # intercept 
beta_b1 <- 0.4 # coefficient for body mass  
beta_b2 <- -1.5 # coefficient for sex 

# simulate linear predictor
eta1 <- beta_b0 + beta_b1 * obs_mass + beta_b2 * sex + phylo_effect + non_phylo_effect
# eta1 <- beta_b0  + phylo_effect + non_phylo_effect

# probability
p1 <- 1 / (1 + exp(-eta1))
y1 <- rbinom(n, 1, p1)

# combine simulated data
sim_data2 <- data.frame(
    breeding = y1,
    mass = obs_mass,
    sex = factor(sex, labels = c("F", "M")),
    species = names(phylo_effect),
    individual_id = factor(1:n),  # create unique individual IDs for each observation
    phylo = names(phylo_effect)
  ) %>% 
  arrange(species) %>% 
  as.data.frame()

head(sim_data2, 20)
   breeding     mass sex species individual_id phylo
1         0 4.425580   M      t1           146    t1
2         0 4.518592   M      t1           147    t1
3         0 4.553237   M      t1           148    t1
4         1 4.728218   F      t1           149    t1
5         1 4.125475   F      t1           150    t1
6         1 4.181977   M     t10           251   t10
7         1 4.366759   F     t10           252   t10
8         0 4.286815   F     t10           253   t10
9         0 3.998315   M     t10           254   t10
10        1 3.877700   F     t10           255   t10
11        1 5.367269   M    t100           311  t100
12        0 4.287155   M    t100           312  t100
13        0 4.715682   F    t100           313  t100
14        1 4.969713   M    t100           314  t100
15        1 4.818693   F    t100           315  t100
16        0 3.750907   M     t11           256   t11
17        1 4.016580   F     t11           257   t11
18        1 3.722079   F     t11           258   t11
19        1 3.325677   M     t11           259   t11
20        1 3.100769   F     t11           260   t11

Run models

# MCMCglmm

inv_phylo <- inverseA(tree2, nodes = "ALL", scale = TRUE)
prior2_mcmcglmm <- list(R = list(V = 1, fix = 1),
                        G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10),
                                 G2 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)
                        )
)

system.time(
  mod2_mcmcglmm_logit <- MCMCglmm(breeding ~ 1, 
                                  random = ~ phylo + species, 
                                  family = "categorical", 
                                  data = sim_data2, 
                                  prior = prior2_mcmcglmm, 
                                  ginverse = list(phylo = inv_phylo$Ainv),
                                  nitt = 13000*10, 
                                  thin = 10*10, 
                                  burnin = 3000*10)                  
)

system.time(
  mod2_mcmcglmm_logit2 <- MCMCglmm(breeding ~ mass, 
                                  random = ~ phylo + species, 
                                  family = "categorical", 
                                  data = sim_data2, 
                                  prior = prior2_mcmcglmm, 
                                  ginverse = list(phylo = inv_phylo$Ainv),
                                  nitt = 13000*20, 
                                  thin = 10*20, 
                                  burnin = 3000*20)                  
)


system.time(
  mod2_mcmcglmm_logit3 <- MCMCglmm(breeding ~ mass + sex, 
                                  random = ~ phylo + species, 
                                  family = "categorical", 
                                  data = sim_data2, 
                                  prior = prior2_mcmcglmm, 
                                  ginverse = list(phylo = inv_phylo$Ainv),
                                  nitt = 13000*20, 
                                  thin = 10*20, 
                                  burnin = 3000*20)                  
)

# brms

A <- ape::vcv.phylo(tree2, corr = TRUE)

priors2_brms <- default_prior(
  breeding ~ 1 + (1 | gr(phylo, cov = A)) + (1 | species), 
  data = sim_data2, 
  data2 = list(A = A),
  family = bernoulli(link = "logit")
)

system.time(
  mod2_brms_logit <- brm(breeding ~ 1 + (1 | gr(phylo, cov = A)) + (1 | species), 
                         data = sim_data2, 
                         data2 = list(A = A),
                         family = bernoulli(link = "logit"),
                         prior = priors2_brms,
                         iter = 8000, 
                         warmup = 6000, 
                         thin = 1,
                         chain = 2,
                         control = list(adapt_delta = 0.95),
  )
)

priors2_brms2 <- default_prior(
  breeding ~ mass + (1 | gr(phylo, cov = A)) + (1 | species), 
  data = sim_data2, 
  data2 = list(A = A),
  family = bernoulli(link = "logit")
)

system.time(
  mod2_brms_logit2 <- brm(breeding ~ mass + (1 | gr(phylo, cov = A)) + (1 | species), 
                         data = sim_data2, 
                         data2 = list(A = A),
                         family = bernoulli(link = "logit"),
                         prior = priors2_brms2,
                         iter = 8000, 
                         warmup = 6000, 
                         thin = 1,
                         chain = 2,
                         control = list(adapt_delta = 0.95),
  )
)

priors2_brms3 <- default_prior(
  breeding ~ mass + sex + (1 | gr(phylo, cov = A)) + (1 | species), 
  data = sim_data2, 
  data2 = list(A = A),
  family = bernoulli(link = "logit")
)

system.time(
  mod2_brms_logit3 <- brm(breeding ~ mass + sex + (1 | gr(phylo, cov = A)) + (1 | species), 
                         data = sim_data2, 
                         data2 = list(A = A),
                         family = bernoulli(link = "logit"),
                         prior = priors2_brms3,
                         iter = 8000, 
                         warmup = 6000, 
                         thin = 1,
                         chain = 2,
                         control = list(adapt_delta = 0.95),
  )
)

Results

True values are here:

# actual mean and variance of y
mean(eta1)
[1] 1.191048
var(eta1)
[1] 1.277676
var(random_effect_phylo)
[1] 0.2981068
var(random_effect_non_phylo)
[1] 0.3090161

Before checking the output,we need to correct the results from MCMCglmm using the following equation:

c2 <- (16 * sqrt(3) / (15 * pi))^2
summary(mod2_mcmcglmm_logit)

 Iterations = 30001:129901
 Thinning interval  = 100
 Sample size  = 1000 

 DIC: 550.715 

 G-structure:  ~phylo

      post.mean l-95% CI u-95% CI eff.samp
phylo    0.9548 8.97e-06    2.355    639.4

               ~species

        post.mean  l-95% CI u-95% CI eff.samp
species    0.4156 3.523e-08    1.201    486.4

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: breeding ~ 1 

            post.mean l-95% CI u-95% CI eff.samp pMCMC  
(Intercept)    1.1605   0.1331   2.1428     1272 0.034 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
res_1 <- mod2_mcmcglmm_logit$Sol / sqrt(1+c2) # for fixed effects
res_2 <- mod2_mcmcglmm_logit$VCV / (1+c2) # for variance components

summary(res_1)

Iterations = 30001:129901
Thinning interval = 100 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
       1.00036        0.43343        0.01371        0.01215 

2. Quantiles for each variable:

  2.5%    25%    50%    75%  97.5% 
0.1144 0.7425 1.0093 1.2709 1.8407 
summary(res_2)

Iterations = 30001:129901
Thinning interval = 100 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean     SD Naive SE Time-series SE
phylo   0.7094 0.5174 0.016361        0.02046
species 0.3088 0.2845 0.008996        0.01290
units   0.7430 0.0000 0.000000        0.00000

2. Quantiles for each variable:

           2.5%    25%    50%    75% 97.5%
phylo   0.02672 0.3361 0.5973 0.9889 2.040
species 0.00126 0.0810 0.2530 0.4560 1.028
units   0.74303 0.7430 0.7430 0.7430 0.743
summary(mod2_brms_logit)
 Family: bernoulli 
  Links: mu = logit 
Formula: breeding ~ 1 + (1 | gr(phylo, cov = A)) + (1 | species) 
   Data: sim_data2 (Number of observations: 500) 
  Draws: 2 chains, each with iter = 8000; warmup = 6000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~phylo (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.77      0.31     0.18     1.41 1.01      542      820

~species (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.51      0.25     0.03     1.00 1.00      475      924

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     0.99      0.42     0.13     1.81 1.00     1407     1848

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

The fixed effect estimates in both MCMCglmm and brms are close to the true values, but the phylogenetic random effects are not performing well. However, the wide 95% confidence intervals indicate that they have very large uncertainties.

And phylogenetic signal is…

# MCMCglmm
phylo_signal_mod2_mcmcglmm <- ((mod2_mcmcglmm_logit$VCV[, "phylo"]/(1+c2)) / (mod2_mcmcglmm_logit$VCV[, "phylo"]/(1+c2) + mod2_mcmcglmm_logit$VCV[, "species"]/(1+c2) + 1))
phylo_signal_mod2_mcmcglmm %>% mean()
[1] 0.3233245
phylo_signal_mod2_mcmcglmm %>% quantile(probs = c(0.025,0.5,0.975))
      2.5%        50%      97.5% 
0.01696071 0.32167758 0.64318606 
# brms
phylo_signal_mod2_brms <- mod2_brms_logit %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_phylo__Intercept, Sigma_non_phy = sd_species__Intercept) %>% 
        mutate(lambda_binary = Sigma_phy^2 / (Sigma_phy^2 + Sigma_non_phy^2 +1)) %>% 
        pull(lambda_binary)

phylo_signal_mod2_brms %>% mean()
[1] 0.3117823
phylo_signal_mod2_brms %>% quantile(probs = c(0.025,0.5,0.975))
      2.5%        50%      97.5% 
0.01973947 0.30714261 0.63302530 

The estimated phylogenetic signal values were similar in both models.

summary(mod2_mcmcglmm_logit2)

 Iterations = 60001:259801
 Thinning interval  = 200
 Sample size  = 1000 

 DIC: 543.8511 

 G-structure:  ~phylo

      post.mean  l-95% CI u-95% CI eff.samp
phylo     1.002 2.289e-06    2.566    972.4

               ~species

        post.mean  l-95% CI u-95% CI eff.samp
species    0.5078 4.507e-06     1.34    868.1

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: breeding ~ mass 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)   -1.5979  -3.6412   0.6322     1000  0.126    
mass           0.6094   0.2202   1.0957     1000 <0.001 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
c2 <- (16 * sqrt(3) / (15 * pi))^2
res_1 <- mod2_mcmcglmm_logit2$Sol / sqrt(1+c2) # for fixed effects
res_2 <- mod2_mcmcglmm_logit2$VCV / (1+c2) # for variance components

summary(res_1)

Iterations = 60001:259801
Thinning interval = 200 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

               Mean     SD Naive SE Time-series SE
(Intercept) -1.3774 0.9619 0.030419       0.030419
mass         0.5253 0.1910 0.006038       0.006038

2. Quantiles for each variable:

               2.5%     25%     50%     75%  97.5%
(Intercept) -3.2469 -2.0173 -1.3249 -0.7434 0.4585
mass         0.1571  0.3912  0.5174  0.6488 0.9140
summary(res_2)

Iterations = 60001:259801
Thinning interval = 200 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean     SD Naive SE Time-series SE
phylo   0.7442 0.5982  0.01892        0.01918
species 0.3773 0.3251  0.01028        0.01103
units   0.7430 0.0000  0.00000        0.00000

2. Quantiles for each variable:

            2.5%    25%    50%    75% 97.5%
phylo   0.016307 0.3010 0.6173 1.0441 2.215
species 0.001679 0.1132 0.3198 0.5479 1.194
units   0.743029 0.7430 0.7430 0.7430 0.743
summary(mod2_brms_logit2)
 Family: bernoulli 
  Links: mu = logit 
Formula: breeding ~ mass + (1 | gr(phylo, cov = A)) + (1 | species) 
   Data: sim_data2 (Number of observations: 500) 
  Draws: 2 chains, each with iter = 8000; warmup = 6000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~phylo (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.80      0.35     0.15     1.53 1.00      418      497

~species (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.55      0.28     0.03     1.09 1.00      414      766

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -1.42      1.01    -3.51     0.45 1.00     2097     1850
mass          0.52      0.19     0.15     0.92 1.00     2664     2631

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
summary(mod2_mcmcglmm_logit3)

 Iterations = 60001:259801
 Thinning interval  = 200
 Sample size  = 1000 

 DIC: 493.3098 

 G-structure:  ~phylo

      post.mean  l-95% CI u-95% CI eff.samp
phylo     1.629 0.0009646    4.062      910

               ~species

        post.mean  l-95% CI u-95% CI eff.samp
species    0.6195 1.258e-06    1.682    756.2

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: breeding ~ mass + sex 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)   -0.8882  -3.4396   1.7246     1000  0.462    
mass           0.7312   0.2433   1.2183     1000  0.002 ** 
sexM          -2.0358  -2.6616  -1.4547     1000 <0.001 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
c2 <- (16 * sqrt(3) / (15 * pi))^2
res_1 <- mod2_mcmcglmm_logit3$Sol / sqrt(1+c2) # for fixed effects
res_2 <- mod2_mcmcglmm_logit3$VCV / (1+c2) # for variance components

summary(res_1)

Iterations = 60001:259801
Thinning interval = 200 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

               Mean     SD Naive SE Time-series SE
(Intercept) -0.7656 1.1431 0.036149       0.036149
mass         0.6303 0.2197 0.006946       0.006946
sexM        -1.7548 0.2780 0.008791       0.008791

2. Quantiles for each variable:

               2.5%     25%     50%      75%  97.5%
(Intercept) -2.9652 -1.4957 -0.7345 -0.08661  1.485
mass         0.2142  0.4818  0.6342  0.77977  1.060
sexM        -2.3071 -1.9429 -1.7496 -1.56044 -1.263
summary(res_2)

Iterations = 60001:259801
Thinning interval = 200 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean     SD Naive SE Time-series SE
phylo   1.2105 0.9233  0.02920        0.03061
species 0.4603 0.3948  0.01249        0.01436
units   0.7430 0.0000  0.00000        0.00000

2. Quantiles for each variable:

            2.5%    25%   50%   75% 97.5%
phylo   0.043912 0.6072 1.031 1.573 3.571
species 0.003595 0.1428 0.374 0.671 1.391
units   0.743029 0.7430 0.743 0.743 0.743
summary(mod2_brms_logit3)
 Family: bernoulli 
  Links: mu = logit 
Formula: breeding ~ mass + sex + (1 | gr(phylo, cov = A)) + (1 | species) 
   Data: sim_data2 (Number of observations: 500) 
  Draws: 2 chains, each with iter = 8000; warmup = 6000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~phylo (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.01      0.40     0.24     1.86 1.00      776     1072

~species (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.60      0.30     0.05     1.18 1.00      670     1537

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -0.79      1.14    -3.12     1.38 1.00     3595     3216
mass          0.62      0.22     0.19     1.06 1.00     4898     3060
sexM         -1.75      0.28    -2.29    -1.23 1.00     6333     3267

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

The coefficient estimates for both mass and sex were close to the values we set :)

Ordinal

The response variable is the aggressiveness level, which has three categories: Low, Medium, and High. It is assumed that phylogeny plays a significant role in influencing aggressiveness. However, non-phylogenetic factors are expected to have only a weak relationship with aggressiveness. The relationship between body mass and aggressiveness is unclear, and it may vary across species. Additionally, males are more likely to exhibit higher levels of aggressiveness compared to females.

set.seed(789)
tree3 <- pbtree(n = 100, scale = 1)

# extract the correlation matrix from the phylogenetic tree
phylo_cor <- vcv(tree3, corr = TRUE)

# define parameters
n_species <- 100  # number of species (clusters)
obs_per_species <- 5  # number of observations per species
n <- n_species * obs_per_species  # Total number of observations

# simulate fixed effects 
## continuous variable: body mass
mu_mass <- (log(4)+ log(2500))/2 # mean of whole 100 species body mass
mu_species_mass <- mvrnorm(n = 1, # species level body mass
                           mu = rep(mu_mass, ncol(phylo_cor)),
                           Sigma = phylo_cor*1) 

obs_mass <- sapply(mu_species_mass, 
                   function(x) rnorm(obs_per_species, mean = x, sd = sqrt(0.2)))  %>%  
                   as.vector() # each observation body mass

## binomial variable: sex 
sex <- rbinom(n, 1, 0.5)

# simulate random effect parameter
## variance components
sigma_phylo <- 1 
sigma_non <- 0.4 # standard deviation of non-phylogenetic random effect

## phylogenetic random effect
random_effect_phylo <- mvrnorm(
  n = 1, 
  mu = rep(0, ncol(phylo_cor)), 
  Sigma = sigma_phylo*phylo_cor)
phylo_effect <- rep(random_effect_phylo, each = obs_per_species)

## non-phylogenetic random effect
random_effect_non_phylo <- rnorm(n_species, sd = sqrt(sigma_non))
non_phylo_effect <- rep(random_effect_non_phylo, each = obs_per_species)

# define fixed effect coefficients
beta_o0 <- -0.5  # intercept
beta_o1 <- 0 # coefficient for mass - controversial whether body mass is related to aggressiveness
beta_o2 <- 1 # coefficient for sex - males are often more aggressive than females 

# simulate the linear predictor
eta <- beta_o0 + beta_o1 * obs_mass + beta_o2 * sex + non_phylo_effect + phylo_effect

# define thresholds (cutpoints) for the ordinal categories
cutpoints <- c(-0.5, 0.5)

# Calculate cumulative probabilities using the probit link function
p1 <- pnorm(cutpoints[1] - eta)
p2 <- pnorm(cutpoints[2] - eta) - p1
p3 <- 1 - pnorm(cutpoints[2] - eta)

# combine probabilities into a matrix
probs <- cbind(p1, p2, p3)

# simulate the ordinal response
aggressive_level <- apply(probs, 1, function(prob) sample(1:3, 1, prob = prob))

# create a data frame
sim_data3 <- data.frame(
    individual_id <- rep(1:n, each = 1),
    phylo = names(phylo_effect),
    species = names(phylo_effect),
    aggressive_level = factor(aggressive_level, 
                              levels = 1:3,
                              labels = c("Low", "Medium", "High"),
                              ordered = TRUE),
    sex = factor(sex, labels = c("F", "M")),
    mass = obs_mass    
  ) %>% 
  arrange(species)

table(sim_data3$aggressive_level)

   Low Medium   High 
    97    143    260 

Run models

# mcmcglmm
inv_phylo <- inverseA(tree3, nodes = "ALL", scale = TRUE)

prior1 <- list(R = list(V = 1, fix = 1), 
               G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10),
                        G2 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 10)
               )
)

system.time(
  mod3_mcmcglmm <- MCMCglmm(aggressive_level ~ 1,
                            random = ~ phylo + species,
                            ginverse = list(phylo = inv_phylo$Ainv),
                            family = "threshold",
                            data = sim_data3,
                            prior = prior1,
                            nitt = 13000*20, 
                            thin = 10*20, 
                            burnin = 3000*20 
  )
)

system.time(
  mod3_mcmcglmm2 <- MCMCglmm(aggressive_level ~ mass,
                            random = ~ phylo + species,
                            ginverse = list(phylo = inv_phylo$Ainv),
                            family = "threshold",
                            data = sim_data3,
                            prior = prior1,
                            nitt = 13000*20, 
                            thin = 10*20, 
                            burnin = 3000*20 
  )
)

system.time(
  mod3_mcmcglmm3 <- MCMCglmm(aggressive_level ~ mass + sex,
                            random = ~ phylo + species,
                            ginverse = list(phylo = inv_phylo$Ainv),
                            family = "threshold",
                            data = sim_data3,
                            prior = prior1,
                            nitt = 13000*20, 
                            thin = 10*20, 
                            burnin = 3000*20 
  )
)

# brms
A <- ape::vcv.phylo(tree3, corr = TRUE)

default_priors <- default_prior(
                    aggressive_level ~ 1 + (1 | gr(phylo, cov = A)) + (1 | species),
                    data = sim_data3,
                    family = cumulative(link = "probit"),
                    data2 = list(A = A)
                    )

system.time(
 mod3_brms <- brm(
            formula = aggressive_level ~ 1 + (1 | gr(phylo, cov = A)) + (1 | species),
            data = sim_data3,
            family = cumulative(link = "probit"),
            data2 = list(A = A),
            prior = default_priors,
            iter = 9000,
            warmup = 7000,
            thin = 1,
            chain = 2,
            control = list(adapt_delta = 0.95),
                )
            )

default_priors2 <- default_prior(
                    aggressive_level ~ mass + (1 | gr(phylo, cov = A)) + (1 | species),
                    data = sim_data3,
                    family = cumulative(link = "probit"),
                    data2 = list(A = A)
                    )

system.time(
 mod3_brms2 <- brm(
            formula = aggressive_level ~ mass + (1 | gr(phylo, cov = A)) + (1 | species),
            data = sim_data3,
            family = cumulative(link = "probit"),
            data2 = list(A = A),
            prior = default_priors2,
            iter = 9000,
            warmup = 7000,
            thin = 1,
            chain = 2,
            control = list(adapt_delta = 0.95),
                )
            )

default_priors3 <- default_prior(
                    aggressive_level ~ mass + sex + (1 | gr(phylo, cov = A)) + (1 | species),
                    data = sim_data3,
                    family = cumulative(link = "probit"),
                    data2 = list(A = A)
                    )

system.time(
 mod3_brms3 <- brm(
            formula = aggressive_level ~ mass + sex + (1 | gr(phylo, cov = A)) + (1 | species),
            data = sim_data3, 
            family = cumulative(link = "probit"),
            data2 = list(A = A),
            prior = default_priors3,
            iter = 9000,
            warmup = 7000,
            thin = 1,
            chain = 2,
            control = list(adapt_delta = 0.95),
                )
            )

Results

The estimates should align with the following values…

mean(eta)
[1] 0.7347869
var(eta)
[1] 1.484684
var(random_effect_phylo)
[1] 0.8430567
var(random_effect_non_phylo)
[1] 0.3352239
summary(mod3_mcmcglmm)

 Iterations = 60001:259801
 Thinning interval  = 200
 Sample size  = 1000 

 DIC: 864.7065 

 G-structure:  ~phylo

      post.mean l-95% CI u-95% CI eff.samp
phylo     1.117   0.2213    2.167     1000

               ~species

        post.mean  l-95% CI u-95% CI eff.samp
species    0.3178 9.188e-07   0.6704     1099

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: aggressive_level ~ 1 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)    1.3987   0.5334   2.1765     1000 <0.001 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Cutpoints: 

                                 post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitaggressive_level.1      1.12   0.9581    1.277     1007
posterior_summary(mod3_mcmcglmm$Sol)
            Estimate Est.Error      Q2.5    Q97.5
(Intercept) 1.398716 0.4201698 0.5994656 2.295179
posterior_summary(mod3_mcmcglmm$VCV)
         Estimate Est.Error       Q2.5     Q97.5
phylo   1.1170856 0.5334151 0.36097048 2.3726763
species 0.3178276 0.1906041 0.02409845 0.7331308
units   1.0000000 0.0000000 1.00000000 1.0000000
posterior_summary(mod3_mcmcglmm$CP)
                                 Estimate  Est.Error      Q2.5    Q97.5
cutpoint.traitaggressive_level.1 1.119532 0.08512686 0.9590519 1.278939
summary(mod3_brms)
 Family: cumulative 
  Links: mu = probit; disc = identity 
Formula: aggressive_level ~ 1 + (1 | gr(phylo, cov = A)) + (1 | species) 
   Data: sim_data3 (Number of observations: 500) 
  Draws: 2 chains, each with iter = 9000; warmup = 7000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~phylo (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.01      0.24     0.58     1.52 1.00      731     1479

~species (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.55      0.17     0.19     0.87 1.00      578      657

Regression Coefficients:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]    -1.32      0.42    -2.16    -0.51 1.00     2085     2250
Intercept[2]    -0.20      0.41    -1.03     0.60 1.00     2074     2333

Further Distributional Parameters:
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

the estimates are consistent with the true values, and both model outputs are as well. Then, phylogenetic signal is…

# MCMCglmm
phylo_signal_mod3_mcmcglmm <- ((mod3_mcmcglmm$VCV[, "phylo"]) / (mod3_mcmcglmm$VCV[, "phylo"] + mod3_mcmcglmm$VCV[, "species"] + 1))
phylo_signal_mod3_mcmcglmm %>% mean()
[1] 0.438553
phylo_signal_mod3_mcmcglmm %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.1911082 0.4385983 0.6808732 
# brms
phylo_signal_mod3_brms <- mod3_brms %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_phylo__Intercept, Sigma_non_phy = sd_species__Intercept) %>% 
        mutate(lambda_ordinal = Sigma_phy^2 / (Sigma_phy^2 + Sigma_non_phy^2 +1)) %>% 
        pull(lambda_ordinal)

phylo_signal_mod3_brms %>% mean()
[1] 0.4288363
phylo_signal_mod3_brms %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.1788238 0.4319170 0.6689190 

Nice - both models yielded nearly identical phylogenetic signals.

summary(mod3_mcmcglmm2)

 Iterations = 60001:259801
 Thinning interval  = 200
 Sample size  = 1000 

 DIC: 862.4181 

 G-structure:  ~phylo

      post.mean l-95% CI u-95% CI eff.samp
phylo     1.208   0.3157    2.251     1000

               ~species

        post.mean  l-95% CI u-95% CI eff.samp
species    0.2989 2.879e-06   0.6329     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: aggressive_level ~ mass 

            post.mean l-95% CI u-95% CI eff.samp pMCMC   
(Intercept)   2.16331  0.84674  3.45252     1000 0.002 **
mass         -0.16112 -0.36658  0.03244     1000 0.126   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Cutpoints: 

                                 post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitaggressive_level.1     1.126   0.9627    1.299     1000
posterior_summary(mod3_mcmcglmm2$Sol)
             Estimate Est.Error       Q2.5      Q97.5
(Intercept)  2.163314 0.6643733  0.8905397 3.49905134
mass        -0.161119 0.1028227 -0.3548954 0.04503046
posterior_summary(mod3_mcmcglmm2$VCV)
         Estimate Est.Error        Q2.5     Q97.5
phylo   1.2078921 0.5324432 0.400101109 2.4861135
species 0.2989292 0.1824707 0.009196968 0.7111517
units   1.0000000 0.0000000 1.000000000 1.0000000
posterior_summary(mod3_mcmcglmm2$CP)
                                 Estimate  Est.Error      Q2.5    Q97.5
cutpoint.traitaggressive_level.1 1.125844 0.08611264 0.9675762 1.303084
summary(mod3_brms2)
 Family: cumulative 
  Links: mu = probit; disc = identity 
Formula: aggressive_level ~ mass + (1 | gr(phylo, cov = A)) + (1 | species) 
   Data: sim_data3 (Number of observations: 500) 
  Draws: 2 chains, each with iter = 9000; warmup = 7000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~phylo (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.08      0.25     0.63     1.63 1.00      797     1655

~species (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.50      0.19     0.08     0.84 1.00      526      647

Regression Coefficients:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]    -2.11      0.66    -3.41    -0.84 1.00     4761     3053
Intercept[2]    -0.99      0.65    -2.28     0.29 1.00     4540     2992
mass            -0.16      0.10    -0.37     0.04 1.00     6574     2840

Further Distributional Parameters:
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
summary(mod3_mcmcglmm3)

 Iterations = 60001:259801
 Thinning interval  = 200
 Sample size  = 1000 

 DIC: 785.7335 

 G-structure:  ~phylo

      post.mean l-95% CI u-95% CI eff.samp
phylo     1.664   0.3929    3.216     1000

               ~species

        post.mean l-95% CI u-95% CI eff.samp
species    0.4061 9.11e-06   0.8454     1000

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

 Location effects: aggressive_level ~ mass + sex 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)    1.6060   0.1500   3.1995     1000  0.028 *  
mass          -0.1199  -0.3519   0.1025     1000  0.308    
sexM           1.0586   0.7945   1.3036     1117 <0.001 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Cutpoints: 

                                 post.mean l-95% CI u-95% CI eff.samp
cutpoint.traitaggressive_level.1     1.277    1.091    1.487     1000
posterior_summary(mod3_mcmcglmm3$Sol)
              Estimate Est.Error        Q2.5     Q97.5
(Intercept)  1.6059858 0.7827238  0.09396171 3.1803784
mass        -0.1198975 0.1181710 -0.35162836 0.1026163
sexM         1.0585665 0.1318889  0.81133692 1.3344116
posterior_summary(mod3_mcmcglmm3$VCV)
        Estimate Est.Error       Q2.5     Q97.5
phylo   1.663718 0.8053196 0.58185142 3.5586348
species 0.406108 0.2440002 0.02279807 0.9607917
units   1.000000 0.0000000 1.00000000 1.0000000
posterior_summary(mod3_mcmcglmm3$CP)
                                 Estimate Est.Error     Q2.5    Q97.5
cutpoint.traitaggressive_level.1 1.277118 0.1012452 1.088106 1.485881
summary(mod3_brms3)
 Family: cumulative 
  Links: mu = probit; disc = identity 
Formula: aggressive_level ~ mass + sex + (1 | gr(phylo, cov = A)) + (1 | species) 
   Data: sim_data3 (Number of observations: 500) 
  Draws: 2 chains, each with iter = 9000; warmup = 7000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~phylo (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.25      0.30     0.72     1.90 1.00      423     1179

~species (Number of levels: 100) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.59      0.21     0.11     0.97 1.00      357      408

Regression Coefficients:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]    -1.58      0.76    -3.08    -0.12 1.00     2467     2888
Intercept[2]    -0.30      0.76    -1.80     1.14 1.00     2420     3074
mass            -0.13      0.11    -0.35     0.09 1.00     3431     3348
sexM             1.05      0.13     0.80     1.31 1.00     5609     3295

Further Distributional Parameters:
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Both MCMCglmm and brms estimated very similar values, and the fixed effect coefficients were close to the values set in the simulation.

Nominal

The response variable is lateralisation (handedness/footedness), categorized as Left, Right, or Both, with “Both” serving as the reference category, representing non-preference. It is assumed that phylogeny is related to lateralisation, with a weak negative correlation between left and right preferences. Non-phylogenetic factors are expected to have only a weak influence on lateralisation. While body mass may have a minor effect, sex would not have significant effect on lateralisation. The majority of individuals tend to show a preference for either the left or right side, with only a few exhibiting no clear preference.

Here, we create a simple dataset and check the results from the intercept-only model, as there are some challenges. The nominal model is more complex, so when we use similar settings to the three models above, the sample size we have set is too small. As a result, the variance and correlation do not align. Therefore, in this case, we do not include the effects of mass or sex when simulating the linear predictor. Additionally, both phylogenetic and non-phylogenetic correlations are set to zero.

set.seed(789)
tree4 <- pbtree(n = 100, scale = 1)

# extract the correlation matrix from the phylogenetic tree
phylo_cor <- vcv(tree4, corr = TRUE)

# define parameters
n_species <- 100  # number of species (clusters)
obs_per_species <- 5  # number of observations per species
n <- n_species * obs_per_species  # total number of observations
n_categories <- 3

# define standard deviations and correlations for the random effects
sigma_phylo <- c(1, 1.5)  # standard deviations for phylogenetic random effects for left and right
cor_phylo <- 0  # correlation between phylogenetic random effects for left and right
sigma_u <- c(1.1, 1)  # standard deviations for non-phylogenetic random effects for left and right
cor_u <- 0  # correlation between non-phylogenetic random effects for left and right

# covariance matrices
cov_u <- diag(sigma_u^2)
cov_u[1, 2] <- cov_u[2, 1] <- cor_u * sigma_u[1] * sigma_u[2]

cov_phylo <- diag(sigma_phylo^2)
cov_phylo[1, 2] <- cov_phylo[2, 1] <- cor_phylo * sigma_phylo[1] * sigma_phylo[2]

# simulate phylogenetic random effects
phylo_random_effects <- mvrnorm(n = 1, mu = rep(0, n_species * 2), Sigma = kronecker(cov_phylo, phylo_cor))
phylo_effect_1 <- rep(phylo_random_effects[1:n_species], each = obs_per_species)
phylo_effect_2 <- rep(phylo_random_effects[(n_species + 1):(2 * n_species)], each = obs_per_species)

# simulate non-phylogenetic random effects
non_phylo_random_effects <- mvrnorm(n_species, mu = c(0, 0), Sigma = cov_u)
non_phylo_effect_1 <- rep(non_phylo_random_effects[, 1], each = obs_per_species)
non_phylo_effect_2 <- rep(non_phylo_random_effects[, 2], each = obs_per_species)

# define fixed effect coefficients for each category relative to the reference category (Both)
beta_m0 <- c(0.4, 0.8) # intercept for left and right 
beta_m1 <- c(0.2, -0.1)   # coefficient for mass for left and right 
beta_m2 <- c(0, 0)   # coefficient for sex for left and right

# simulate the linear predictor for each category relative to the reference
eta <- matrix(0, nrow = n, ncol = 2)
eta[, 1] <- beta_m0[1] + non_phylo_effect_1 + phylo_effect_1
eta[, 2] <- beta_m0[2] + non_phylo_effect_2 + phylo_effect_2

# calculate probabilities using the multinomial logit link function
exp_eta <- exp(cbind(eta, 0))  # adding a column of zeros for the reference category
probs <- exp_eta / rowSums(exp_eta)

# simulate the multinomial response
lateralisation <- apply(probs, 1, function(prob) sample(1:3, 1, prob = prob))

# generate individual ID for each observation
individual_id <- rep(1:n, each = 1)  # create unique individual IDs for each observation

# create a data frame
sim_data4 <- data.frame(
  individual_id = individual_id,
  phylo = rep(tree4$tip.label, each = 5), 
  species = rep(tree4$tip.label, each = 5),
  lateralisation = factor(lateralisation, 
                          levels = 1:n_categories,
                          labels = c("Left", "Right", "Both"),
                          ordered = FALSE),
  sex = factor(sex, labels = c("F", "M")),
  mass = obs_mass 
) %>% 
  arrange(species)

sim_data4$lateralisation <- relevel(sim_data4$lateralisation, ref = "Both")

table(sim_data4$lateralisation)

 Both  Left Right 
   45   208   247 

Run models

inv_phylo <- inverseA(tree4, nodes = "ALL", scale = TRUE)

prior1 <- list(
  R = list(V = (matrix(1, 2, 2) + diag(2)) / 3, fix = 1), 
  G = list(G1 = list(V = diag(2), nu = 2, 
                     alpha.mu = rep(0, 2), alpha.V =  diag(2)
  ),
  G2 = list(V = diag(2), nu = 2, 
            alpha.mu = rep(0, 2), alpha.V =  diag(2)
  )
  )
)

system.time(
  mcmcglmm_mod4 <- MCMCglmm(lateralisation ~ trait - 1, 
                            random = ~us(trait):phylo + us(trait):species,
                            rcov = ~us(trait):units,
                            ginverse = list(phylo = inv_phylo$Ainv),
                            family = "categorical",
                            data = sim_data4,
                            prior = prior1,
                            nitt = 13000*150, 
                            thin = 10*150, 
                            burnin = 3000*150
                            )
)


# brms
A <- ape::vcv.phylo(tree4, corr = TRUE)
priors_brms1 <- default_prior(lateralisation ~ 1 + (1 |a| gr(phylo, cov = A)) + (1 |b| species), 
                            data = sim_data4,
                            data2 = list(A = A),
                            family = categorical(link = "logit")
                            )

system.time(
  brms_mod4 <- brm(lateralisation ~ 1 + (1 |a| gr(phylo, cov = A)) + (1 |b| species),
                data = sim_data4,
                data2 = list(A = A),
                family = categorical(link = "logit"),
                prior = priors_brms1,
                iter = 12000,  
                warmup = 8000,
                chains = 2,
                thin = 1,
                cores = 2,
                control = list(adapt_delta = 0.99),
                )
            )

Results

True values are…

mean(eta[, 1])
[1] 1.498934
mean(eta[, 2])
[1] 1.71958
var(phylo_effect_1)
[1] 0.888621
var(phylo_effect_2)
[1] 1.853071
var(non_phylo_effect_1)
[1] 1.135754
var(non_phylo_effect_2)
[1] 1.184511
cor(phylo_effect_1, phylo_effect_2)
[1] 0.1435537
cor(non_phylo_effect_1, non_phylo_effect_2)
[1] 0.01583177

Before checking the output,we need to correct the results from MCMCglmm using the following equation:

c2 <- (16 * sqrt(3) / (15 * pi))^2
c2a <- c2*(2/3)
summary(mod4_mcmcglmm)

 Iterations = 450001:1948501
 Thinning interval  = 1500
 Sample size  = 1000 

 DIC: 758.4723 

 G-structure:  ~us(trait):phylo

                                                          post.mean   l-95% CI
traitlateralisation.Left:traitlateralisation.Left.phylo      0.5216  2.422e-07
traitlateralisation.Right:traitlateralisation.Left.phylo    -0.4840 -2.272e+00
traitlateralisation.Left:traitlateralisation.Right.phylo    -0.4840 -2.272e+00
traitlateralisation.Right:traitlateralisation.Right.phylo    4.6297  1.388e+00
                                                          u-95% CI eff.samp
traitlateralisation.Left:traitlateralisation.Left.phylo     1.6198     1000
traitlateralisation.Right:traitlateralisation.Left.phylo    0.8416     1000
traitlateralisation.Left:traitlateralisation.Right.phylo    0.8416     1000
traitlateralisation.Right:traitlateralisation.Right.phylo   9.1477     1000

               ~us(trait):species

                                                            post.mean
traitlateralisation.Left:traitlateralisation.Left.species      1.6045
traitlateralisation.Right:traitlateralisation.Left.species    -0.4089
traitlateralisation.Left:traitlateralisation.Right.species    -0.4089
traitlateralisation.Right:traitlateralisation.Right.species    1.0943
                                                              l-95% CI u-95% CI
traitlateralisation.Left:traitlateralisation.Left.species    3.293e-01   3.3113
traitlateralisation.Right:traitlateralisation.Left.species  -1.402e+00   0.5503
traitlateralisation.Left:traitlateralisation.Right.species  -1.402e+00   0.5503
traitlateralisation.Right:traitlateralisation.Right.species  1.113e-06   2.6165
                                                            eff.samp
traitlateralisation.Left:traitlateralisation.Left.species       1000
traitlateralisation.Right:traitlateralisation.Left.species      1000
traitlateralisation.Left:traitlateralisation.Right.species      1000
traitlateralisation.Right:traitlateralisation.Right.species     1294

 R-structure:  ~us(trait):units

                                                          post.mean l-95% CI
traitlateralisation.Left:traitlateralisation.Left.units      0.6667   0.6667
traitlateralisation.Right:traitlateralisation.Left.units     0.3333   0.3333
traitlateralisation.Left:traitlateralisation.Right.units     0.3333   0.3333
traitlateralisation.Right:traitlateralisation.Right.units    0.6667   0.6667
                                                          u-95% CI eff.samp
traitlateralisation.Left:traitlateralisation.Left.units     0.6667        0
traitlateralisation.Right:traitlateralisation.Left.units    0.3333        0
traitlateralisation.Left:traitlateralisation.Right.units    0.3333        0
traitlateralisation.Right:traitlateralisation.Right.units   0.6667        0

 Location effects: lateralisation ~ trait - 1 

                          post.mean l-95% CI u-95% CI eff.samp pMCMC  
traitlateralisation.Left    0.70100 -0.04992  1.49781     1000 0.086 .
traitlateralisation.Right   0.48833 -1.14012  2.22379     1000 0.578  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
res_1 <- mod4_mcmcglmm$Sol / sqrt(1+c2a) # for fixed effects
res_2 <- mod4_mcmcglmm$VCV / (1+c2a) # for variance components
res_3_corr_phylo <- (mod4_mcmcglmm$VCV[, 2]/(1+c2a)) /sqrt((mod4_mcmcglmm$VCV[, 1] * mod4_mcmcglmm$VCV[, 4])/(1+c2a))
res_3_corr_nonphylo <- (mod4_mcmcglmm$VCV[, 6]/(1+c2a)) /sqrt((mod4_mcmcglmm$VCV[, 5] * mod4_mcmcglmm$VCV[, 8])/(1+c2a))

summary(res_1)

Iterations = 450001:1948501
Thinning interval = 1500 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                            Mean     SD Naive SE Time-series SE
traitlateralisation.Left  0.6319 0.3392  0.01073        0.01073
traitlateralisation.Right 0.4402 0.7946  0.02513        0.02513

2. Quantiles for each variable:

                             2.5%      25%    50%    75% 97.5%
traitlateralisation.Left  -0.1209  0.44513 0.6354 0.8306 1.306
traitlateralisation.Right -1.0131 -0.09662 0.4496 0.9362 2.079
summary(res_2)

Iterations = 450001:1948501
Thinning interval = 1500 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                                                               Mean     SD
traitlateralisation.Left:traitlateralisation.Left.phylo      0.4239 0.4635
traitlateralisation.Right:traitlateralisation.Left.phylo    -0.3933 0.6352
traitlateralisation.Left:traitlateralisation.Right.phylo    -0.3933 0.6352
traitlateralisation.Right:traitlateralisation.Right.phylo    3.7622 1.7766
traitlateralisation.Left:traitlateralisation.Left.species    1.3038 0.6761
traitlateralisation.Right:traitlateralisation.Left.species  -0.3323 0.3875
traitlateralisation.Left:traitlateralisation.Right.species  -0.3323 0.3875
traitlateralisation.Right:traitlateralisation.Right.species  0.8892 0.7171
traitlateralisation.Left:traitlateralisation.Left.units      0.5418 0.0000
traitlateralisation.Right:traitlateralisation.Left.units     0.2709 0.0000
traitlateralisation.Left:traitlateralisation.Right.units     0.2709 0.0000
traitlateralisation.Right:traitlateralisation.Right.units    0.5418 0.0000
                                                            Naive SE
traitlateralisation.Left:traitlateralisation.Left.phylo      0.01466
traitlateralisation.Right:traitlateralisation.Left.phylo     0.02009
traitlateralisation.Left:traitlateralisation.Right.phylo     0.02009
traitlateralisation.Right:traitlateralisation.Right.phylo    0.05618
traitlateralisation.Left:traitlateralisation.Left.species    0.02138
traitlateralisation.Right:traitlateralisation.Left.species   0.01226
traitlateralisation.Left:traitlateralisation.Right.species   0.01226
traitlateralisation.Right:traitlateralisation.Right.species  0.02268
traitlateralisation.Left:traitlateralisation.Left.units      0.00000
traitlateralisation.Right:traitlateralisation.Left.units     0.00000
traitlateralisation.Left:traitlateralisation.Right.units     0.00000
traitlateralisation.Right:traitlateralisation.Right.units    0.00000
                                                            Time-series SE
traitlateralisation.Left:traitlateralisation.Left.phylo            0.01466
traitlateralisation.Right:traitlateralisation.Left.phylo           0.02009
traitlateralisation.Left:traitlateralisation.Right.phylo           0.02009
traitlateralisation.Right:traitlateralisation.Right.phylo          0.05618
traitlateralisation.Left:traitlateralisation.Left.species          0.02138
traitlateralisation.Right:traitlateralisation.Left.species         0.01226
traitlateralisation.Left:traitlateralisation.Right.species         0.01226
traitlateralisation.Right:traitlateralisation.Right.species        0.01993
traitlateralisation.Left:traitlateralisation.Left.units            0.00000
traitlateralisation.Right:traitlateralisation.Left.units           0.00000
traitlateralisation.Left:traitlateralisation.Right.units           0.00000
traitlateralisation.Right:traitlateralisation.Right.units          0.00000

2. Quantiles for each variable:

                                                                  2.5%      25%
traitlateralisation.Left:traitlateralisation.Left.phylo      0.0009474  0.09136
traitlateralisation.Right:traitlateralisation.Left.phylo    -1.8018394 -0.71723
traitlateralisation.Left:traitlateralisation.Right.phylo    -1.8018394 -0.71723
traitlateralisation.Right:traitlateralisation.Right.phylo    1.3864388  2.45513
traitlateralisation.Left:traitlateralisation.Left.species    0.3061715  0.82577
traitlateralisation.Right:traitlateralisation.Left.species  -1.1075741 -0.56624
traitlateralisation.Left:traitlateralisation.Right.species  -1.1075741 -0.56624
traitlateralisation.Right:traitlateralisation.Right.species  0.0251312  0.33532
traitlateralisation.Left:traitlateralisation.Left.units      0.5417579  0.54176
traitlateralisation.Right:traitlateralisation.Left.units     0.2708789  0.27088
traitlateralisation.Left:traitlateralisation.Right.units     0.2708789  0.27088
traitlateralisation.Right:traitlateralisation.Right.units    0.5417579  0.54176
                                                                50%       75%
traitlateralisation.Left:traitlateralisation.Left.phylo      0.2797  0.606425
traitlateralisation.Right:traitlateralisation.Left.phylo    -0.3356 -0.003957
traitlateralisation.Left:traitlateralisation.Right.phylo    -0.3356 -0.003957
traitlateralisation.Right:traitlateralisation.Right.phylo    3.4326  4.670559
traitlateralisation.Left:traitlateralisation.Left.species    1.1969  1.673444
traitlateralisation.Right:traitlateralisation.Left.species  -0.3359 -0.104277
traitlateralisation.Left:traitlateralisation.Right.species  -0.3359 -0.104277
traitlateralisation.Right:traitlateralisation.Right.species  0.7494  1.269344
traitlateralisation.Left:traitlateralisation.Left.units      0.5418  0.541758
traitlateralisation.Right:traitlateralisation.Left.units     0.2709  0.270879
traitlateralisation.Left:traitlateralisation.Right.units     0.2709  0.270879
traitlateralisation.Right:traitlateralisation.Right.units    0.5418  0.541758
                                                             97.5%
traitlateralisation.Left:traitlateralisation.Left.phylo     1.6498
traitlateralisation.Right:traitlateralisation.Left.phylo    0.8138
traitlateralisation.Left:traitlateralisation.Right.phylo    0.8138
traitlateralisation.Right:traitlateralisation.Right.phylo   8.0450
traitlateralisation.Left:traitlateralisation.Left.species   2.8234
traitlateralisation.Right:traitlateralisation.Left.species  0.4842
traitlateralisation.Left:traitlateralisation.Right.species  0.4842
traitlateralisation.Right:traitlateralisation.Right.species 2.6216
traitlateralisation.Left:traitlateralisation.Left.units     0.5418
traitlateralisation.Right:traitlateralisation.Left.units    0.2709
traitlateralisation.Left:traitlateralisation.Right.units    0.2709
traitlateralisation.Right:traitlateralisation.Right.units   0.5418
summary(res_3_corr_phylo)

Iterations = 450001:1948501
Thinning interval = 1500 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
      -0.32095        0.43943        0.01390        0.01465 

2. Quantiles for each variable:

    2.5%      25%      50%      75%    97.5% 
-0.86925 -0.68977 -0.42416 -0.02273  0.67656 
summary(res_3_corr_nonphylo)

Iterations = 450001:1948501
Thinning interval = 1500 
Number of chains = 1 
Sample size per chain = 1000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE 
      -0.35545        0.34670        0.01096        0.01096 

2. Quantiles for each variable:

   2.5%     25%     50%     75%   97.5% 
-0.8562 -0.6255 -0.4143 -0.1322  0.4288 
summary(mod4_brms)
 Family: categorical 
  Links: muLeft = logit; muRight = logit 
Formula: lateralisation ~ 1 + (1 | a | gr(phylo, cov = A)) + (1 | b | species) 
   Data: sim_data4 (Number of observations: 500) 
  Draws: 2 chains, each with iter = 12000; warmup = 8000; thin = 1;
         total post-warmup draws = 8000

Multilevel Hyperparameters:
~phylo (Number of levels: 100) 
                                        Estimate Est.Error l-95% CI u-95% CI
sd(muLeft_Intercept)                        0.60      0.37     0.03     1.41
sd(muRight_Intercept)                       1.99      0.46     1.21     3.02
cor(muLeft_Intercept,muRight_Intercept)    -0.28      0.46    -0.96     0.73
                                        Rhat Bulk_ESS Tail_ESS
sd(muLeft_Intercept)                    1.00     1258     2496
sd(muRight_Intercept)                   1.00     2445     4477
cor(muLeft_Intercept,muRight_Intercept) 1.00      480      746

~species (Number of levels: 100) 
                                        Estimate Est.Error l-95% CI u-95% CI
sd(muLeft_Intercept)                        1.16      0.31     0.56     1.79
sd(muRight_Intercept)                       0.94      0.41     0.14     1.76
cor(muLeft_Intercept,muRight_Intercept)    -0.35      0.39    -0.97     0.45
                                        Rhat Bulk_ESS Tail_ESS
sd(muLeft_Intercept)                    1.00     2264     2590
sd(muRight_Intercept)                   1.00     1100     1616
cor(muLeft_Intercept,muRight_Intercept) 1.00     1699     2808

Regression Coefficients:
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
muLeft_Intercept      0.63      0.35    -0.11     1.32 1.00     3793     4140
muRight_Intercept     0.37      0.79    -1.18     1.89 1.00     3953     4832

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Please look at the correlation part - estimating random effect covariance is harder than estimating the random effects variance(s) - you now understand there is a different difficulty in estimating parameters. The 95% CIs are a quite large, they have very large uncertainties and with such large uncertainties, we do not expect they match…

fixed effects (beta) < random effects variance(s) <<< random effects covariance (correlation)

And phylogenetic signal is…

# average phylogenteic signal - both vs. left
## MCMCglmm
phylo_signalL_mod4_mcmcglmm <- ((mod4_mcmcglmm$VCV[, "traitlateralisation.Left:traitlateralisation.Left.phylo"]/(1+c2a)) / (mod4_mcmcglmm$VCV[, "traitlateralisation.Left:traitlateralisation.Left.phylo"]/(1+c2a) + mod4_mcmcglmm$VCV[, "traitlateralisation.Left:traitlateralisation.Left.species"]/(1+c2a) + 1))
phylo_signalL_mod4_mcmcglmm %>% mean()
[1] 0.1452383
phylo_signalL_mod4_mcmcglmm %>% quantile(probs = c(0.025, 0.5, 0.975))
       2.5%         50%       97.5% 
0.000305831 0.112652234 0.482842085 
## brms
phylo_signalL_mod4_brms <- mod4_brms %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_phylo__muLeft_Intercept, Sigma_non_phy = sd_species__muLeft_Intercept) %>% 
        mutate(lambda_nominalL = Sigma_phy^2 / (Sigma_phy^2 + Sigma_non_phy^2 + 1)) %>% 
        pull(lambda_nominalL)

phylo_signalL_mod4_brms %>% mean()
[1] 0.1557183
phylo_signalL_mod4_brms %>% quantile(probs = c(0.025,0.5,0.975))
        2.5%          50%        97.5% 
0.0004165978 0.1182931266 0.5124994918 
# average phylogenteic signal - both vs. right
## MCMCglmm
phylo_signalR_mod4_mcmcglmm <- ((mod4_mcmcglmm$VCV[, "traitlateralisation.Right:traitlateralisation.Right.phylo"]/(1+c2a)) / (mod4_mcmcglmm$VCV[, "traitlateralisation.Right:traitlateralisation.Right.phylo"]/(1+c2a) + mod4_mcmcglmm$VCV[, "traitlateralisation.Right:traitlateralisation.Right.species"]/(1+c2a) + 1))
phylo_signalR_mod4_mcmcglmm %>% mean()
[1] 0.6446448
phylo_signalR_mod4_mcmcglmm %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.3433895 0.6569055 0.8555424 
## brms
phylo_signalR_mod4_brms <- mod4_brms %>% as_tibble() %>% 
        dplyr::select(Sigma_phy = sd_phylo__muRight_Intercept, Sigma_non_phy = sd_species__muRight_Intercept) %>% 
        mutate(lambda_nominalR = Sigma_phy^2 / (Sigma_phy^2 + Sigma_non_phy^2 + 1)) %>% 
        pull(lambda_nominalR)

phylo_signalR_mod4_brms %>% mean()
[1] 0.6524086
phylo_signalR_mod4_brms %>% quantile(probs = c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
0.3728231 0.6642912 0.8648875