Hi everyone,

I am working on a project where I track individuals "before", "during" and "after" they take a certain action (i.e., **phase**). The **year** when they take this action is also recorded. The categories of **phase** therefore have the following meanings: "before" means "the year before they took the action", "during" means "the year when they took the action" and "after" means "the year after they took the action".

In my modelling, **phase** and **year** are predictor variables and the response variable is the count of votes cast by these individuals throughout the year in question, out of a total number of votes. The model includes a random effect for individual.

In a first stage, I fit 4 different Bayesian models to the data using just **phase** as a predictor of the vote count (out of the total); each of these models uses the same model formula but a different family: binomial, beta-binomial, zero-inflated binomial and zero-inflated beta-binomial. All models are fitted with the *brm()* function from the *brms* package in R, using default priors.

**Question 1:** From a Bayesian perspective, is it appropriate to compute posterior model weights for these 4 models given they use different families?

**Question 2:** Given that I am interested in characterizing the effect of phase (and hence not in predicting from the model), what type of model weights would be most appropriate to use? (I have used the *brms* function *post_prob()* to compute posterior model probabilities from marginal likelihoods, though I have read that these probabilities are sensitive to the choice of priors; by default, this function assumes the models are equally likely *a priori*.)
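To make clear what I mean by posterior model probabilities under equal prior model probabilities, here is my understanding of the arithmetic as a minimal Python sketch. This is not *brms* code: the function name `posterior_model_probs` is my own, and the log marginal likelihood values are made up purely for illustration.

```python
import math

def posterior_model_probs(log_marglik, prior=None):
    """Turn log marginal likelihoods into posterior model probabilities.

    If no prior model probabilities are given, all models are assumed
    equally likely a priori.
    """
    n = len(log_marglik)
    if prior is None:
        prior = [1.0 / n] * n
    # Work on the log scale; subtract the maximum for numerical stability.
    log_post = [lm + math.log(p) for lm, p in zip(log_marglik, prior)]
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical log marginal likelihoods, one per family (illustrative only).
log_ml = {
    "binomial": -1250.4,
    "beta-binomial": -1190.2,
    "zero-inflated binomial": -1231.7,
    "zero-inflated beta-binomial": -1188.0,
}
probs = posterior_model_probs(list(log_ml.values()))
for name, p in zip(log_ml, probs):
    print(f"{name}: {p:.3f}")
```

On this scale, a difference of about 2.2 in log marginal likelihood between two models already corresponds to a 9:1 ratio in their posterior probabilities, which is part of why I am concerned about sensitivity to the priors.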

**Question 3:** If one of the models receives most of the weight (e.g., its weight is something like 0.9), does it still make sense to average over all the models, or is it ok to retain just this dominating model for further inference?

In a second stage, I fit 4 * 3 = 12 different Bayesian models to the data, consisting of 3 sets of models. The first set of 4 models uses **year** on its own as a predictor, with each of the 4 families listed above in turn. The second set of 4 models uses **year** and **phase** as predictors, but not their interaction, with each family in turn. The third set of 4 models uses **year**, **phase** and their interaction **year:phase** as predictors, with each family in turn. All 12 models are fitted with the *brm()* function from the *brms* package in R, using default priors. The questions below mirror the questions above, except that now there is the added complication that not just the families but also the predictors included in the model change across candidate models.
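To make the candidate set concrete, the 12 combinations can be enumerated as in the Python sketch below. It only builds the formula/family pairs and does not fit anything. The variable names `votes`, `total` and `id` are hypothetical placeholders for my actual column names; the formula strings follow *brms* syntax, where `year * phase` expands to `year + phase + year:phase`.

```python
from itertools import product

# The four response families, written with their brms family names.
families = [
    "binomial",
    "beta_binomial",
    "zero_inflated_binomial",
    "zero_inflated_beta_binomial",
]

# The three fixed-effect structures; "year * phase" includes the interaction.
fixed_effects = ["year", "year + phase", "year * phase"]

# Hypothetical column names: votes (count), total (trials), id (individual).
candidates = [
    {"formula": f"votes | trials(total) ~ {fx} + (1 | id)", "family": fam}
    for fx, fam in product(fixed_effects, families)
]

for i, c in enumerate(candidates, start=1):
    print(i, c["family"], "::", c["formula"])
```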

**Question 4:** From a Bayesian perspective, is it appropriate to compute posterior model weights for these 12 models given they use different families and different sets of predictors? (Again, here I used *post_prob()* from *brms*.)

**Question 5:** If one of the models receives most of the weight (e.g., its weight is something like 0.9), does it still make sense to average over all the models, or is it ok to retain just this dominating model for further inference?

Any comments or answers would be appreciated - I would like to make sure I am not doing something totally nonsensical.

Many thanks.

Isabella

Email: isabella@ghement.ca

Isabella Ghement
Ghement Statistical Consulting Company Ltd.