r - How to represent a categorical predictor in rstan?
What is the proper way to format a categorical predictor for use in Stan? I cannot seem to input a categorical predictor as a normal factor variable, so what is the quickest way to transform a normal categorical variable such that Stan can accept it?
For example, say I had a continuous predictor and a categorical predictor like these:
     income      country
1  62085.59      england
2  60806.33      england
3  60527.27      england
4  67112.64          usa
5  57675.92          usa
6  58128.44          usa
7  60822.47 south africa
8  55805.80 south africa
9  63982.99 south africa
10 64555.45      belgium
How do I prepare this to be entered in rstan?
It is correct that Stan only inputs real or integer variables. In this case, you want to convert the categorical predictor into dummy variables (perhaps excluding a reference category). In R, you can do something like
dummy_variables <- model.matrix(~ country, data = your_dataset)
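As a rough illustration of what that call produces, here is a minimal sketch using the ten rows from the question. Storing `country` as a factor is an assumption, and with R's default treatment contrasts the reference category is whichever level sorts first (here `belgium`).

```r
# Example data taken from the question
your_dataset <- data.frame(
  income = c(62085.59, 60806.33, 60527.27, 67112.64, 57675.92,
             58128.44, 60822.47, 55805.80, 63982.99, 64555.45),
  country = factor(c("england", "england", "england",
                     "usa", "usa", "usa",
                     "south africa", "south africa", "south africa",
                     "belgium"))
)

# One intercept column plus one 0/1 column per non-reference level
dummy_variables <- model.matrix(~ country, data = your_dataset)

colnames(dummy_variables)  # intercept, then one column per non-reference country
dim(dummy_variables)       # 10 rows, 4 columns
```

If the Stan program declares its own intercept, the first column can be dropped with `dummy_variables[, -1]`.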
However, the dummy-variable matrix might not come out with the right number of observations if you have unmodeled missingness on other variables. This approach can be taken a step farther by inputting the entire model formula like
x <- model.matrix(outcome ~ predictor1 + predictor2 ..., data = your_dataset)
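To make the point about missingness concrete, the sketch below assumes a hypothetical data frame `d` with one missing `income` value. Under R's default `na.action` (`na.omit`), rows with a missing value in any variable of the formula are dropped, so a matrix built from `country` alone keeps every row while one built from the full formula does not.

```r
# Hypothetical data: six rows, one missing income value
d <- data.frame(
  outcome = c(1.2, 0.7, 1.9, 2.4, 0.3, 1.1),
  income  = c(62085.59, NA, 60527.27, 67112.64, 57675.92, 58128.44),
  country = factor(c("england", "england", "england", "usa", "usa", "usa"))
)

# Built from country only: no rows are dropped
dummies_only <- model.matrix(~ country, data = d)

# Built from the whole formula: the incomplete row is dropped
mf <- model.frame(outcome ~ income + country, data = d)
x  <- model.matrix(outcome ~ income + country, data = mf)
y  <- model.response(mf)  # outcome restricted to the same rows as x

nrow(dummies_only)  # 6
nrow(x)             # 5
length(y)           # 5
```

Taking `x` and `y` from the same model frame keeps the two aligned, which is what the Stan program needs.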
Now you have an entire design matrix of predictors that you can use in a .stan program with linear algebra, such as
data {
  int<lower=1> n;
  int<lower=1> k;
  matrix[n,k] x;
  vector[n] y;
}
parameters {
  vector[k] beta;
  real<lower=0> sigma;
}
model {
  y ~ normal(x * beta, sigma); // likelihood
  // priors
}
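One way to feed such a design matrix to the program above is through a named list whose elements match the names in the `data` block. This is only a sketch: the simulated `outcome`, the seed, and the file name `model.stan` are hypothetical, and the fitting call is left commented out because it needs rstan installed and the Stan program saved to disk.

```r
# Predictors from the question, plus a simulated (hypothetical) outcome
set.seed(1)
your_dataset <- data.frame(
  income = c(62085.59, 60806.33, 60527.27, 67112.64, 57675.92,
             58128.44, 60822.47, 55805.80, 63982.99, 64555.45),
  country = factor(c("england", "england", "england",
                     "usa", "usa", "usa",
                     "south africa", "south africa", "south africa",
                     "belgium"))
)
your_dataset$outcome <- rnorm(nrow(your_dataset))

# Design matrix: intercept, income, and three country dummies
x <- model.matrix(outcome ~ income + country, data = your_dataset)

# Element names must match the declarations in the Stan data block
stan_data <- list(
  n = nrow(x),
  k = ncol(x),
  x = x,
  y = your_dataset$outcome
)

# Hypothetical file name; requires the rstan package
# fit <- rstan::stan(file = "model.stan", data = stan_data)
```

Because `model.matrix` already includes an intercept column, the first element of `beta` plays the role of the intercept in this setup.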
Utilizing a design matrix is recommended because it makes the .stan program reusable with different variations of the same model or even different datasets.