This is part III of the quantile regression series. Parts I and II covered Bayesian quantile regression using the asymmetric Laplace distribution, its augmented scheme, and the score method. In this post I extend the score approach to multiple quantiles, following Wu and Narisetty (2021), as a follow-up to part II.
Multiple quantile regression with score
The score method incorporates the regressors into the objective function through the score \[ s_\tau(\beta) = \sum_{i=1}^{n} x_i \psi_\tau(y_i - x_i^{\top}\beta), \] where \(\psi_\tau(u) = \tau - I(u < 0)\).
Wu and Narisetty (2021) consider the following “working” likelihood \[ \mathcal{L}(y \mid x,\, \beta) = C \exp\bigg(-\frac{1}{2n} s_\tau(\beta)^{\top} W s_\tau(\beta) \bigg), \] where \(W\) is a \(p \times p\) positive definite weight matrix and \(C\) is a constant free of \(\beta\). Because the exponent is a negative definite quadratic form in the score, the working likelihood is largest where the score is (approximately) zero, which is exactly the usual frequentist quantile regression estimate.
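As a concrete illustration of the single-quantile case, the score and the log of the working likelihood (dropping the constant \(C\)) can be written in a few lines of R. This is a sketch with my own function names, not code from the paper.
psi <- function(u, tau) tau - (u < 0)
# s_tau(beta): p-vector score at a candidate beta
score_tau <- function(beta, tau, y, X) crossprod(X, psi(y - X %*% beta, tau))
# log working likelihood up to a constant, for a given p x p weight matrix W
work_loglik <- function(beta, tau, y, X, W) {
  s <- score_tau(beta, tau, y, X)
  drop(-t(s) %*% W %*% s / (2 * nrow(X)))
}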
For multiple quantiles \(\tau_1, \dots, \tau_K\), the per-quantile scores are stacked into a single vector of length \(Kp\), and the corresponding \(Kp \times Kp\) weight matrix is given as
\[ W = (Q \otimes G )^{-1}, \text{ where } Q = (\min(\tau_i, \tau_j) - \tau_i \tau_j)_{ij}, \; G = \frac{1}{n}\sum_{i=1}^n x_i x_i^{\top}. \]
A nice property of \(W\) is that it depends only on the data and the chosen quantiles, so it can be constructed once, outside of the MCMC iterations.
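Concretely, \(Q\), \(G\), and \(W\) can be assembled in R directly from the design matrix and the quantile vector; a minimal sketch of the formula above (function names are mine):
make_Q <- function(tau) outer(tau, tau, pmin) - outer(tau, tau)  # min(tau_i, tau_j) - tau_i * tau_j
make_G <- function(X) crossprod(X) / nrow(X)                     # (1/n) X'X
make_W <- function(X, tau) solve(kronecker(make_Q(tau), make_G(X)))
The Stan function make_w below returns the Kronecker factors in the opposite order (\(G^{-1} \otimes Q^{-1}\)), which is the same matrix up to a permutation and lines up with how the score() function stacks the score vector.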
Score function
The following Stan model implements the working likelihood for multiple quantiles \(\tau\).
Show the code
functions {
  // Kronecker product of A and B
  matrix kronecker(matrix A, matrix B) {
    matrix[rows(A) * rows(B), cols(A) * cols(B)] C;
    int m = rows(A);
    int n = cols(A);
    int p = rows(B);
    int q = cols(B);
    for (i in 1:m) {
      for (j in 1:n) {
        int row_start = (i - 1) * p + 1;
        int row_end = (i - 1) * p + p;
        int col_start = (j - 1) * q + 1;
        int col_end = (j - 1) * q + q;
        C[row_start:row_end, col_start:col_end] = A[i, j] * B;
      }
    }
    return C;
  }
  // |u| + (2q - 1) * u, i.e. twice the check loss, applied elementwise
  vector q_loss(real q, vector u) {
    return (abs(u) + (2 * q - 1) * u);
  }
  // score contributions stacked across the K quantiles into a K*P vector
  vector score(vector q, vector y, matrix x, array[] vector beta) {
    int N = num_elements(y);
    int P = num_elements(beta[, 1]);
    int K = num_elements(q);
    matrix[K, P] out;
    for (k in 1:K) {
      out[k] = transpose(transpose(x) * q_loss(q[k], y - x * to_vector(beta[, k])));
    }
    return to_vector(out);
  }
  // weight matrix W; depends only on the design matrix and the quantiles
  matrix make_w(matrix x, vector q) {
    int N = rows(x);
    int P = cols(x);
    int m = num_elements(q);
    matrix[m * P, m * P] out;
    matrix[m, m] Q;
    matrix[P, P] G = crossprod(x);
    // G[1:P, 1] /= log(N);
    // G[1, 2:P] = transpose(G[2:P, 1]);
    for (i in 1:m) {
      Q[i, i] = q[i] * (1 - q[i]);
      for (j in 1:i - 1) {
        Q[i, j] = min([q[i], q[j]]) - q[i] * q[j];
        Q[j, i] = Q[i, j];
      }
    }
    return kronecker(N * inverse(G), inverse(Q));
  }
}
data {
  int N; // Number of observations
  int P; // Number of predictors
  int K; // Number of quantiles
  vector[K] q;
  vector[N] y;
  matrix[N, P] x;
}
transformed data {
  matrix[K * P, K * P] W = make_w(x, q);
}
parameters {
  // can add ordered constraint here
  array[P] vector[K] beta;
  real<lower=0> sigma;
}
model {
  vector[K * P] score_vec = score(q, y, x, beta) / sigma;
  for (i in 1:K)
    beta[, i] ~ normal(0, 4);
  sigma ~ exponential(1);
  // log of the working likelihood, with sigma acting as a scale on the score
  target += -quad_form(W, score_vec) / (2 * N * K) - K * N * log(sigma);
}
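The sampling calls below assume the Stan program above has already been compiled into the object multiple_score_qr, presumably with cmdstanr; a minimal sketch (the file name is hypothetical):
library(cmdstanr)
# compile the Stan program above; the file name is an assumption
multiple_score_qr <- cmdstan_model("multiple_score_qr.stan")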
I’ll simulate the same data as in part II:
Show the code
set.seed(12312)
library(quantreg)

N <- 1000
x <- runif(N, max = 10)
alpha <- -1
beta <- 2
y <- alpha + beta * x + rnorm(N, sd = .5 * x)
q <- c(0.05, 0.5, 0.95)

# frequentist estimate
out_freq <- quantreg::rq(y ~ x, tau = q)

out_score <- multiple_score_qr$sample(
  data = list(N = N,
              P = 2,
              K = length(q),
              q = q,
              y = y,
              x = as.matrix(data.frame(alpha = 1, x = x))),
  seed = 12123123,
  parallel_chains = 4,
  iter_warmup = 500,
  iter_sampling = 500,
  refresh = 0,
  show_messages = FALSE
)
Running MCMC with 4 parallel chains...
Chain 2 finished in 1.8 seconds.
Chain 1 finished in 2.0 seconds.
Chain 3 finished in 1.9 seconds.
Chain 4 finished in 2.1 seconds.
All 4 chains finished successfully.
Mean chain execution time: 2.0 seconds.
Total execution time: 2.3 seconds.
Show the code
out_score$summary()
# A tibble: 8 × 10
variable mean median sd mad q5 q95 rhat ess_bulk
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 lp__ -3083. -3083. 1.81 1.73 -3087. -3081. 1.00 746.
2 beta[1,1] -1.40 -1.39 0.274 0.286 -1.85 -0.969 1.00 872.
3 beta[2,1] 1.30 1.30 0.0430 0.0437 1.23 1.37 1.00 905.
4 beta[1,2] -0.867 -0.864 0.124 0.124 -1.07 -0.667 1.00 971.
5 beta[2,2] 1.97 1.97 0.0206 0.0196 1.94 2.00 1.00 1000.
6 beta[1,3] -0.905 -0.936 0.218 0.207 -1.21 -0.503 1.00 971.
7 beta[2,3] 2.83 2.84 0.0456 0.0447 2.75 2.90 1.01 944.
8 sigma 1.69 1.69 0.0229 0.0239 1.66 1.73 1.00 1558.
# … with 1 more variable: ess_tail <dbl>
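For a quick check against the frequentist fit, the posterior means can be arranged next to the quantreg coefficients; a sketch, assuming the beta[predictor, quantile] ordering shown in the summary above:
# quantreg coefficients: rows (intercept, x), one column per tau
coef(out_freq)
# posterior means arranged the same way
matrix(out_score$summary("beta")$mean, nrow = 2,
       dimnames = list(c("alpha", "x"), paste0("tau=", q)))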
Test on ImmunogG data
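The ImmunogG data from the Brq package contain the IgG concentration and the child's age used as the response and predictor below; a quick look (a sketch, not output from the original post):
library(Brq)
data("ImmunogG")
head(ImmunogG)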
Show the code
library(Brq)
data("ImmunogG")

dat <- data.frame(y = ImmunogG$IgG,
                  alpha = 1,
                  x = ImmunogG$Age,
                  xsq = ImmunogG$Age^2)

out_score_mod <- multiple_score_qr$sample(
  data = list(N = nrow(ImmunogG),
              P = 3,
              K = 3,
              q = c(0.05, 0.5, 0.95),
              y = ImmunogG$IgG,
              x = as.matrix(dat[, 2:4])),
  seed = 12123123,
  parallel_chains = 4,
  max_treedepth = 12,
  iter_warmup = 500,
  iter_sampling = 500,
  refresh = 0,
  show_messages = FALSE
)
Running MCMC with 4 parallel chains...
Chain 1 finished in 5.8 seconds.
Chain 2 finished in 6.3 seconds.
Chain 4 finished in 6.9 seconds.
Chain 3 finished in 7.1 seconds.
All 4 chains finished successfully.
Mean chain execution time: 6.5 seconds.
Total execution time: 7.2 seconds.
Show the code
out_freq_1 <- rq(y ~ x + xsq, data = dat, tau = 0.05)
out_freq_2 <- rq(y ~ x + xsq, data = dat, tau = 0.5)
out_freq_3 <- rq(y ~ x + xsq, data = dat, tau = 0.95)
Results
Modified score:
Show the code
out_score_mod$summary()
# A tibble: 11 × 10
variable mean median sd mad q5 q95 rhat ess_bulk
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 lp__ -605. -605. 2.25 2.11 -609. -602. 1.00 749.
2 beta[1,1] 0.648 0.682 0.443 0.441 -0.116 1.33 1.00 770.
3 beta[2,1] 1.20 1.20 0.387 0.376 0.545 1.85 1.01 695.
4 beta[3,1] -0.138 -0.139 0.0648 0.0620 -0.247 -0.0296 1.01 727.
5 beta[1,2] 2.69 2.68 0.316 0.319 2.18 3.22 1.01 677.
6 beta[2,2] 1.20 1.21 0.238 0.235 0.810 1.58 1.01 652.
7 beta[3,2] -0.0773 -0.0784 0.0374 0.0374 -0.137 -0.0137 1.01 703.
8 beta[1,3] 7.12 7.12 0.609 0.620 6.16 8.18 1.00 791.
9 beta[2,3] -0.385 -0.393 0.494 0.505 -1.19 0.416 1.01 707.
10 beta[3,3] 0.237 0.239 0.0788 0.0785 0.105 0.366 1.01 705.
11 sigma 1.19 1.19 0.0270 0.0278 1.15 1.23 1.00 1373.
# … with 1 more variable: ess_tail <dbl>
Quantreg:
Show the code
summary(out_freq_1, se = "iid")
Call: rq(formula = y ~ x + xsq, tau = 0.05, data = dat)
tau: [1] 0.05
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 0.76514 0.30477 2.51058 0.01259
x 1.19143 0.24792 4.80577 0.00000
xsq -0.13371 0.04037 -3.31187 0.00104
Show the code
summary(out_freq_2, se = "iid")
Call: rq(formula = y ~ x + xsq, tau = 0.5, data = dat)
tau: [1] 0.5
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 2.80112 0.59354 4.71936 0.00000
x 1.15865 0.48282 2.39976 0.01703
xsq -0.07523 0.07863 -0.95677 0.33947
Show the code
summary(out_freq_3, se = "iid")
Call: rq(formula = y ~ x + xsq, tau = 0.95, data = dat)
tau: [1] 0.95
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 7.23563 0.67142 10.77659 0.00000
x -0.48165 0.54617 -0.88186 0.37857
xsq 0.24602 0.08895 2.76597 0.00603
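To put the two approaches side by side for the ImmunogG fit, the posterior means can again be arranged as a predictor-by-quantile table; a sketch, assuming the beta[predictor, quantile] ordering shown in the summary above:
# quantreg point estimates, one column per tau
cbind("0.05" = coef(out_freq_1), "0.5" = coef(out_freq_2), "0.95" = coef(out_freq_3))
# posterior means from the score model, rows = (intercept, x, xsq)
matrix(out_score_mod$summary("beta")$mean, nrow = 3,
       dimnames = list(c("alpha", "x", "xsq"), c("0.05", "0.5", "0.95")))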