This is a generic summary function that describes the output of the function hdpGLM.
# S3 method for hdpGLM
summary(object, ...)
object: an object of the class hdpGLM generated by the function hdpGLM
Additional arguments accepted are:
true.beta: a data.frame with the true values of the linear coefficients beta, if they are known. The data.frame must contain a column named j with the index of the context associated with each linear coefficient beta; these indexes must match those used in the data set for each context. It must also contain a column named k indicating the cluster of beta, a column named Parameter with the names of the linear coefficients (beta1, beta2, ..., beta_dx, where dx is the number of covariates at the individual level and beta1 is the coefficient of the intercept term), and a column named True with the true values of the betas. Finally, the data.frame must contain columns with the context-level covariates as used in the estimation by the hdpGLM function (see Details below).
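As an illustration of the layout described above, the following sketch builds a true.beta data.frame with base R only. The number of contexts, clusters, and coefficients, the true values, and the context-level covariate name W1 are all hypothetical; in practice they must agree with the data used in the estimation.

```r
# Hypothetical setup: 2 contexts, 2 clusters, 3 linear coefficients
# (beta1 is the intercept term).
true.beta <- expand.grid(
  j = 1:2,                          # context index
  k = 1:2,                          # cluster index
  Parameter = paste0("beta", 1:3),  # coefficient names
  stringsAsFactors = FALSE
)

# Made-up true values, one per row (for illustration only).
true.beta$True <- seq_len(nrow(true.beta)) / 10

# Context-level covariate (the name W1 is an assumption); it is constant
# within each context, so it is attached by merging on j.
context.values <- data.frame(j = 1:2, W1 = c(-0.5, 1.2))
true.beta <- merge(true.beta, context.values, by = "j")

# Assuming 'samples' is an object returned by hdpGLM:
# summary(samples, true.beta = true.beta)
```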
true.tau: a data.frame with four columns. The first must be named w and indicates the index of each context-level covariate, starting with 0 for the intercept term. The second, named beta, must contain the indexes of the betas of the individual-level covariates, starting with 0 for the intercept term. The third, named Parameter, must contain the parameter names in the form tau<w><beta>, where w and beta are the actual values displayed in the columns w and beta. The fourth must be named True and contain the true value of the parameter.
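A minimal sketch of a true.tau data.frame follows, assuming one context-level covariate plus intercept (w = 0, 1) and two individual-level covariates plus intercept (beta = 0, 1, 2). The Parameter names apply the tau<w><beta> pattern literally, with no separator; the true values are made up for illustration.

```r
# All combinations of context-level index w and individual-level index beta.
true.tau <- expand.grid(w = 0:1, beta = 0:2)

# Names built from the values in the columns w and beta, e.g. "tau01".
true.tau$Parameter <- paste0("tau", true.tau$w, true.tau$beta)

# Made-up true values (for illustration only).
true.tau$True <- c(0.5, -1.0, 2.0, 0.3, -0.7, 1.5)

# Assuming 'samples' is an object returned by hdpGLM:
# summary(samples, true.tau = true.tau)
```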
The function returns a list with two data.frames. The first summarizes the posterior distribution of the linear coefficients beta: the mean, median, and 95% HPD interval are provided. The second data.frame contains the summary of the posterior distribution of the parameter tau.
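To make the reported quantities concrete, the sketch below computes a mean, a median, and a 95% HPD interval from a vector of posterior draws in base R. It is not the package's internal code: the helper hpd_interval is hypothetical and uses one common definition of the HPD interval, namely the shortest interval containing the requested share of the sorted draws. The draws are simulated for illustration.

```r
# Hypothetical helper: shortest interval containing a share 'prob' of the draws.
hpd_interval <- function(draws, prob = 0.95) {
  sorted <- sort(draws)
  n <- length(sorted)
  m <- ceiling(prob * n)               # number of draws the interval must cover
  starts <- seq_len(n - m + 1)         # candidate left endpoints
  widths <- sorted[starts + m - 1] - sorted[starts]
  i <- which.min(widths)               # narrowest candidate interval
  c(lower = sorted[i], upper = sorted[i + m - 1])
}

# Simulated posterior draws for a single coefficient (illustration only).
set.seed(42)
draws <- rnorm(2000, mean = 1, sd = 0.5)

post.summary <- c(
  Mean   = mean(draws),
  Median = median(draws),
  hpd_interval(draws)
)
```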
The function hdpGLM returns a list with the samples from the posterior distribution along with other elements. That list contains an element named context.cov that connects the index "C" created during the estimation to the context-level covariates: each unique combination of context-level covariate values receives an index during the estimation. The algorithm only requires the context-level covariates, but it creates the index C to help the estimation. If true.beta is provided, it must also contain an index for the context, which indicates the context of each specific linear coefficient beta. That index will probably differ from the one created by the algorithm. Therefore, when true.beta is provided, the context index C generated by the algorithm must be connected to the column j in the true.beta data.frame in order to compare the true and the estimated values for each context. That is why the values of the context-level covariates are needed as well: the summary uses them as the key to merge the true and the estimated values for each context.
The true and estimated clusters are matched based on the shortest distance between the estimated posterior average and the true value in each context, because the labels of the clusters in the estimation can vary even though the same data points are classified in the same clusters.
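The two steps described above can be sketched in base R as follows. This is only an illustration of the idea, not the code used by the summary method: the covariate name W1, the index values, and the numbers are hypothetical, and the cluster matching is shown for a single coefficient in a single context.

```r
# Index C created during estimation, linked to the context-level covariate.
context.cov <- data.frame(C = c(1, 2), W1 = c(1.2, -0.5))

# Context index j supplied by the user in true.beta, with the same covariate.
true.context <- data.frame(j = c(1, 2), W1 = c(-0.5, 1.2))

# The context-level covariate is the key that connects C and j.
key <- merge(context.cov, true.context, by = "W1")

# Cluster matching within one context: posterior means by estimated cluster
# label, and true values by true cluster label (made-up numbers).
estimated <- c("1" = 2.1, "2" = -0.9)
truth     <- c("1" = -1.0, "2" = 2.0)

# Absolute distance between every true value (rows) and estimate (columns).
distance <- abs(outer(truth, estimated, "-"))

# For each true cluster, pick the estimated cluster at the shortest distance.
matched <- apply(distance, 1, which.min)
```

Here the estimated labels are swapped relative to the true ones, so true cluster 1 is matched to estimated cluster 2 and vice versa, even though the underlying grouping is the same.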