
Draws data from the normal block model (see details). The function returns both the generated data and the corresponding model parameters in a list.

Usage

generate_normal_block_data(
  n = 100,
  p = 40,
  d = 1,
  Q = 3,
  kappa = 0,
  omega_structure = "erdos-renyi",
  u_v = c(0.3, 0.1),
  SNR = 0.75,
  alpha = rep(1/Q, Q),
  range_X = c(0, 10),
  range_D = c(0.5, 1.5)
)

Arguments

n

number of individuals. Defaults to 100.

p

number of variables. Defaults to 40.

d

number of covariates. Defaults to 1.

Q

number of groups. Defaults to 3.

kappa

vector (or scalar) of variable-wise zero-inflation probabilities. Defaults to 0.
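To illustrate what a variable-wise zero-inflation probability means, here is a minimal base R sketch. It assumes that each entry of column j of the response matrix is independently set to zero with probability kappa[j]; the stand-in matrix Y and the values of kappa are hypothetical, and this is not necessarily how generate_normal_block_data proceeds internally.

```r
set.seed(1)
n <- 100
p <- 4

# Hypothetical per-variable zero-inflation probabilities;
# a scalar would be recycled to length p by rep().
kappa <- rep(c(0, 0.2, 0.5, 0.9), length.out = p)

# Stand-in Gaussian responses (not drawn from the normal block model)
Y <- matrix(rnorm(n * p, mean = 5), n, p)

# zero_mask[i, j] is TRUE with probability kappa[j];
# rep(..., each = n) matches R's column-major matrix filling.
zero_mask <- matrix(runif(n * p) < rep(kappa, each = n), n, p)

Y_zi <- Y
Y_zi[zero_mask] <- 0

# Empirical fraction of zeros per variable, to compare with kappa
colMeans(Y_zi == 0)
```

With kappa = 0, the default, no entry is masked and the data are left untouched.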

omega_structure

the structure of the graph on which the precision matrix between groups is built. Can be a symmetric matrix with Q rows/columns, or a character string chosen from "erdos-renyi", "preferential_attachment" and "community", in which case a graph is drawn with sensible generation parameters. See generate_precision_matrix for details. Defaults to "erdos-renyi".

u_v

length-2 vector of positive numbers u and v controlling the generation of the precision matrix Omega: u is the value of the off-diagonal elements of the precision matrix and controls the magnitude of the partial correlations, while v is a positive number added to the diagonal elements of the precision matrix. Defaults to c(0.3, 0.1).
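One common way to turn a graph into a valid precision matrix with such a pair (u, v) can be sketched as follows. This is an assumption for illustration only, not necessarily what generate_precision_matrix does: the adjacency matrix A, the edge probability 0.5 and the eigenvalue shift are all hypothetical choices.

```r
set.seed(1)
Q <- 3
u <- 0.3
v <- 0.1

# Hypothetical Erdos-Renyi-style adjacency matrix on Q nodes
A <- matrix(0, Q, Q)
A[upper.tri(A)] <- rbinom(Q * (Q - 1) / 2, size = 1, prob = 0.5)
A <- A + t(A)

# Off-diagonal entries equal to u wherever there is an edge
Omega <- u * A

# Shift the diagonal so that the smallest eigenvalue becomes v > 0,
# which makes Omega positive definite
min_eig <- min(eigen(Omega, symmetric = TRUE, only.values = TRUE)$values)
diag(Omega) <- abs(min_eig) + v

# Corresponding covariance matrix between groups
Sigma <- solve(Omega)
```

In this sketch a larger u strengthens the partial correlations, while a larger v moves Omega further from singularity.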

SNR

Signal-to-noise ratio: the magnitude of the regression parameters B is adjusted so that tr(var(XB)) and tr(Sigma) match the desired SNR. Defaults to 0.75.
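The rescaling of B can be sketched in base R. The sketch assumes that the SNR is defined as the ratio tr(var(XB)) / tr(Sigma), which is one reading of the description above; the matrices X, B and Sigma below are hypothetical stand-ins, and tr is a helper defined here.

```r
set.seed(1)
n <- 100
p <- 5
d <- 2
SNR <- 0.75

X <- matrix(runif(n * d, min = 0, max = 10), n, d)  # stand-in regressors
B <- matrix(rnorm(d * p), d, p)                     # unscaled coefficients
Sigma <- diag(runif(p, min = 0.5, max = 1.5))       # stand-in noise covariance

# Helper: trace of a square matrix
tr <- function(M) sum(diag(M))

# var(X %*% (c * B)) = c^2 * var(X %*% B), so scaling B by the square
# root of the target/current ratio hits the target exactly.
signal <- tr(var(X %*% B))
B_scaled <- B * sqrt(SNR * tr(Sigma) / signal)

# Achieved ratio, equal to SNR up to floating-point error
achieved <- tr(var(X %*% B_scaled)) / tr(Sigma)
achieved
```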

alpha

the length-Q vector of group proportions. Defaults to rep(1/Q, Q).
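As an illustration of how group proportions can translate into a membership matrix, the sketch below draws one group per variable from a multinomial distribution with probabilities alpha. The p x Q indicator layout of C is an assumption made for this example.

```r
set.seed(1)
p <- 40
Q <- 3
alpha <- rep(1 / Q, Q)

# rmultinom returns one draw per column (a Q x p matrix);
# transposing gives a p x Q indicator matrix with a single 1 per row.
C <- t(rmultinom(p, size = 1, prob = alpha))

# Number of variables assigned to each group
colSums(C)
```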

range_X

A length-2 vector defining the range of the uniform distribution used to draw the values of X, the regressor matrix. Defaults to c(0, 10).

range_D

A length-2 vector defining the range of the uniform distribution used to draw the values of D, the diagonal matrix of the variances of the variables. Defaults to c(0.5, 1.5).
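The two uniform draws described by range_X and range_D can be sketched as follows. The dimensions (X of size n x d, D of size p x p) follow the argument descriptions above; the exact internal code of the function is not shown here and may differ.

```r
set.seed(1)
n <- 100
p <- 40
d <- 1
range_X <- c(0, 10)
range_D <- c(0.5, 1.5)

# Regressor matrix with entries drawn uniformly on range_X
X <- matrix(runif(n * d, min = range_X[1], max = range_X[2]), n, d)

# Diagonal matrix of variances drawn uniformly on range_D;
# nrow = p keeps the result p x p even when p = 1.
D <- diag(runif(p, min = range_D[1], max = range_D[2]), nrow = p)
```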

Value

A named list with the following elements:

- Y: a matrix of responses
- X: a regressor/design matrix
- a list of model parameters, comprising
  - B: matrix of regression coefficients
  - C: matrix of group membership
  - D: diagonal matrix of the variances of the variables
  - Omega: precision matrix of the groups
  - Sigma: covariance matrix of the groups
  - kappa: vector of zero-inflation probabilities (one per variable)
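A short usage sketch follows. It assumes that the function is exported by a package named normalblockr (the package name does not appear on this page, so treat it as an assumption) and it only runs the call when that package is installed. Only the elements Y and X, which are named above, are accessed by name.

```r
sim <- NULL

# Assumption: generate_normal_block_data() lives in a package called
# "normalblockr"; the call is skipped if that package is unavailable.
if (requireNamespace("normalblockr", quietly = TRUE)) {
  set.seed(42)
  sim <- normalblockr::generate_normal_block_data(n = 50, p = 20, d = 1, Q = 3)

  # Top-level structure of the returned list
  str(sim, max.level = 1)

  # Dimensions of the responses and of the design matrix
  print(dim(sim$Y))
  print(dim(sim$X))
}
```

Calling str() with max.level = 1 is a convenient way to check the names of the returned elements before relying on them in downstream code.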