For each gene in sca, fits the hurdle model in formula (linear for et>0), logistic for et==0 vs et>0. Return an object of class ZlmFit containing slots giving the coefficients, variance-covariance matrices, etc. After each gene, optionally run the function on the fit named by 'hook'

zlm(
  formula,
  sca,
  method = "bayesglm",
  silent = TRUE,
  ebayes = TRUE,
  ebayesControl = NULL,
  force = FALSE,
  hook = NULL,
  parallel = TRUE,
  LMlike,
  onlyCoef = FALSE,
  exprs_values = assay_idx(sca)$aidx,
  ...
)

Arguments

formula

a formula with the measurement variable on the LHS and predictors present in colData on the RHS

sca

SingleCellAssay object

method

character vector, either 'glm', 'glmer' or 'bayesglm'

silent

Silence common problems with fitting some genes

ebayes

if TRUE, regularize variance using empirical bayes method

ebayesControl

list with parameters for empirical bayes procedure. See ebayes.

force

Should we continue testing genes even after many errors have occurred?

hook

a function called on the fit after each gene.

parallel

If TRUE and option(mc.cores)>1 then multiple cores will be used in fitting.

LMlike

if provided, then the model defined in this object will be used, rather than following the formulas. This is intended for internal use.

onlyCoef

If TRUE then only an array of model coefficients will be returned (probably only useful for bootstrapping).

exprs_values

character or integer passed to `assay` specifying which assay to use for testing

...

arguments passed to the S4 model object upon construction. For example, fitArgsC and fitArgsD, or coefPrior.

Value

a object of class ZlmFit with methods to extract coefficients, etc. OR, if data is a data.frame just a list of the discrete and continuous fits.

Empirical Bayes variance regularization

The empirical bayes regularization of the gene variance assumes that the precision (1/variance) is drawn from a gamma distribution with unknown parameters. These parameters are estimated by considering the distribution of sample variances over all genes. The procedure used for this is determined from ebayesControl, a named list with components 'method' (one of 'MOM' or 'MLE') and 'model' (one of 'H0' or 'H1') method MOM uses a method-of-moments estimator, while MLE using the marginal likelihood. H0 model estimates the precisions using the intercept alone in each gene, while H1 fits the full model specified by formula

See also

ZlmFit-class, ebayes, GLMlike-class, BayesGLMlike-class

Examples

data(vbetaFA)
zlmVbeta <- zlm(~ Stim.Condition, subset(vbetaFA, ncells==1)[1:10,])
#> 
#> Done!
slotNames(zlmVbeta)
#>  [1] "coefC"              "coefD"              "vcovC"             
#>  [4] "vcovD"              "LMlike"             "sca"               
#>  [7] "deviance"           "loglik"             "df.null"           
#> [10] "df.resid"           "dispersion"         "dispersionNoshrink"
#> [13] "priorDOF"           "priorVar"           "converged"         
#> [16] "hookOut"            "exprs_values"      
#A matrix of coefficients
coef(zlmVbeta, 'D')['CCL2',]
#>          (Intercept) Stim.ConditionUnstim 
#>           -3.8329217           -0.5108005 
#An array of covariance matrices
vcov(zlmVbeta, 'D')[,,'CCL2']
#>                      (Intercept) Stim.ConditionUnstim
#> (Intercept)            0.1439196           -0.1254838
#> Stim.ConditionUnstim  -0.1254838            0.9182853
waldTest(zlmVbeta, CoefficientHypothesis('Stim.ConditionUnstim'))
#> , , metric = lambda
#> 
#>         test.type
#> primerid         cont        disc    hurdle
#>   B3GAT1 0.9617702324 0.038250068  1.000020
#>   BAX    7.2211565188 3.645901878 10.867058
#>   BCL2   0.3766814067 2.202291748  2.578973
#>   CCL2   0.8414775522 0.284135226  1.125613
#>   CCL3             NA 3.548463195        NA
#>   CCL4             NA 2.012308210        NA
#>   CCL5   0.1746468538 0.862093478  1.036740
#>   CCR2   5.3383734489 2.308408187  7.646782
#>   CCR4   2.0437612666 0.003737042  2.047498
#>   CCR5   0.0005534473 2.811952306  2.812506
#> 
#> , , metric = df
#> 
#>         test.type
#> primerid cont disc hurdle
#>   B3GAT1    1    1      2
#>   BAX       1    1      2
#>   BCL2      1    1      2
#>   CCL2      1    1      2
#>   CCL3      1    1      2
#>   CCL4      1    1      2
#>   CCL5      1    1      2
#>   CCR2      1    1      2
#>   CCR4      1    1      2
#>   CCR5      1    1      2
#> 
#> , , metric = Pr(>Chisq)
#> 
#>         test.type
#> primerid        cont       disc      hurdle
#>   B3GAT1 0.326741272 0.84494185 0.606524503
#>   BAX    0.007204927 0.05620735 0.004367654
#>   BCL2   0.539384664 0.13780573 0.275412150
#>   CCL2   0.358974532 0.59400356 0.569608276
#>   CCL3            NA 0.05960063          NA
#>   CCL4            NA 0.15602777          NA
#>   CCL5   0.676014596 0.35315351 0.595490308
#>   CCR2   0.020860929 0.12867576 0.021853574
#>   CCR4   0.152831343 0.95125460 0.359245545
#>   CCR5   0.981231129 0.09356445 0.245059834
#> 

## Can also provide just a \code{data.frame} instead
data<- data.frame(x=rnorm(500), z=rbinom(500, 1, .3))
logit.y <- with(data, x*2 + z*2); mu.y <- with(data, 10+10*x+10*z + rnorm(500))
y <- (runif(500)<exp(logit.y)/(1+exp(logit.y)))*1
y[y>0] <- mu.y[y>0]
data$y <- y
fit <- zlm(y ~ x+z, data)
summary.glm(fit$disc)
#> 
#> Call:
#> NULL
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -53.823   -1.196    1.011    1.206   10.097  
#> 
#> Coefficients:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept) -0.04928    0.14244  -0.346    0.729    
#> x            2.14698    0.19496  11.012  < 2e-16 ***
#> z            2.19058    0.31755   6.898 5.26e-12 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 687.73  on 499  degrees of freedom
#> Residual deviance: 389.01  on 221  degrees of freedom
#> AIC: 395.01
#> 
#> Number of Fisher Scoring iterations: 6
#> 
summary.glm(fit$cont)
#> 
#> Call:
#> NULL
#> 
#> Deviance Residuals: 
#>      Min        1Q    Median        3Q       Max  
#> -2.60596  -0.70254   0.00196   0.77047   2.23471  
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  9.99422    0.09416  106.14   <2e-16 ***
#> x            9.96465    0.07646  130.32   <2e-16 ***
#> z           10.01528    0.12713   78.78   <2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> (Dispersion parameter for gaussian family taken to be 1.062237)
#> 
#>     Null deviance: 22136.29  on 275  degrees of freedom
#> Residual deviance:   289.99  on 273  degrees of freedom
#> AIC: 804.9
#> 
#> Number of Fisher Scoring iterations: 2
#>