After each gene is fit, a hook function can optionally be run and the output saved.
This allows extended computations to be done using the fitted model, without keeping it in memory.
Here this is used to calculate various residuals, though in some cases they can be done using only the information contained in the ZlmFit
-class.
collectResiduals(x, sca, newLayerName = "Residuals")
discrete_residuals_hook(x)
continuous_residuals_hook(x)
combined_residuals_hook(x)
deviance_residuals_hook(x)
fitted_phat(x)
partialScore(x, effectRegex)
ZlmFit
-class
SingleCellAssay
object to which the residuals should be added
character
name of the assay layer
a regular expression naming columns of the design corresponding to \(Z_0\). Generally these should be the treatment effects of interest.
copy of sca
with new layer
discrete_residuals_hook
: Hook to get the discrete residuals, ie, difference between expected probability of expression and observed
continuous_residuals_hook
: Hook to get the continuous residuals, ie, residuals for conditionally positive observations. If an observation is zero, it's residual is defined to be zero as well.
combined_residuals_hook
: Hook to get the combined residuals, ie, Y-E(U)*E(V)
deviance_residuals_hook
: Standardized deviance residuals hook. Computes the sum of the standardized deviance residuals for the discrete and continuous models (scaled to have unit variance). If the observation is zero then only the discrete component is used.
fitted_phat
: Hook to return p_hat, the predicted probability of expression.
partialScore
: Compute \(Y_i-E(V_i|X_i, Z_0)E(U|X_i, Z_0)\), where \(Z_0\) is a treatment effect (being left in) and \(X_i\) is a nuisance effect (being regressed out).
Each component of the model contributes several flavors of residual, which can be combined in various fashions. The discrete residual can be on the response scale (thus subtracting the predicted probability of expression from the 0/1 expression value). Or it can be a deviance residual, revealing something about the log-likelihood.
It's also possible to consider partial residuals, in which the contribution of a particular covariate is added back into the model.
zlm
data(vbetaFA)
svbeta <- subset(vbetaFA, ncells==1)
svbeta <- svbeta[freq(svbeta)>.4,]
window <- function(x1) lapply(assays(x1), function(x2) x2[1:3, 1:6])
#total residuals of the response
z1 <- zlm(~ Stim.Condition, svbeta, hook=discrete_residuals_hook)
#>
#> Done!
window(collectResiduals(z1, svbeta))
#> $Et
#> wellKey
#> primerid Sub01 1 A01 Sub01 1 A02 Sub01 1 A03 Sub01 1 A04 Sub01 1 A05
#> CCR7 0.00000 18.32835 18.72408 0.00000 0.00000
#> CD28 14.36047 0.00000 16.50232 15.36025 13.85367
#> CD3d 14.39789 15.71954 17.08992 16.37704 17.38282
#> wellKey
#> primerid Sub01 1 A06
#> CCR7 0.00000
#> CD28 0.00000
#> CD3d 15.54438
#>
#> $Residuals
#> Sub01 1 A01 Sub01 1 A02 Sub01 1 A03 Sub01 1 A04 Sub01 1 A05 Sub01 1 A06
#> CCR7 -0.5009035 0.4990965 0.4990965 -0.5009035 -0.5009035 -0.5009035
#> CD28 0.4962616 -0.5037384 0.4962616 0.4962616 0.4962616 -0.5037384
#> CD3d 0.3473145 0.3473145 0.3473145 0.3473145 0.3473145 0.3473145
#>
z2 <- zlm(~ Stim.Condition, svbeta, hook=continuous_residuals_hook)
#>
#> Done!
window(collectResiduals(z2, svbeta))
#> $Et
#> wellKey
#> primerid Sub01 1 A01 Sub01 1 A02 Sub01 1 A03 Sub01 1 A04 Sub01 1 A05
#> CCR7 0.00000 18.32835 18.72408 0.00000 0.00000
#> CD28 14.36047 0.00000 16.50232 15.36025 13.85367
#> CD3d 14.39789 15.71954 17.08992 16.37704 17.38282
#> wellKey
#> primerid Sub01 1 A06
#> CCR7 0.00000
#> CD28 0.00000
#> CD3d 15.54438
#>
#> $Residuals
#> Sub01 1 A01 Sub01 1 A02 Sub01 1 A03 Sub01 1 A04 Sub01 1 A05 Sub01 1 A06
#> CCR7 0.000000 0.1679555 0.5636873 0.0000000 0.0000000 0.000000
#> CD28 -1.338277 0.0000000 0.8035785 -0.3384889 -1.8450766 0.000000
#> CD3d -2.235474 -0.9138159 0.4565626 -0.2563244 0.7494639 -1.088981
#>
z3 <- zlm(~ Stim.Condition, svbeta, hook=combined_residuals_hook)
#>
#> Done!
window(collectResiduals(z3, svbeta))
#> $Et
#> wellKey
#> primerid Sub01 1 A01 Sub01 1 A02 Sub01 1 A03 Sub01 1 A04 Sub01 1 A05
#> CCR7 0.00000 18.32835 18.72408 0.00000 0.00000
#> CD28 14.36047 0.00000 16.50232 15.36025 13.85367
#> CD3d 14.39789 15.71954 17.08992 16.37704 17.38282
#> wellKey
#> primerid Sub01 1 A06
#> CCR7 0.00000
#> CD28 0.00000
#> CD3d 15.54438
#>
#> $Residuals
#> Sub01 1 A01 Sub01 1 A02 Sub01 1 A03 Sub01 1 A04 Sub01 1 A05 Sub01 1 A06
#> CCR7 -9.096604 9.231742 9.627474 -9.096604 -9.096604 -9.096604
#> CD28 6.452407 -7.908060 8.594263 7.452195 5.945608 -7.908060
#> CD3d 3.541534 4.863192 6.233570 5.520683 6.526472 4.688027
#>
#partial residuals
colData(svbeta)$ngeneson <- colMeans(assay(svbeta)>0)
z5 <- zlm(~ Stim.Condition + ngeneson, svbeta)
#>
#> Done!
partialScore(z5, 'Stim.Condition')