---
title: Bug in nlme::getVarCov
date: '2016-08-10'
date-modified: '2024-12-30'
categories:
- Rstats
- programming
- hierarchical models
- nlme
code-tools: true
---
::: {.callout-tip title="Update December 30, 2024"}
The bug documented below was corrected a while ago, in version 3.1-150 of the nlme package. See [the bug report](https://bugs.r-project.org/show_bug.cgi?id=16744) and [the Changelog](https://cran.r-project.org/web/packages/nlme/ChangeLog) entry for 2020-09-24.
:::
```{=html}
<p>I have recently been working to ensure that <a href="https://github.com/jepusto/clubSandwich">my <code>clubSandwich</code> package</a> works correctly on fitted <code>lme</code> and <code>gls</code> models from the <code>nlme</code> package, which is one of the main R packages for fitting hierarchical linear models. In the course of digging around in the guts of <code>nlme</code>, I noticed a bug in the <code>getVarCov</code> function. The purpose of the function is to extract the estimated variance-covariance matrix of the errors from a fitted <code>lme</code> or <code>gls</code> model.</p>
<p>It seems that this function is sensitive to the order in which the input data are sorted. <a href="https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16744">This bug report</a> noted the problem, but unfortunately their proposed fix doesn’t seem to solve the problem. In this post I’ll demonstrate the bug and a solution. (I’m posting this here because the R project’s bug reporting system is currently closed to people who were not registered as of early July, evidently due to some sort of spamming problem.)</p>
<div id="the-issue" class="level1">
<h1>The issue</h1>
<p>Here’s a simple demonstration of the problem. I’ll first fit a <code>gls</code> model with a heteroskedastic variance function and an AR(1) auto-correlation structure (no need to worry about the substance of the specification—we’re just worried about computation here) and then extract the variances for each of the units.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb20"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Demonstrate the problem with gls model
library(nlme)
data(Ovary)
gls_raw <- gls(follicles ~ sin(2*pi*Time) + cos(2*pi*Time), data = Ovary,
correlation = corAR1(form = ~ 1 | Mare),
weights = varPower())
Mares <- levels(gls_raw$groups)
V_raw <- lapply(Mares, function(g) getVarCov(gls_raw, individual = g))</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
</div>
<p>Now I’ll repeat the process using the same data, but sorted in a different order</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb20"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">Ovary_sorted <- Ovary[with(Ovary, order(Mare, Time)),]
gls_sorted <- update(gls_raw, data = Ovary_sorted)
V_sorted <- lapply(Mares, function(g) getVarCov(gls_sorted, individual = g))</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
</div>
<p>The variance component estimates are essentially equal:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">all.equal(gls_raw$modelStruct, gls_sorted$modelStruct)</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
<p>However, the extracted variance-covariance matrices are not:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">all.equal(V_raw, V_sorted)</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "Component 1: Mean relative difference: 0.03256"
[2] "Component 3: Mean relative difference: 0.05830791"
[3] "Component 4: Mean relative difference: 0.1142209"
[4] "Component 5: Mean relative difference: 0.03619692"
[5] "Component 6: Mean relative difference: 0.09260648"
[6] "Component 8: Mean relative difference: 0.08650327"
[7] "Component 9: Mean relative difference: 0.07627162"
[8] "Component 10: Mean relative difference: 0.018103"
[9] "Component 11: Mean relative difference: 0.1020658"</code></pre>
</div>
</div>
<p>Here’s the code of the relevant function:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">nlme:::getVarCov.gls</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>function (obj, individual = 1, ...)
{
S <- corMatrix(obj$modelStruct$corStruct)[[individual]]
if (!is.null(obj$modelStruct$varStruct)) {
ind <- obj$groups == individual
vw <- 1/varWeights(obj$modelStruct$varStruct)[ind]
}
else vw <- rep(1, nrow(S))
vars <- (obj$sigma * vw)^2
result <- t(S * sqrt(vars)) * sqrt(vars)
class(result) <- c("marginal", "VarCov")
attr(result, "group.levels") <- names(obj$groups)
result
}
<bytecode: 0x000000001bc39d00>
<environment: namespace:nlme></code></pre>
</div>
</div>
<p>The issue is in the 4th line of the body. <code>getVarCov.gls</code> assumes that <code>varWeights(obj$modelStruct$varStruct)</code> is sorted in the same order as <code>obj$groups</code>, which is not necessarily true. Instead, <code>varWeights</code> seem to return the weights sorted according to the grouping variable. For this example, that means that the <code>varWeights</code> will not depend on the order in which the groups are sorted.</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">identical(gls_raw$groups, gls_sorted$groups)</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>[1] FALSE</code></pre>
</div>
</div>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">identical(varWeights(gls_raw$modelStruct$varStruct),
varWeights(gls_sorted$modelStruct$varStruct))</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
</div>
<div id="fix-for-nlmegetvarcov.gls" class="level1">
<h1>Fix for <code>nlme:::getVarCov.gls</code></h1>
<p>I think this can be solved by either</p>
<ul>
<li>putting the <code>varWeights</code> back into the same order as the raw data or</li>
<li>sorting <code>obj$groups</code> before identifying the rows corresponding to the specified <code>individual</code>.</li>
</ul>
<p>Here’s a revised function that takes the second approach:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb20"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># proposed patch for getVarCov.gls
getVarCov_revised_gls <- function (obj, individual = 1, ...) {
S <- corMatrix(obj$modelStruct$corStruct)[[individual]]
if (!is.null(obj$modelStruct$varStruct)) {
ind <- sort(obj$groups) == individual
vw <- 1 / varWeights(obj$modelStruct$varStruct)[ind]
}
else vw <- rep(1, nrow(S))
vars <- (obj$sigma * vw)^2
result <- t(S * sqrt(vars)) * sqrt(vars)
class(result) <- c("marginal", "VarCov")
attr(result, "group.levels") <- names(obj$groups)
result
}</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
</div>
<p>Testing that it works correctly:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">V_raw <- lapply(Mares, function(g) getVarCov_revised_gls(gls_raw, individual = g))
V_sorted <- lapply(Mares, function(g) getVarCov_revised_gls(gls_sorted, individual = g))
all.equal(V_raw, V_sorted)</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
</div>
<div id="fix-for-nlmegetvarcov.lme" class="level1">
<h1>Fix for <code>nlme:::getVarCov.lme</code></h1>
<p>The same issue comes up in <code>getVarCov.lme</code>. Here’s the fix and verification:</p>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># proposed patch for getVarCov.lme
getVarCov_revised_lme <- function (obj, individuals, type = c("random.effects", "conditional", "marginal"), ...) {
type <- match.arg(type)
if (any("nlme" == class(obj)))
stop("not implemented for \"nlme\" objects")
if (length(obj$group) > 1)
stop("not implemented for multiple levels of nesting")
sigma <- obj$sigma
D <- as.matrix(obj$modelStruct$reStruct[[1]]) * sigma^2
if (type == "random.effects") {
result <- D
}
else {
result <- list()
groups <- sort(obj$groups[[1]])
ugroups <- unique(groups)
if (missing(individuals))
individuals <- as.matrix(ugroups)[1, ]
if (is.numeric(individuals))
individuals <- ugroups[individuals]
for (individ in individuals) {
indx <- which(individ == ugroups)
if (!length(indx))
stop(gettextf("individual %s was not used in the fit",
sQuote(individ)), domain = NA)
if (is.na(indx))
stop(gettextf("individual %s was not used in the fit",
sQuote(individ)), domain = NA)
ind <- groups == individ
if (!is.null(obj$modelStruct$corStruct)) {
V <- corMatrix(obj$modelStruct$corStruct)[[as.character(individ)]]
}
else V <- diag(sum(ind))
if (!is.null(obj$modelStruct$varStruct))
sds <- 1/varWeights(obj$modelStruct$varStruct)[ind]
else sds <- rep(1, sum(ind))
sds <- obj$sigma * sds
cond.var <- t(V * sds) * sds
dimnames(cond.var) <- list(1:nrow(cond.var), 1:ncol(cond.var))
if (type == "conditional")
result[[as.character(individ)]] <- cond.var
else {
Z <- model.matrix(obj$modelStruct$reStruc, getData(obj))[ind,
, drop = FALSE]
result[[as.character(individ)]] <- cond.var +
Z %*% D %*% t(Z)
}
}
}
class(result) <- c(type, "VarCov")
attr(result, "group.levels") <- names(obj$groups)
result
}
lme_raw <- lme(follicles ~ sin(2*pi*Time) + cos(2*pi*Time),
random = ~ 1 | Mare,
correlation = corExp(form = ~ Time),
weights = varPower(),
data=Ovary)
lme_sorted <- update(lme_raw, data = Ovary_sorted)
all.equal(lme_raw$modelStruct, lme_sorted$modelStruct)</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># current getVarCov
V_raw <- lapply(Mares, function(g) getVarCov(lme_raw, individual = g, type = "marginal"))
V_sorted <- lapply(Mares, function(g) getVarCov(lme_sorted, individual = g, type = "marginal"))
all.equal(V_raw, V_sorted)</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "Component 1: Component 1: Mean relative difference: 0.003989954"
[2] "Component 3: Component 1: Mean relative difference: 0.003784181"
[3] "Component 4: Component 1: Mean relative difference: 0.003028662"
[4] "Component 5: Component 1: Mean relative difference: 0.0005997944"
[5] "Component 6: Component 1: Mean relative difference: 0.002350456"
[6] "Component 7: Component 1: Mean relative difference: 0.007103733"
[7] "Component 8: Component 1: Mean relative difference: 0.001887638"
[8] "Component 9: Component 1: Mean relative difference: 0.0009601843"
[9] "Component 10: Component 1: Mean relative difference: 0.004748783"
[10] "Component 11: Component 1: Mean relative difference: 0.001521097"</code></pre>
</div>
</div>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># revised getVarCov
V_raw <- lapply(Mares, function(g) getVarCov_revised_lme(lme_raw, individual = g, type = "marginal"))
V_sorted <- lapply(Mares, function(g) getVarCov_revised_lme(lme_sorted, individual = g, type = "marginal"))
all.equal(V_raw, V_sorted)</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
</div>
<div id="session-info" class="level1">
<h1>Session info</h1>
<div class="cell">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">sessionInfo()</code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-stdout">
<pre><code>R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] nlme_3.1-144
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 bookdown_0.14 lattice_0.20-38 digest_0.6.25
[5] grid_3.6.3 magrittr_1.5 evaluate_0.14 blogdown_0.18
[9] rlang_0.4.5 stringi_1.4.3 rmarkdown_2.1 tools_3.6.3
[13] stringr_1.4.0 xfun_0.12 yaml_2.2.0 compiler_3.6.3
[17] htmltools_0.4.0 knitr_1.28</code></pre>
</div>
</div>
</div>
```